16 days, 4 channels, no AWS Bedrock quota fix

TL;DR: AWS is throttling our Bedrock usage at 1,000x to 2,000x below the documented default quota and won’t increase it.
It has been 16 days since we filed AWS Support case 177762443200796 on the Business Support+ plan (paid). The Bedrock service team denied our quota increase under criteria their own Service Quotas tool prevents us from satisfying. We have worked four documented escalation channels: Business Support+ tier-1, the standard Service Quotas console, AWS Sales, and AWS Activate. None resolved the throttle.
The quotas
Our account runs a small SaaS workload on Bedrock in us-west-2 through cross-region inference profiles. As of May 1 our applied request-per-minute (RPM) values were:
| Model | Applied RPM | AWS default RPM | Ratio |
|---|---|---|---|
| Claude Haiku 4.5 (global cross-region) | 10 | 10,000 | 1,000x below |
| Claude Sonnet 4.5 V1 (global cross-region) | 10 | 10,000 | 1,000x below |
| Claude Opus 4.6 V1 (cross-region) | 5 | 10,000 | 2,000x below |
CloudWatch’s AWS/Bedrock namespace showed a 14% throttle rate on Haiku and 16% on Opus over the 14 days preceding the case. We have seen over $1k MRR in user churn directly attributable to these throttles. The caps were applied to the account at creation. We did not request them and were not told why they were set.
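The Ratio column in the table is plain division, and it may help to see what the caps mean in wall-clock terms. The sketch below only restates the applied and default RPM values from the table; the function names are illustrative and not part of any AWS tooling.

```python
# Applied vs. default RPM, copied from the table in this post.
QUOTAS = {
    "Claude Haiku 4.5": (10, 10_000),
    "Claude Sonnet 4.5 V1": (10, 10_000),
    "Claude Opus 4.6 V1": (5, 10_000),
}


def shortfall_factor(applied_rpm: int, default_rpm: int) -> int:
    """How many times smaller the applied quota is than the default."""
    return default_rpm // applied_rpm


def seconds_per_request(applied_rpm: int) -> float:
    """Average spacing between requests the whole account may send."""
    return 60 / applied_rpm


for model, (applied, default) in QUOTAS.items():
    print(
        f"{model}: {shortfall_factor(applied, default):,}x below default, "
        f"one request every {seconds_per_request(applied):.0f} s"
    )
```

At 10 RPM the entire account averages one request every 6 seconds; at 5 RPM, one every 12 seconds.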
What we tried
- Filed three `RequestServiceQuotaIncrease` tickets through the standard Service Quotas console at the AWS-documented default value of 10,000 RPM.
- Confirmed our routing matches AWS’s published optimal configuration: `global.` cross-region for Haiku and Sonnet, `us.` cross-region for Opus.
- Audited our client retry behavior. Our SDK uses `maxAttempts: 1` so a single throttled request does not turn into three additional invocations that consume the quota.
- Upgraded our support tier to paid Business Support+.
- Filed a Business Support+ case (177762443200796) covering all three models in one thread once the individual tickets stalled.
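The retry audit in the list above matters because retries multiply the traffic sent against an already exhausted quota. The following is a rough worst-case model, not a description of how any AWS SDK schedules retries: it assumes every attempt after the quota is used up gets throttled within the same minute, and that each throttled request is retried until `max_attempts` is reached. The function name and the offered-load figure are hypothetical.

```python
def attempts_per_minute(offered: int, quota_rpm: int, max_attempts: int) -> int:
    """Total invocations sent in one minute under a crude worst-case model.

    Assumption: requests beyond the quota are throttled on every attempt,
    and the client spends all of its attempts on each of them.
    """
    accepted = min(offered, quota_rpm)
    throttled = offered - accepted
    return accepted + throttled * max_attempts


# Hypothetical load: 12 user requests arrive in a minute against a 10 RPM cap.
print(attempts_per_minute(12, 10, max_attempts=1))  # no retries
print(attempts_per_minute(12, 10, max_attempts=3))  # up to two retries each
```

Under these assumptions, a single-attempt client sends 12 invocations while a three-attempt client sends 16, which is why we kept attempts at 1 while the cap is in place.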
The denial
The Bedrock service team responded through the Business Support+ thread on May 5:
Thank you for providing the required information. I have submitted your request to our service team and they have informed me that they are unable to approve your request at this time.
These quotas help you avoid large bills due to sudden, unexpected spikes in activity. After we have a broader window of usage on your account to review, then we can reassess your request.
Two facts are worth noting. First, Bedrock is priced per token, not per request, so RPM throttling does not protect a customer from billing spikes. Only a limit on token throughput does. Second, “broader window of usage” was not quantified. We asked what specific thresholds (months of use, dollars billed, requests served) would unblock approval. We did not get a number.
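The first point can be checked with simple arithmetic. In the sketch below the 10 RPM cap comes from our applied quota, while the two tokens-per-request figures are hypothetical values chosen only to show the spread. No prices are assumed; the point is that billable volume scales with tokens, not with request count.

```python
def tokens_per_minute(rpm: int, tokens_per_request: int) -> int:
    """Billable token volume per minute at a fixed request rate."""
    return rpm * tokens_per_request


RPM_CAP = 10  # applied Haiku/Sonnet quota from this post

# Hypothetical request sizes, for illustration only.
short_prompts = tokens_per_minute(RPM_CAP, 500)
long_context = tokens_per_minute(RPM_CAP, 100_000)

print(short_prompts, long_context, long_context // short_prompts)
```

With the same 10 RPM cap, the hypothetical long-context workload bills 200 times the tokens of the short-prompt one, so the request cap alone puts no meaningful ceiling on spend.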
The contradiction
After the denial we asked whether a smaller, graduated increase (for example, 100 RPM rising to 1,000 and then 10,000 over 60 days) could be applied by hand. The case agent confirmed in writing that the Service Quotas tool only accepts requests at or above the default:
For Haiku 4.5 TPM value cannot be below the default quota. The requested value of 47,000 TPM is below the minimum default quota of 2,500,000 TPM for this model and region.
So the situation is:
- The service team requires “broader usage history” before approving a quota increase.
- Our applied cap is 1,000x to 2,000x below the documented default, which throttles the usage the service team wants to see.
- The Service Quotas tool only accepts requests at or above the default value. No graduated request can be filed through the standard channel.
There is no value a customer in our position can request that the system will accept and the service team will approve.
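The deadlock can be written down as two checks applied in sequence. This is a sketch of our reading of the situation, not AWS's actual approval logic: the floor rule is taken from the case agent's quoted message, and the service-team rule is an assumption inferred from the denial of our 10,000 RPM request. The function name and return labels are ours.

```python
from typing import Optional


def first_blocker(
    requested: int, default: int, has_usage_history: bool = False
) -> Optional[str]:
    """Return which rule blocks a quota request, or None if it would pass."""
    # Floor rule quoted by the case agent: value cannot be below the default.
    if requested < default:
        return "tool floor"
    # Assumption: without "broader usage history" the service team denies.
    if not has_usage_history:
        return "service team"
    return None


DEFAULT_RPM = 10_000

# The graduated schedule we proposed: 100, then 1,000, then 10,000 RPM.
for step in (100, 1_000, 10_000):
    print(step, first_blocker(step, DEFAULT_RPM))

# The TPM figures from the agent's message hit the same floor.
print(first_blocker(47_000, 2_500_000))
```

Every step of the graduated schedule is blocked by one rule or the other, and the only input that changes the outcome is usage history, which the applied cap prevents us from building.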
Escalation timeline
| Date | Event |
|---|---|
| May 1 | Filed three Service Quotas tickets at 10,000 RPM. Opened Business Support+ case 177762443200796. |
| May 2 | First substantive reply from tier-1 support. Promised “priority review” by the Bedrock team. |
| May 4 | Tier-1 agent unable to submit our actual numeric request through the internal tool. The floor rule surfaces. |
| May 5 | Service team denial: “broader usage history needed.” |
| May 6 | Tier-1 redirects us to AWS Sales via aws.amazon.com/contact-us/sales-support/. We file the form. |
| May 8 | Sales has not contacted us in two business days. We ask tier-1 whether the handoff was received. |
| May 10 | We separately contact AWS Activate. Activate replies they cannot help and directs us back to AWS Support. |
| May 11 | An AWS Associate Account Executive emails us a meeting link, acknowledges the revenue loss, and offers to “expedite.” |
| May 12 | We join the scheduled call. The meeting link does not connect. The scheduling page shows “Prelude is no longer available.” We email the AE directly. |
| May 13–16 | The AE has not replied. The case is silent. The applied quotas are unchanged. |
Each denial is technically correct under its own rule. The rules together produce a denial no procedural path can resolve.
Our case number
AWS Support case 177762443200796.
If AWS would like this post amended, address the case. Until then, you are actively harming our startup’s growth potential, and we will be staying on Vercel. We strongly recommend against using AWS Bedrock because of the useless support and aggressive throttling.