[ https://issues.apache.org/jira/browse/HBASE-30097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang resolved HBASE-30097.
-------------------------------
Resolution: Duplicate
> Fix flaky TestBlockBytesScannedQuota
> ------------------------------------
>
> Key: HBASE-30097
> URL: https://issues.apache.org/jira/browse/HBASE-30097
> Project: HBase
> Issue Type: Sub-task
> Components: test
> Reporter: Duo Zhang
> Priority: Major
>
> Sonnet 4.5(4.6?) summary
> TestBlockBytesScannedQuota
> The test is flapping due to a timing/race condition in the quota system:
> * 5-second timeout too short: the testTraffic method waited only 5 seconds
>   for quotas to take effect.
> * Quota cache not fully propagated: on slower systems (like CI), the quota
>   cache refresh can be asynchronous and may not fully propagate in time.
> * Quotas bypassed: when the cache isn't refreshed, the logs show "bypass
>   expected false, actual true", meaning all requests succeed instead of
>   being throttled.
> * Insufficient retries: each iteration takes ~1.3 seconds, so only 3-4
>   retries fit in 5 seconds, which is not enough for the quota system to
>   stabilize.
> Bad run pattern:
> * Test expects 1 successful request but gets 5 (all succeed because quotas
>   not enforced)
> * Retries every ~1.3 seconds for 4 attempts
> * Times out after 5 seconds with "Waiting timed out after [5,000] msec"
> Good run pattern:
> * Quotas enforced immediately
> * Tests pass quickly (36.97s total vs 63.14s for failed run)
> Increased the timeout in testTraffic() from 5,000ms to 30,000ms (line 263).
> This gives the quota system sufficient time to:
> * Complete cache refresh
> * Propagate quota settings across all components
> * Handle slower CI environments
> This is a conservative fix that maintains the retry logic while allowing
> adequate time for the distributed quota system to stabilize. The 30-second
> timeout is still reasonable for a test and should handle the asynchronous
> nature of quota enforcement.
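The retry arithmetic and the effect of the longer timeout can be illustrated with a minimal, self-contained sketch. This is not the HBase test code: the class `PollingWait` and its methods `waitFor` and `attemptsStarted` are hypothetical names chosen for illustration, and the ~1.3 second iteration cost is taken from the summary above.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of HBase.
public class PollingWait {

  /**
   * Polls the condition until it returns true or the timeout elapses.
   * Returns true if the condition held before the deadline, false otherwise.
   */
  public static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  /**
   * Number of attempts that can start within the timeout when each attempt
   * takes iterationMs (ceiling division).
   */
  public static long attemptsStarted(long timeoutMs, long iterationMs) {
    return (timeoutMs + iterationMs - 1) / iterationMs;
  }

  public static void main(String[] args) {
    // With ~1.3 s per iteration, 5 s allows 4 attempts and 30 s allows 24.
    System.out.println(attemptsStarted(5_000, 1_300));
    System.out.println(attemptsStarted(30_000, 1_300));

    // Simulated "quota enforced" signal that only becomes true on the 6th
    // check (an assumed value), scaled down to milliseconds to run quickly.
    AtomicInteger checks = new AtomicInteger();
    BooleanSupplier enforced = () -> checks.incrementAndGet() >= 6;
    System.out.println(waitFor(500, 10, enforced));
  }
}
```

Under these assumptions a condition that needs more than four checks to become true can never be observed within the 5-second budget, while the 30-second budget leaves room for roughly six times as many checks.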
--
This message was sent by Atlassian Jira
(v8.20.10#820010)