[ https://issues.apache.org/jira/browse/SLING-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothee Maret closed SLING-8531.
---------------------------------

> Support JournalAvailabilityChecker exponential backoff
> -------------------------------------------------------
>
>                 Key: SLING-8531
>                 URL: https://issues.apache.org/jira/browse/SLING-8531
>             Project: Sling
>          Issue Type: Improvement
>          Components: Content Distribution
>    Affects Versions: Content Distribution Journal Core 0.1.2
>            Reporter: Timothee Maret
>            Assignee: Christian Schneider
>            Priority: Major
>             Fix For: Content Distribution Journal Core 0.1.4, Content Distribution Journal Kafka 0.1.4, Content Distribution Journal Messages 0.1.2
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The average load generated by JournalAvailabilityChecker multiplies quickly
> for multi-tenant deployments. The checker can be configured (via Sling
> Scheduler {{scheduler.period}}) to reduce the polling frequency, but doing so
> also reduces its sensitivity to availability changes.
>
> To keep the checker sensitive while reducing the load, we should support an
> exponential backoff algorithm. The algorithm would halve the polling rate
> (down to a limit) every time the availability status does not change, and
> reset the rate when the status changes. Steady states (available or
> unavailable) would eventually yield the least load. In the average case
> (availability status is steady) the load will be reduced by up to the limit
> factor. In the worst case (availability changes all the time) the load will
> not be reduced compared to today.
>
> The base polling period would be Sling Scheduler {{scheduler.period}}. The
> period at time t + 1 would be computed as follows:
> Period~t+1~ = Multiplier~t+1~ * {{scheduler.period}}.
> The table below summarises how the multiplier would evolve according to
> changes in the availability status.
>
> ||State~t~||State~t+1~||Multiplier~t+1~||
> |unavailable|unavailable|min(2 * Multiplier~t~, limit)|
> |unavailable|available|1|
> |available|unavailable|1|
> |available|available|min(2 * Multiplier~t~, limit)|
>
> The limit would be hardcoded to 16, which would reduce the load by an order
> of magnitude. We could expose the limit as a configuration later if needed.
>
> There should be no need to randomise the multiplier for now, as the checkers
> are expected to be started at random times. If we hit a scenario where the
> checkers start at the same time, we could simply randomise the first
> scheduled event.
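
For illustration, the multiplier rule in the table could be sketched in Java roughly as follows. This is not the code committed for SLING-8531. The class name `AvailabilityBackoff`, its methods, and the use of seconds are hypothetical. The sketch assumes that the multiplier is capped at the limit (so it never exceeds 16), that the multiplier stretches the base {{scheduler.period}}, and that the very first check counts as a status change.

```java
/**
 * Hypothetical helper illustrating the proposed backoff rule.
 * It is not part of the Sling code base.
 */
final class AvailabilityBackoff {

    /** Hardcoded cap on the multiplier, as proposed in the issue. */
    static final int LIMIT = 16;

    /** Stands in for the Sling Scheduler scheduler.period (assumed to be in seconds). */
    private final long basePeriodSeconds;

    private int multiplier = 1;

    /** Result of the previous check; null until the first check (assumption). */
    private Boolean lastAvailable;

    AvailabilityBackoff(long basePeriodSeconds) {
        if (basePeriodSeconds <= 0) {
            throw new IllegalArgumentException("basePeriodSeconds must be positive");
        }
        this.basePeriodSeconds = basePeriodSeconds;
    }

    /**
     * Records the result of an availability check and returns the delay
     * to wait before the next check.
     */
    long nextDelaySeconds(boolean available) {
        if (lastAvailable != null && lastAvailable == available) {
            // Status unchanged: double the multiplier, capped at LIMIT.
            multiplier = Math.min(2 * multiplier, LIMIT);
        } else {
            // Status changed (or first check): fall back to the base period.
            multiplier = 1;
        }
        lastAvailable = available;
        return multiplier * basePeriodSeconds;
    }

    int multiplier() {
        return multiplier;
    }
}
```

With a base period of 10 seconds and a steadily available journal, the successive delays under this sketch would be 10, 20, 40, 80 and 160 seconds, staying at 160 seconds until the status flips, at which point the delay drops back to 10 seconds.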