[ https://issues.apache.org/jira/browse/OAK-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Francesco Mari resolved OAK-5034.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.13
                   1.6

Fixed at r1767246.

> FileStoreUtil#readSegmentWithRetry max retry delay is too short to be functional
> --------------------------------------------------------------------------------
>
>                 Key: OAK-5034
>                 URL: https://issues.apache.org/jira/browse/OAK-5034
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>    Affects Versions: Segment Tar 0.0.16
>            Reporter: Timothee Maret
>            Assignee: Francesco Mari
>             Fix For: 1.6, 1.5.13
>
>         Attachments: OAK-5034.patch
>
>
> The commit {{1765838}} introduced the {{FileStoreUtil#readSegmentWithRetry}}
> util and reduced the period between two tries (from 2sec to 0.125s) while the
> total number of tries did not change.
> This does not give enough time for the server to find references and
> segments, thus causing exceptions such as
> {code}
> 29.10.2016 05:07:37.242 *ERROR* [sling-default-2-Registered Service.605] org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSync Failed synchronizing state.
> java.lang.IllegalStateException: Unable to read references of segment 5168c878-3a3f-49d0-aea9-b8b57d5d867f from primary
>       at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution.readReferences(StandbyClientSyncExecution.java:196)
>       at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution.copySegmentHierarchyFromPrimary(StandbyClientSyncExecution.java:130)
>       at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution.compareAgainstBaseState(StandbyClientSyncExecution.java:94)
>       at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution.execute(StandbyClientSyncExecution.java:74)
>       at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSync.run(StandbyClientSync.java:143)
>       at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:118)
>       at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> {code}
> and causing the client to throw exceptions, ultimately causing IT tests to
> fail.
> IIUC, the minimum period to retry should be bigger than a TarMK flush cycle
> (5 sec).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
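
For illustration only, the timing concern in the issue description can be sketched as a fixed-delay retry whose total window is compared against the flush cycle. This is not the Oak implementation: the class RetrySketch, its method names, and the attempt count of 5 used in the note below are hypothetical (the issue does not state the number of tries). Only the 0.125 s and 2 s delays and the 5 s flush cycle come from the description.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of a fixed-delay retry; not taken from Oak's FileStoreUtil. */
final class RetrySketch {

    private RetrySketch() {}

    /** Time elapsed between the first and the last attempt with a fixed delay. */
    public static long retryWindowMillis(int attempts, long delayMillis) {
        // n attempts are separated by n - 1 waits.
        return (long) Math.max(0, attempts - 1) * delayMillis;
    }

    /** True if the last attempt happens only after a full flush cycle has elapsed. */
    public static boolean outlastsFlushCycle(int attempts, long delayMillis, long flushCycleMillis) {
        return retryWindowMillis(attempts, delayMillis) > flushCycleMillis;
    }

    /**
     * Calls read up to 'attempts' times, sleeping 'delayMillis' between tries.
     * Returns the first non-null result, or null if every try returned null
     * or the thread was interrupted while waiting.
     */
    public static <T> T readWithRetry(Supplier<T> read, int attempts, long delayMillis) {
        for (int i = 0; i < attempts; i++) {
            T result = read.get();
            if (result != null) {
                return result;
            }
            if (i < attempts - 1) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return null;
                }
            }
        }
        return null;
    }
}
```

Assuming 5 attempts, retryWindowMillis(5, 125) gives 500 ms, well under the 5000 ms flush cycle, while retryWindowMillis(5, 2000) gives 8000 ms, which exceeds it. Under that assumption, the shorter delay makes every retry finish before the primary has had a chance to flush.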