Added some synchronization to allocate stamp to complement the sentiment of isLeader and its sync on byte[] d.
I need to be very cautious, but so far the integration tests are passing. I intend to profile it a lot more, but I think I'm off to a decent start now.

On Jan 17, 2018 1:16 PM, "GitBox" <g...@apache.org> wrote:

> kpm1985 opened a new pull request #1004: FLUO-1000 OracleServer race
> conditions
> URL: https://github.com/apache/fluo/pull/1004
>
> This pull request is motivated by issue #1000 and is a work in
> progress. I've identified two main issues, both in OracleServer.
>
> 1) isLeader has a race condition. It is a volatile var, so I've set the
> flag at the beginning of the LeaderSelector callback method takeLeadership.
>
> 2) There are two curator frameworks in OracleServer. One comes from
> sharedResources and doesn't seem to cause any issues, but the one created
> during the start method does. Specifically, when takeLeadership is called,
> the curatorFramework may be in a state that is not
> CuratorFrameworkState.STARTED. One would think blockUntilConnected() would
> resolve this problem, but if you dig into the curator code, the
> state.started is not checked. To be clear, blockUntilConnected does not
> solve the problem. I have found that if you spin on
> CuratorFrameworkState.STARTED, these exceptions disappear.
>
> I'd welcome some analysis when everyone gets a little time. In the
> meantime I'll continue to post on #1000 and leave this section for the
> code changes.
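
For anyone following along, here is a minimal sketch of the kind of spin described in point 2 of the quoted pull request. It is not the code from the PR. The class name StateSpin, the method awaitState, and the timeout and poll parameters are hypothetical, and the helper is written against a generic Supplier so it compiles without Curator on the classpath.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of Fluo or Curator.
final class StateSpin {

  private StateSpin() {}

  /**
   * Polls currentState until it equals expected or the timeout elapses.
   * Returns true if the expected state was observed, false on timeout
   * or if the waiting thread was interrupted.
   */
  static <S> boolean awaitState(Supplier<S> currentState, S expected,
      long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (!expected.equals(currentState.get())) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // gave up: state never reached the expected value
      }
      try {
        Thread.sleep(pollMillis); // back off instead of burning a core
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for callers
        return false;
      }
    }
    return true;
  }
}
```

With Curator available, the call inside takeLeadership would presumably look like StateSpin.awaitState(curatorFramework::getState, CuratorFrameworkState.STARTED, 10000, 50), where the 10 second timeout and 50 ms poll interval are arbitrary placeholder values. Bounding the wait avoids hanging forever if the framework is closed before it ever starts.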