https://issues.apache.org/jira/browse/CURATOR-294
On Thu, Jan 28, 2016 at 11:38 AM, Scott Blum <dragonsi...@gmail.com> wrote: > Nick, > > Looking at this closer, I see your point! There is a potential race > between an update thread updating stat+data, and a reader thread accessing > it, and it looks like we could make some changes to the implementation to > simplify and fix this without affecting the public API. > > Here's a tentative plan: > > 1) Change the existing ChildData to make a hard data ref instead of an > embedded atomic ref. > 2) Create a package-protected subclass (ChildDataWithClear?) that leaves > 'data' null and has a dataRef atomic ref, move clearData() there. > 3) PathChildrenCache would use the subclass. > 4) Refactor TreeCache to combine the stat and data fields into a single > atomic ref on ChildData as you suggested. > > And as you point out, if we cache ChildData objects in the TreeNodes, then > we can avoid a lot of new object construction on queries. > > Sound good? > Scott > > > On Thu, Jan 28, 2016 at 10:21 AM, Nick Hill <apa...@nickhill.org> wrote: > >> Thanks Cameron, Scott, >> >> I'm not aware of any >>> concurrency bugs in TreeCache right now, I don't think I relied on the >>> atomic refs for safety. >>> >> >> I guess I wasn't aware in general that atomic refs had a purpose other >> than this. >> >> It just seems that not guaranteeing consistency of the stat and data in >> the returned ChildData objects restricts the usefulness of the cache, since >> you would typically have to get the data/stat again directly before doing >> any conditional update to a znode. >> >> Ignoring the less-important suggestion for simplifying ChildData itself, >> wouldn't there still be significant value in a simple change to move the >> TreeNode data and stat atomic refs into a single ChildData atomic ref? >> >> Cheers, >> Nick >> >> >> Quoting Scott Blum <dragonsi...@gmail.com>: >> >> HI Nick, >>> >>> TreeCache came later, and literally the only reason for reusing ChildData >>> was to not create additional API surface area, and make for an easier >>> transition / drop-in replacement for NodeCache & PathChildrenCache. >>> Ordinarily, I'm a big fan of immutable objects. I'm not aware of any >>> concurrency bugs in TreeCache right now, I don't think I relied on the >>> atomic refs for safety. >>> >>> The main issue with changing it would just be additional surface area or >>> breaking old client code. >>> >>> HTH! >>> Scott >>> >>> >>> On Wed, Jan 27, 2016 at 10:16 PM, Cameron McKenzie < >>> cammcken...@apache.org> >>> wrote: >>> >>> hey Nick, >>>> Sounds like a reasonable suggestion to me. I'll let Scott chime in >>>> though >>>> as he wrote that code originally and may have had some reason for >>>> structuring it as he has. >>>> cheers >>>> >>>> On Thu, Jan 28, 2016 at 1:42 PM, Nick Hill <apa...@nickhill.org> wrote: >>>> >>>> > Hi, I have been looking at the TreeCache impl and have some questions. >>>> > >>>> > It doesn't look right to me that there's separate atomic refs for a >>>> node's >>>> > data and stat. It seems the stat in a ChildData object obtained from >>>> > getCurrentData() might not correspond to the data that it's with. This >>>> > could be problematic when doing conditional state changes given >>>> assumptions >>>> > about that data. >>>> > >>>> > An obvious and simple solution to this would be to have a single >>>> > AtomicReference<ChildData> field instead, which would have the >>>> additional >>>> > significant benefit of eliminating ChildData obj creation on every >>>> cache >>>> > access. PathChildrenCache works this way, but my understanding was >>>> that >>>> > TreeCache is intended to be a (more flexible) replacement. >>>> > >>>> > Furthermore I'd propose that the data field of ChildData be just a >>>> final >>>> > byte[] instead of an AtomicReference. This would avoid needing two >>>> volatile >>>> > reads to get to the data, and mean that "sharing" these (per above) >>>> is a >>>> > bit safer. The ChildData byte[] AtomicReference is only used by >>>> > PathChildrenCache.clearDataBytes() (not currently used by TreeCache at >>>> > all), and that capability could be easily maintained by having >>>> > PathChildrenCache use it's own simple subclass of ChildData containing >>>> the >>>> > atomic reference. >>>> > If similar capability were to be added to TreeCache, I'd suggest it >>>> would >>>> > be better to just replace the node's ChildData object with a copy that >>>> has >>>> > the byte[] field nulled out (but same stat ref). >>>> > >>>> > I'm fairly new to the code so apologies if there's something I've >>>> > missed/misunderstood! But if there is agreement, I'd also be happy to >>>> > prepare a PR. >>>> > >>>> > Regards, >>>> > Nick >>>> > >>>> >>>> >> >> >