Re: TreeCache question

Scott Blum Thu, 28 Jan 2016 08:43:02 -0800

https://issues.apache.org/jira/browse/CURATOR-294


On Thu, Jan 28, 2016 at 11:38 AM, Scott Blum <[email protected]> wrote:

> Nick,
>
> Looking at this closer, I see your point!  There is a potential race
> between an update thread updating stat+data, and a reader thread accessing
> it, and it looks like we could make some changes to the implementation to
> simplify and fix this without affecting the public API.
>
> Here's a tentative plan:
>
> 1) Change the existing ChildData to make a hard data ref instead of an
> embedded atomic ref.
> 2) Create a package-protected subclass (ChildDataWithClear?) that leaves
> 'data' null and has a dataRef atomic ref, move clearData() there.
> 3) PathChildrenCache would use the subclass.
> 4) Refactor TreeCache to combine the stat and data fields into a single
> atomic ref on ChildData as you suggested.
>
> And as you point out, if we cache ChildData objects in the TreeNodes, then
> we can avoid a lot of new object construction on queries.
>
> Sound good?
> Scott
>
>
> On Thu, Jan 28, 2016 at 10:21 AM, Nick Hill <[email protected]> wrote:
>
>> Thanks Cameron, Scott,
>>
>> I'm not aware of any
>>> concurrency bugs in TreeCache right now, I don't think I relied on the
>>> atomic refs for safety.
>>>
>>
>> I guess I wasn't aware in general that atomic refs had a purpose other
>> than this.
>>
>> It just seems that not guaranteeing consistency of the stat and data in
>> the returned ChildData objects restricts the usefulness of the cache, since
>> you would typically have to get the data/stat again directly before doing
>> any conditional update to a znode.
>>
>> Ignoring the less-important suggestion for simplifying ChildData itself,
>> wouldn't there still be significant value in a simple change to move the
>> TreeNode data and stat atomic refs into a single ChildData atomic ref?
>>
>> Cheers,
>> Nick
>>
>>
>> Quoting Scott Blum <[email protected]>:
>>
>> HI Nick,
>>>
>>> TreeCache came later, and literally the only reason for reusing ChildData
>>> was to not create additional API surface area, and make for an easier
>>> transition / drop-in replacement for NodeCache & PathChildrenCache.
>>> Ordinarily, I'm a big fan of immutable objects.  I'm not aware of any
>>> concurrency bugs in TreeCache right now, I don't think I relied on the
>>> atomic refs for safety.
>>>
>>> The main issue with changing it would just be additional surface area or
>>> breaking old client code.
>>>
>>> HTH!
>>> Scott
>>>
>>>
>>> On Wed, Jan 27, 2016 at 10:16 PM, Cameron McKenzie <
>>> [email protected]>
>>> wrote:
>>>
>>> hey Nick,
>>>> Sounds like a reasonable suggestion to me. I'll let Scott chime in
>>>> though
>>>> as he wrote that code originally and may have had some reason for
>>>> structuring it as he has.
>>>> cheers
>>>>
>>>> On Thu, Jan 28, 2016 at 1:42 PM, Nick Hill <[email protected]> wrote:
>>>>
>>>> > Hi, I have been looking at the TreeCache impl and have some questions.
>>>> >
>>>> > It doesn't look right to me that there's separate atomic refs for a
>>>> node's
>>>> > data and stat. It seems the stat in a ChildData object obtained from
>>>> > getCurrentData() might not correspond to the data that it's with. This
>>>> > could be problematic when doing conditional state changes given
>>>> assumptions
>>>> > about that data.
>>>> >
>>>> > An obvious and simple solution to this would be to have a single
>>>> > AtomicReference<ChildData> field instead, which would have the
>>>> additional
>>>> > significant benefit of eliminating ChildData obj creation on every
>>>> cache
>>>> > access. PathChildrenCache works this way, but my understanding was
>>>> that
>>>> > TreeCache is intended to be a (more flexible) replacement.
>>>> >
>>>> > Furthermore I'd propose that the data field of ChildData be just a
>>>> final
>>>> > byte[] instead of an AtomicReference. This would avoid needing two
>>>> volatile
>>>> > reads to get to the data, and mean that "sharing" these (per above)
>>>> is a
>>>> > bit safer. The ChildData byte[] AtomicReference is only used by
>>>> > PathChildrenCache.clearDataBytes() (not currently used by TreeCache at
>>>> > all), and that capability could be easily maintained by having
>>>> > PathChildrenCache use it's own simple subclass of ChildData containing
>>>> the
>>>> > atomic reference.
>>>> > If similar capability were to be added to TreeCache, I'd suggest it
>>>> would
>>>> > be better to just replace the node's ChildData object with a copy that
>>>> has
>>>> > the byte[] field nulled out (but same stat ref).
>>>> >
>>>> > I'm fairly new to the code so apologies if there's something I've
>>>> > missed/misunderstood! But if there is agreement, I'd also be happy to
>>>> > prepare a PR.
>>>> >
>>>> > Regards,
>>>> > Nick
>>>> >
>>>>
>>>>
>>
>>
>

Re: TreeCache question

Reply via email to