[jira] [Commented] (CURATOR-294) Optimize TreeCache, fix possible concurrency issue

Scott Blum (JIRA) Thu, 28 Jan 2016 08:43:41 -0800

    [ 
https://issues.apache.org/jira/browse/CURATOR-294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121851#comment-15121851
 ]


Scott Blum commented on CURATOR-294:
------------------------------------

Nick,

Looking at this closer, I see your point!  There is a potential race between an 
update thread updating stat+data, and a reader thread accessing it, and it 
looks like we could make some changes to the implementation to simplify and fix 
this without affecting the public API.

Here's a tentative plan:

1) Change the existing ChildData to make a hard data ref instead of an embedded 
atomic ref.
2) Create a package-protected subclass (ChildDataWithClear?) that leaves 'data' 
null and has a dataRef atomic ref, move clearData() there.
3) PathChildrenCache would use the subclass.
4) Refactor TreeCache to combine the stat and data fields into a single atomic 
ref on ChildData as you suggested.

And as you point out, if we cache ChildData objects in the TreeNodes, then we 
can avoid a lot of new object construction on queries.

Sound good?

> Optimize TreeCache, fix possible concurrency issue
> --------------------------------------------------
>
>                 Key: CURATOR-294
>                 URL: https://issues.apache.org/jira/browse/CURATOR-294
>             Project: Apache Curator
>          Issue Type: Improvement
>          Components: Recipes
>            Reporter: Nick Hill
>            Assignee: Scott Blum
>
> Hi, I have been looking at the TreeCache impl and have some questions.
> It doesn't look right to me that there's separate atomic refs for a node's 
> data and stat. It seems the stat in a ChildData object obtained from 
> getCurrentData() might not correspond to the data that it's with. This could 
> be problematic when doing conditional state changes given assumptions about 
> that data.
> An obvious and simple solution to this would be to have a single 
> AtomicReference<ChildData> field instead, which would have the additional 
> significant benefit of eliminating ChildData obj creation on every cache 
> access. PathChildrenCache works this way, but my understanding was that 
> TreeCache is intended to be a (more flexible) replacement.
> Furthermore I'd propose that the data field of ChildData be just a final 
> byte[] instead of an AtomicReference. This would avoid needing two volatile 
> reads to get to the data, and mean that "sharing" these (per above) is a bit 
> safer. The ChildData byte[] AtomicReference is only used by 
> PathChildrenCache.clearDataBytes() (not currently used by TreeCache at all), 
> and that capability could be easily maintained by having PathChildrenCache 
> use it's own simple subclass of ChildData containing the atomic reference.
> If similar capability were to be added to TreeCache, I'd suggest it would be 
> better to just replace the node's ChildData object with a copy that has the 
> byte[] field nulled out (but same stat ref).
> I'm fairly new to the code so apologies if there's something I've 
> missed/misunderstood! But if there is agreement, I'd also be happy to prepare 
> a PR.
> Regards,
> Nick



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CURATOR-294) Optimize TreeCache, fix possible concurrency issue

Reply via email to