Re: [Neo4j] Unrolled Linked List

Niels Hoogeveen Fri, 23 Sep 2011 15:20:59 -0700

Read integrity is really a dog. We haven't even begun to address that in the 
other collections. 
With regards to write locks ( and this is something we should check in 
sortedtree too ) is code like:
 page.setProperty( ITEM_COUNT, ((Integer) page.getProperty( ITEM_COUNT )) + 1 );
This is only threadsafe if the value returned by page.getProperty( ITEM_COUNT ) 
is read from a locked node.
Niels


> Date: Sat, 24 Sep 2011 09:14:07 +1200
> From: bryc...@gmail.com
> To: user@lists.neo4j.org
> Subject: Re: [Neo4j] Unrolled Linked List
> 
> For writing that works well, and in fact for add node I am doing that (just
> realised I am not for remove node but I should be).  The problems for
> reading are:
> 
>    - it should allow multiple threads to read at the same time
>    - it shouldn't dictate that client code has a transaction in order to
>    read
> 
> As a simple solution thats probably workable (and probably the safest), and
> means that HA will just work, but restricting one thread at a time into a
> given node collection isn't the best.
> 
> Maybe the client code should set whether it locks the data structure when
> reading, or fails with a ConcurrentModificationException when reading and
> data is changed.
> 
> On Sat, Sep 24, 2011 at 6:00 AM, Niels Hoogeveen
> <pd_aficion...@hotmail.com>wrote:
> 
> >
> > A quick skim of the code shows me you have a baseNode which is an
> > entrypoint for the ULL. This is a logical candidate node to use for the
> > purpose of locking.
> > What are the pros and cons to locking the baseNode on every read and write
> > operation?
> > Niels
> > > Date: Fri, 23 Sep 2011 09:39:38 +1200
> > > From: bryc...@gmail.com
> > > To: user@lists.neo4j.org
> > > Subject: Re: [Neo4j] Unrolled Linked List
> > >
> > > Good stuff.
> > >
> > > I am presently looking into concurrent use of a given UnrolledLinkedList
> > at
> > > least within the same graph database instance, might be a little bit
> > harder
> > > in HA environment.  Its hard enough writing test cases for this, maybe
> > even
> > > harder than making it work properly!  Hoping that some utility code I am
> > > going to produce will help with testing concurrency of other data
> > > structures.
> > >
> > > By concurrent use I mean concurrent use of the data within the graph, not
> > of
> > > the given instantiation of the class, e.g. what happens when one thread
> > gets
> > > an instance of ULL based off a given node and is iterating over it, then
> > > another thread gets an instance of a ULL and writes into it.
> > >
> > > Cheers
> > > Bryce
> > >
> > > On Fri, Sep 23, 2011 at 4:57 AM, Niels Hoogeveen
> > > <pd_aficion...@hotmail.com>wrote:
> > >
> > > >
> > > > It looks really cool.
> > > > I always find it fun to create something and later find out it is an
> > > > already known construction (something worth inventing).
> > > > Anyway, I pulled your code and will removed the dependencies to the
> > > > Enhanced API stuff this week. After that we can start adding some
> > > > documentation.
> > > > Niels
> > > >
> > > > > Date: Thu, 22 Sep 2011 15:57:13 +1200
> > > > > From: bryc...@gmail.com
> > > > > To: user@lists.neo4j.org
> > > > > Subject: [Neo4j] Unrolled Linked List
> > > > >
> > > > > Hi all,
> > > > >
> > > > > I have added an in graph representation of an unrolled linked list to
> > the
> > > > > graph collections code, currently just in my githug repo:
> > > > > https://github.com/brycenz/graph-collections
> > > > >
> > > > > See this in particular:
> > > > >
> > > >
> > https://github.com/brycenz/graph-collections/blob/master/src/main/java/org/neo4j/collections/list/UnrolledLinkedList.java
> > > > >
> > > > > The name comes from:
> > > > > http://en.wikipedia.org/wiki/Unrolled_linked_list
> > > > >
> > > > > And it works roughly in the same manner, though I had the idea prior
> > to
> > > > > reading the wiki article.
> > > > >
> > > > > As the UnrolledLinkedList class implements the NodeCollection
> > interface
> > > > it
> > > > > can be used as the backing of an IndexedRelationship, which is done
> > in
> > > > tests
> > > > > here:
> > > > >
> > > >
> > https://github.com/brycenz/graph-collections/blob/master/src/test/java/org/neo4j/collections/indexedrelationship/TestUnrolledLinkedListIndexedRelationship.java
> > > > >
> > > > > The main reason for me being interested in this, and an example of
> > where
> > > > > this is (probably) really useful is in the following case:
> > > > >
> > > > >    - you have a number of tag (or category, folder etc.) nodes
> > > > >    - they each link to a large number of document (or article,
> > comments,
> > > > >    post etc.) nodes
> > > > >    - using a single relationship type
> > > > >    - you generally only are interested in showing the newest
> > documents in
> > > > >    descending date order (showing the head, in a paged ui)
> > > > >    - documents are generally added in ascending date order (added to
> > the
> > > > >    head)
> > > > >
> > > > > The benefits come from being able to iterate over a small percentage
> > of a
> > > > > collection of nodes in a fixed order without having to first load all
> > the
> > > > > nodes and sort them.  This reduces the amount of data read in from
> > disk,
> > > > > reduces the turnover of data in memory, and therefore aids with
> > reduction
> > > > in
> > > > > garbage collection.  In my case I have a large number of tags with a
> > > > large
> > > > > number of items against them, I might read the first 100-200 items
> > out of
> > > > a
> > > > > collection of 30,000 and therefore by not reading in the other 29800
> > > > > relationships / nodes (per tag) I should be saving 90% or more.....
> > > > here's
> > > > > hoping.
> > > > >
> > > > > From the java doc: The structure is broken into "pages" of links to
> > nodes
> > > > > where the size of the page can be controlled at initial construction
> > > > time.
> > > > > Page size is not fixed but instead can float between a lower bound,
> > and
> > > > an
> > > > > upper bound. The bounds are at a fixed margin from the page size of
> > M.
> > > > When
> > > > > a page drops below the lower bound it will be joined onto the an
> > adjacent
> > > > > page, and when the page goes above the upper bound it will be split
> > in
> > > > half.
> > > > >
> > > > > I am about to do some tests with this based on my use case and will
> > > > report
> > > > > back on the performance impacts.
> > > > >
> > > > > Cheers
> > > > > Bryce
> > > > >
> > > > > P.S. still thinking about how to make this thread safe, any
> > suggestions
> > > > > would be appreciated (presently only one thread will be able to write
> > at
> > > > a
> > > > > time, I am worried about a thread reading while another is writing,
> > > > > especially when it joins / splits pages or changes the head).
> > > > > _______________________________________________
> > > > > Neo4j mailing list
> > > > > User@lists.neo4j.org
> > > > > https://lists.neo4j.org/mailman/listinfo/user
> > > >
> > > > _______________________________________________
> > > > Neo4j mailing list
> > > > User@lists.neo4j.org
> > > > https://lists.neo4j.org/mailman/listinfo/user
> > > >
> > > _______________________________________________
> > > Neo4j mailing list
> > > User@lists.neo4j.org
> > > https://lists.neo4j.org/mailman/listinfo/user
> >
> > _______________________________________________
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
> >
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
                                          
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Re: [Neo4j] Unrolled Linked List

Reply via email to