Re: [Neo4j] IndexedRelationship some observations and questions

Bryce Mon, 05 Sep 2011 03:20:34 -0700

Hi Peter,

I have made IndexedRelationshipExpander public, added some extra Javadocs
to IndexedRelationship, and, more importantly, changed IndexedRelationship
to allow for multiple indexed relationships from a given node.  I will look
into more changes here once I get back to working with this (time pressures
etc.).


BTW thought you might be interested in this.  As I said above I started
looking into this due to performance issues.  It turns out that the main
performance issues I was having were due to garbage collection.  From my
understanding, Neo4j keeps a lot of cached data behind weak/soft references,
so it is understandable for the heap to be rather full.  By default the CMS
garbage collector doesn't kick in until the heap is 92% full, so if you are
rapidly filling the heap, and it is already rather full, then the last 8%
can fill up before CMS manages to do its sweep.  At that point Java falls
back to a "stop the world" garbage collection, and I was seeing waits of up
to 30 sec!
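
One way to check whether long pauses really line up with full collections
is to read the JVM's own collector counters.  The following is only a
minimal sketch using the standard java.lang.management API; the class name
GcReport is made up for illustration and is not part of Neo4j.  The
collector names it prints depend on the JVM and the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Neo4j): summarizes GC activity so far.
public class GcReport {

    /** Returns one summary line per collector registered in this JVM. */
    public static List<String> collect() {
        List<String> lines = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined for a collector.
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", totalMillis=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : collect()) {
            System.out.println(line);
        }
    }
}
```

Calling collect() periodically from inside the application and comparing
successive readings shows which collector is accumulating the time.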

You can adjust this via, e.g.:
-XX:CMSInitiatingOccupancyFraction=75
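
As a sketch of a fuller flag set, assuming a HotSpot JVM of the kind
discussed in this thread (CMS has been removed from recent JDKs): the
occupancy fraction is usually combined with the flag that selects CMS and,
as far as I know, with -XX:+UseCMSInitiatingOccupancyOnly, without which
the JVM treats the fraction only as a starting hint and then falls back to
its own heuristics.  The value 75 is just the example figure from above,
not a recommendation.

```
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
```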

I am interested in seeing how the new Garbage First collector would work
out, but not so keen to put it in production right now, so I am running it
in dev.  This can be enabled via:
-XX:+UnlockExperimentalVMOptions -XX:+UseG1GC

No feedback on my other questions about crossover point, and navigating the
relationships from the other end?  Is there documentation about the
SortedTree implementation?

Cheers
Bryce

On Fri, Sep 2, 2011 at 6:55 PM, Peter Neubauer <
peter.neuba...@neotechnology.com> wrote:

> Bryce,
>
> On Fri, Sep 2, 2011 at 1:44 AM, Bryce <bryc...@gmail.com> wrote:
>
> > Hi,
> >
> > I have been looking at performance options for Neo4j, as I am presently
> > observing a number of performance issues.  I am still investigating how
> > to get the best performance out of what I am doing, and one cause might
> > be longer-running transactions stopping other work from going on (but
> > that's an aside to what this message is about).
> >
> > One of the things that I investigated using was the IndexedRelationship
> > work by Niels.  Thought I would give a bit of feedback, although I
> > haven't quite got this implemented at present.
> >
> > 1) I had to change the IndexedRelationshipExpander to be a public class
> > in order to use it outside the package it's in.
> >
> Maybe you can fork the project and add your fix there?
>
>
> > 2) IndexedRelationship assumes only one tree root per node, whereas the
> > expander allows for multiple (IndexedRelationship uses
> > getSingleRelationship, whereas the expander uses getRelationships and
> > then matches on tree name).  Having multiple would obviously be good, as
> > it means you could have two types of relationships covered by
> > IndexedRelationships.
> >
> > 3) Might pay to make it clear in the Javadocs for IndexedRelationship
> > that the comparator can't be an anonymous inner class.
> >
> See above, sounds like a good contribution!
>
>
> > Then I have some questions about usage of this.  First a little
> > background on the model I have; from reading a few things it seems quite
> > standard.  There are a lot of "document" nodes, each of which has a
> > relationship with multiple "tag" nodes.  Documents generally have on the
> > order of 10-20 tags, and tags can have as few as 1 document and
> > sometimes tens of thousands.  When tags are viewed through the UI they
> > are almost always displayed with a descending date-ordered list of
> > documents.  Seemed to me to fit quite well with IndexedRelationship.
> >
> > 1) I was thinking of having a switchover point at, say, around 500
> > documents for a given node, where I will switch from using normal
> > relationships to an IndexedRelationship, as I was thinking that at small
> > numbers of relationships normal relationships would be quicker.  Would
> > that be correct, or not worth it?
> >
> > 2) On the tag end (which is the incoming end of the document-tag
> > relationship) I was going to use an IndexedRelationshipExpander, which
> > would cover the case of whether the relationship was done through normal
> > relationships or through an IndexedRelationship.  I also need to get a
> > set of tags from the document end, where there may be both normal
> > relationships and relationships coming from multiple
> > IndexedRelationships.  From looking at it, IndexedRelationshipExpander
> > doesn't cover the reverse direction, but I would imagine using a
> > relationship expander here would be correct.  What would be the best way
> > of doing this?
> >
> > As an aside it may be a good idea to note in the configuration settings
> > page:
> >
> > http://wiki.neo4j.org/content/Configuration_Settings#Optimizing_for_traversals_example
> > that -XX:+UseNUMA only works when using the Parallel Scavenger garbage
> > collector (default or -XX:+UseParallelGC), not the concurrent mark and
> > sweep one.  Based on
> >
>
> Changed in the Wiki, see
>
> http://wiki.neo4j.org/content/Configuration_Settings#Configuring_the_memory_for_the_JVM
> .
> Thanks for the hint! We should get this page into the manual.
>
> /peter
>
> >
> > Cheers
> > Bryce
> > _______________________________________________
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
