Agelos, we are just testing help.neo4j.org; you might try starting a discussion there if that is more convenient. It's web-based and should read more like a forum, if you like that style.
Cheers,

/peter neubauer

GTalk:    neubauer.peter
Skype:    peter.neubauer
Phone:    +46 704 106975
LinkedIn: http://www.linkedin.com/in/neubauer
Twitter:  http://twitter.com/peterneubauer

http://www.neo4j.org - Your high performance graph database.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Wed, Jun 15, 2011 at 8:53 PM, Agelos Pikoulas <agelos.pikou...@gmail.com> wrote:
> Dear all
>
> /**** I'm sorry if I can't use the user@lists properly, I am indeed lost :-(
> Neo4J would be so much better as a forum or a stackOverNeo4J :-) ***/
>
> Allow me to say that the 50K magic number is not very useful for real,
> practical, modern social network apps.
>
> What if there are simply a couple of million "Person" nodes that may "LIKE"
> the "Movie" nodes?
> And what if I have a few million Movies and many millions of Persons?
> It's a typical case for a "movie" to have a few 100K ratings/votes. And
> imagine if I have Song, Book & Product nodes!
>
> I think this issue is *MAJOR* and it needs to be promoted to a high priority
> for the neo4j team.
>
> The proxy solution sounds wonderful, but it can be quite a hassle if it's not
> properly encapsulated & transparent.
> I think all Traversals will become quite hacked & I can't even think what
> will happen to Object mapping etc.
> I imagine it COULD be part of an upcoming version of the new & amazing
> Spring Data Graph framework (check it out!),
> where a simple Annotation such as @NodeWithProxy, along with information on
> which *RelationshipTypes / Directions* should go to the real or the proxy
> Node, could do all of the proxy magic!!!
>
> But the *RelationshipType/Direction indexing* I proposed, I dare say, could
> be a more generic and cleaner idea, and also a quicker hack!
>
> All we need is a method TraversalDescription.*index("myIndex");* where we
> can declare which "index" should be used for looking up
> the (few) RelationshipTypes/Directions among the millions on the Node.
> The best thing is that we have already declared those on
> TraversalDescription.*relationships(*MyRelationshipType.hasPart,
> Direction.OUTGOING).
>
> The *Traversal* would then follow (only) those found on the index! Bingo!!!!
>
> We could also have a *.followIndexedOnly(false)* and even
> *recreateFollowedIndexes(true)* to save us next time!
>
> In any case, something must be implemented!
>
> Without being an expert on neo4j, I think there is a lot of indexing
> optimization still needed!
>
> Michael, what do you think? Could you please see to this being promoted to
> the team, and share their views?
>
> Agelos
>
> Date: Wed, 15 Jun 2011 17:57:55 +0200
> From: "Balazs E. Pataki" <pat...@dsd.sztaki.hu>
> Subject: Re: [Neo4j] Slow Traversals on Nodes with too many Relationships
> To: Neo4j user discussions <user@lists.neo4j.org>
> Message-ID: <4df8d683.8010...@dsd.sztaki.hu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi,
>
> when we started to evaluate neo4j we made some measurements, and for us
> it seemed that 50.000 is a magical number: this many relationships and
> properties on one node seemed to be a limit which, once reached, makes
> things slow. But we didn't actually need that many relationships/properties
> in our case, so we could live with it, or could make workarounds (e.g.
> storing things in properties and doing indexed lookups instead of using
> relationships).
>
> An automatic indexed lookup on relationship types and directions would
> be awesome, definitely.
>
> Regards,
> ---
> balazs
>
>
> Date: Wed, 15 Jun 2011 23:19:32 +0800
> From: Craig Taverner <cr...@amanzi.com>
> Subject: Re: [Neo4j] Slow Traversals on Nodes with too many Relationships
> To: Neo4j user discussions <user@lists.neo4j.org>
> Message-ID: <banlktine_mk5+9damh07tsrq6nnxifo...@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Could this also be related to the possibility that, in order to determine
> relationship type and direction, the relationships need to be loaded from
> disk? If so, then having a large number of relationships on the same node
> would decrease performance, if the number was large enough to affect the
> disk IO caching.
>
> If this is the case, perhaps adding a proxy node for the incoming
> relationships would work around the problem? Of course then you have doubled
> the number of part nodes (two for each part: one part and one container
> proxy).
>
>
> Date: Wed, 15 Jun 2011 18:40:05 +0300
> From: Agelos Pikoulas <agelos.pikou...@gmail.com>
> Subject: Re: [Neo4j] Slow Traversals on Nodes with too many Relationships
> To: user@lists.neo4j.org
> Message-ID: <banlktinsmadbatf1rglaj4wqngkfekj...@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> I have to respectfully agree with Rick Bullotta.
>
> I suspected the big-O is not linear for this case.
>
> To verify, I added x4 Container nodes (400.000) and their appropriate
> Relationships, and it is now *unbelievably* slow:
> It does not take x4 longer; it takes more than 30-40 seconds for each
> next(). Mind you, 100K nodes = ~2 secs for each next()!!!
>
> And only to make matters worse, the subsequent runs weren't fast either -
> they actually took more time than the first
> (1st TotalTraversalTime= 389936ms, 2nd TotalTraversalTime= 443948ms)
>
> The whole setup is running on
> Eclipse 3.6, with -Xmx512m on JavaVM,
> Windows2003 VMWare machine with 4GB, running on a fast 2nd gen SSD (OCZ
> Vertex 2). The neo4J data resides on this SSD.
> The 100.000 nodes data files were ~250MB, the 400.000 one is ~1GB.
>
> I wonder what would happen if the Container nodes were a few million (which
> will be my case) - it will run forever.
>
> Could you please look into my suggestion, i.e. "Using a 'smart' behind
> the scenes Indexing on both *RelationshipType* and *Direction* that
> Traversals actually use to boost things up"?
>
> On another topic, how does one use this mailing list? I use it through
> gmail and I am utterly lost - is there a better client/UI to actually
> post/reply into threads?
>
>
> ------------------------------
>
> Message: 1
> Date: Wed, 15 Jun 2011 07:27:26 -0700
> From: Rick Bullotta <rick.bullo...@thingworx.com>
> Subject: Re: [Neo4j] Slow Traversals on Nodes with too many Relationships
> To: Neo4j user discussions <user@lists.neo4j.org>
> Message-ID:
>   <09df3402c845ec489a3323a06208f20d0a9d4...@p3pw5ex1mb14.ex1.secureserver.net>
> Content-Type: text/plain; charset="us-ascii"
>
> I would respectfully disagree that it doesn't necessarily represent
> production usage, since in some cases, each query/traversal will be unique
> and isolated to a part of a subgraph, so in some cases, a "cold" query may
> be the norm....
>
> -----Original Message-----
> From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On
> Behalf Of Michael Hunger
> Sent: Wednesday, June 15, 2011 10:25 AM
> To: Neo4j user discussions
> Subject: Re: [Neo4j] Slow Traversals on Nodes with too many Relationships
>
> That is rather a case of warming up your caches.
>
> Determining the traversal speed from the first run is not a good benchmark
> as it doesn't represent production usage :)
> The same (warming up) is true for all kinds of benchmarks (except for
> startup performance benchmarks).
>
> Cheers
>
> Michael
>
> On 15.06.2011 at 14:48, Agelos Pikoulas wrote:
>
>> I have a few "Part" nodes related to each other via "HASPART"
>> relationships/edges
>> (e.g. Part1---HASPART--->Part2---HASPART--->Part3 etc).
>> TraversalDescription works fine, following each Part's outgoing HASPART
>> relationship.
>>
>> Then I add a large number (say 100.000) of "Container" nodes, where each
>> "Container" has a "CONTAINS" relation to almost *every* "Part" node.
>> Hence each Part node now has 100.000 incoming CONTAINS relationships from
>> Container nodes, but only a few outgoing HASPART relationships to other
>> Part nodes.
>>
>> Now my previous TraversalDescription runs extremely slowly (several seconds
>> inside each Iterator<Path>.next() call).
>> Note that I do define relationships(RT.HASPART, Direction.OUTGOING) on the
>> TraversalDescription, but it seems it's not used by neo4j as a hint.
>> Note that on a subsequent run of the same Traversal, it's very quick indeed.
>>
>> Is there any way to use indexing on relationships for such a scenario, to
>> boost things up?
>>
>> Ideally, the Traversal framework could use automatic/declarative indexing
>> on Node Relationship types and/or direction to perform such traversals
>> quicker.
>>
>> Regards
>> _______________________________________________
>> Neo4j mailing list
>> User@lists.neo4j.org
>> https://lists.neo4j.org/mailman/listinfo/user
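
For readers following the thread, the relationship type/direction indexing idea can be sketched in plain Java. This is only a toy in-memory model under stated assumptions: it does not use the Neo4j API and says nothing about how Neo4j actually stores relationships. Every name in it (DenseNodeSketch, Node, Rel, scan, lookup, visited) is hypothetical. It just contrasts filtering a flat relationship list with looking up relationships that were grouped by type and direction when they were added.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model, NOT the Neo4j API or its storage format.
// Every class, field and method name below is hypothetical.
class DenseNodeSketch {

    enum Direction { OUTGOING, INCOMING }

    /** One relationship as seen from a single node. */
    static final class Rel {
        final String type;
        final Direction direction;
        final int otherNodeId;

        Rel(String type, Direction direction, int otherNodeId) {
            this.type = type;
            this.direction = direction;
            this.otherNodeId = otherNodeId;
        }
    }

    /** A node holding the same relationships twice: flat, and grouped by type and direction. */
    static final class Node {
        final List<Rel> all = new ArrayList<>();
        final Map<String, Map<Direction, List<Rel>>> grouped = new HashMap<>();
        long visited; // relationship records touched by the most recent query

        void add(String type, Direction direction, int otherNodeId) {
            Rel rel = new Rel(type, direction, otherNodeId);
            all.add(rel);
            grouped.computeIfAbsent(type, t -> new EnumMap<Direction, List<Rel>>(Direction.class))
                   .computeIfAbsent(direction, d -> new ArrayList<Rel>())
                   .add(rel);
        }

        /** Flat scan: inspects every relationship on the node and filters. */
        List<Rel> scan(String type, Direction direction) {
            visited = 0;
            List<Rel> hits = new ArrayList<>();
            for (Rel rel : all) {
                visited++;
                if (rel.type.equals(type) && rel.direction == direction) {
                    hits.add(rel);
                }
            }
            return hits;
        }

        /** Grouped lookup: goes straight to the relationships of the requested type and direction. */
        List<Rel> lookup(String type, Direction direction) {
            Map<Direction, List<Rel>> byDirection = grouped.get(type);
            List<Rel> hits = (byDirection == null) ? null : byDirection.get(direction);
            if (hits == null) {
                hits = Collections.emptyList();
            }
            visited = hits.size();
            return hits;
        }
    }

    public static void main(String[] args) {
        // A "Part" node with a few outgoing HASPART and many incoming CONTAINS relationships.
        Node part = new Node();
        for (int i = 0; i < 3; i++) {
            part.add("HASPART", Direction.OUTGOING, i);
        }
        for (int i = 0; i < 100_000; i++) {
            part.add("CONTAINS", Direction.INCOMING, 1_000 + i);
        }

        part.scan("HASPART", Direction.OUTGOING);
        System.out.println("flat scan touched " + part.visited + " relationships");

        part.lookup("HASPART", Direction.OUTGOING);
        System.out.println("grouped lookup touched " + part.visited + " relationships");
    }
}
```

In this toy model the flat scan touches all 100,003 relationships to find the 3 HASPART ones, while the grouped lookup touches only those 3. Whether a real store behaves this way depends on its on-disk layout and caching, which this sketch does not model.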