ok fully reverting the entire project worked (that is without that fix):
with Direction.BOTH:
(one)-->(three) 8,438 ms
(one)-->(two) 873 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(one)-->(three) 734 ms
(one)-->(two) 743 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(one)-->(three) 784 ms
(one)-->(two) 621 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(one)-->(three) 682 ms
(one)-->(two) 733 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(one)-->(three) 620 ms
(one)-->(two) 683 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
with Direction.INCOMING:
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
with Direction.OUTGOING:
(one)-->(three) 762 ms
(one)-->(two) 621 ms
(one)-->(three) 712 ms
(one)-->(two) 631 ms
(one)-->(three) 672 ms
(one)-->(two) 773 ms
(one)-->(three) 620 ms
(one)-->(two) 743 ms
(one)-->(three) 620 ms
(one)-->(two) 683 ms
Environment is about to shut down the neo4j database
Environment shutting down complete 2,721 ms


and with that *fix*:

with Direction.BOTH:
(one)-->(three) 8,710 ms
(one)-->(two) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(one)-->(three) 1,143 ms
(one)-->(two) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(one)-->(three) 763 ms
(one)-->(two) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(one)-->(three) 814 ms
(one)-->(two) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(one)-->(three) 897 ms
(one)-->(two) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
with Direction.INCOMING:
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
(three)<--(one) 0 ms
(two)<--(one) 0 ms
with Direction.OUTGOING:
(one)-->(three) 630 ms
(one)-->(two) 0 ms
(one)-->(three) 763 ms
(one)-->(two) 0 ms
(one)-->(three) 842 ms
(one)-->(two) 0 ms
(one)-->(three) 621 ms
(one)-->(two) 0 ms
(one)-->(three) 742 ms
Environment is about to shut down the neo4j database
(one)-->(two) 0 ms
Environment shutting down complete 2,711 ms


So yeah, one-->two got fixed, what say you about one-->three though ? :D
*whistle* :-"

On Sun, Jul 24, 2011 at 5:28 PM, John cyuczieekc <cyuczie...@gmail.com>wrote:

> updated to latest from github, "Stops the algo as soon as possible in
> findSinglePath"
>
> graphdb contains:
> one-->three
> one-->{ a million other random nodes }
> one-->two
> { 10 random nodes } --> two
>
> (added in that order, except that `one-->two` is somewhere between those 10
> random nodes)
>
> output:
>
> with Direction.BOTH:
> (one)-->(three) 7,660 ms
> (one)-->(two) 0 ms
> (three)<--(one) 0 ms
> (two)<--(one) 0 ms
> (one)-->(three) 2,195 ms
> (one)-->(two) 0 ms
> *(three)<--(one) 0 ms*
> (two)<--(one) 0 ms
> *(one)-->(three) 875 ms*
> (one)-->(two) 0 ms
> (three)<--(one) 0 ms
> (two)<--(one) 0 ms
> (one)-->(three) 630 ms
> (one)-->(two) 0 ms
> (three)<--(one) 0 ms
> (two)<--(one) 0 ms
> (one)-->(three) 723 ms
> (one)-->(two) 0 ms
> (three)<--(one) 0 ms
> (two)<--(one) 0 ms
> with Direction.INCOMING:
> (three)<--(one) 0 ms
> (two)<--(one) 0 ms
> (three)<--(one) 0 ms
> (two)<--(one) 0 ms
> (three)<--(one) 0 ms
> (two)<--(one) 0 ms
> (three)<--(one) 0 ms
> (two)<--(one) 0 ms
> (three)<--(one) 0 ms
> (two)<--(one) 0 ms
> with Direction.OUTGOING:
> (one)-->(three) 802 ms
> (one)-->(two) 0 ms
> with Direction.OUTGOING:
> (one)-->(three) 631 ms
> (one)-->(two) 0 ms
> with Direction.OUTGOING:
> (one)-->(three) 732 ms
> (one)-->(two) 0 ms
> with Direction.OUTGOING:
> (one)-->(three) 873 ms
> (one)-->(two) 0 ms
> with Direction.OUTGOING:
> (one)-->(three) 630 ms
>
> Environment is about to shut down the neo4j database
> (one)-->(two) 0 ms
> Environment shutting down complete 3,171 ms
>
> For some reason, one-->three takes more than 0ms, even though three has
> only 1 incoming rel, and it's the first rel of `one` that outgoes to `three`
>
> (for whoever cares)the test program used is in this commit here:
>
> https://github.com/13th-floor/neo4john/commit/b819a0d418d953a675aaa74749f284b88e4f47ee
>
> I tried to revert that commit, and for some reason I'm getting the same
> results [one-->three is the only one over 0ms] (maybe I failed in reverting
> it, though the 2 sources seem reverted)
>
> Peace :)
>
>
> On Sun, Jul 24, 2011 at 4:49 PM, Mattias Persson <
> matt...@neotechnology.com> wrote:
>
>> It doesn't matter since the algorithm is bi-directional... so:
>>
>>    one --> two
>>
>> will start from "one" OUTGOING and "two" INCOMING, whereas
>>
>>    two <-- one
>>
>> will start from "two" INCOMING and "one" OUTGOING.... see, no difference.
>>
>> It alternates side for each relationship. It will, however depend on where
>> the INCOMING/OUTGOING relationships reside in the relationship chain, but
>> sticking to this discussion: those two calls will yield the exact same
>> speed. Though I just discovered a little thingie where findSinglePath
>> didn't
>> stop right away when after finding the first one, but now it does!
>>
>> 2011/7/24 Niels Hoogeveen <pd_aficion...@hotmail.com>
>>
>> >
>> > Are you sure this is true, Mattias?The response time of a
>> getRelationship
>> > call depends on the total number of relationships on the node. So it
>> makes a
>> > difference which side of the relationship makes the call. It is always
>> > faster to ask it from the side that has the lowest total number of
>> > relationships attached. This is even true, if for both sides of the
>> > relationship there is only one relationship of that particular
>> relationship
>> > type.Niels
>> > > Date: Sun, 24 Jul 2011 16:10:42 +0200
>> > > From: matt...@neotechnology.com
>> > > To: user@lists.neo4j.org
>> > > Subject: Re: [Neo4j] shortestPath slower than it could be
>> > >
>> > > What I'm saying is that:
>> > >
>> > >   one --> two    (one,two) for OUTGOING
>> > >
>> > > and
>> > >
>> > >   two <-- one   (two,one) for INCOMING
>> > >
>> > > should yield the same timing.
>> > >
>> > > 2011/7/24 John cyuczieekc <cyuczie...@gmail.com>
>> > >
>> > > > Thanks Mattias. The way I understand what you said is, that swapping
>> > `one`
>> > > > and `two` in *finder.findSinglePath( one, two );*  should yield the
>> > same
>> > > > timing when Direction.BOTH is used; this would be great. But I don't
>> > see
>> > > > how
>> > > > this is possible (due to not knowing how it's stored too) unless you
>> > know
>> > > > how many rels each node has, OR even better storage is using BTree-s
>> > (?!)
>> > > > Btw, is it possible/practical that neo4j could store IDs in BTrees ?
>> > > > (someone said that for rels it's a double linked list instead- I'll
>> > need to
>> > > > recheck)
>> > > >
>> > > > irrelevant stuff follows (ie. don't read)
>> > > > ----
>> > > > Until then , I lame-tested something (with both neo4j and
>> berkeleydb):
>> > > > a node `one` having 1mil rels to 1million unique nodes, and one more
>> to
>> > a
>> > > > node `two`, and node `two` having 10 incoming rels from other unique
>> > nodes,
>> > > > +1 the one rel which was already from `one`, trying shortestPath
>> > between
>> > > > `one` and `two` (not `two` and `one`) with Direction.BOTH
>> > > > output:
>> > > > just created 1,000,000 rels, took=58,381 ms  (1)
>> > > > Path from one to two: (one)-->(two) 121,310 ms (2)
>> > > > Path from one to two: (one)-->(two) timedelta=1,066 ms
>> > > > Path from one to two: (one)-->(two) timedelta=795 ms
>> > > > Path from one to two: (one)-->(two) timedelta=772 ms
>> > > >
>> > > > (1) and (2) happened in the same transaction (ie. before
>> tx.finish()),
>> > the
>> > > > others after transaction finished (an in no new transaction - since
>> > they
>> > > > were only reads) - this also means some caching must've happened
>> from
>> > > > before.
>> > > > Because it happened in same transaction, adding 1 mil relationships
>> is
>> > > > slower than adding them in bursts of x, I am aware of this (and I'll
>> > try to
>> > > > make a bench with that). For ie. 100k nodes neo4j is actually
>> faster.
>> > > > I even got this once (but in another modified bench):
>> > > > first time creating the relationships...
>> > > > just created 1,000,000 rels, took=49,307 ms (cpu was 100% here,
>> instead
>> > of
>> > > > the usual 77% limit)
>> > > > Exception in thread "main" java.lang.OutOfMemoryError: Java heap
>> space
>> > > > Environment is about to shut down the neo4j database
>> > > > Exception in thread "Thread-0" java.lang.OutOfMemoryError: GC
>> overhead
>> > > > limit
>> > > > exceeded
>> > > >    at java.util.logging.LogManager.reset(LogManager.java:835)
>> > > >    at java.util.logging.LogManager$Cleaner.run(LogManager.java:240)
>> > > > Environment shutting down complete 23,962 ms
>> > > >
>> > > > Exception: java.lang.OutOfMemoryError thrown from the
>> > > > UncaughtExceptionHandler in thread "main"
>> > > >
>> > > >
>> > > > this is the output when running cold, (ie. the data already
>> existed):
>> > > > search started
>> > > > Path from one to two: (one)-->(two) timedelta=7,934 ms
>> > > > search started
>> > > > Path from one to two: (one)-->(two) timedelta=2,027 ms
>> > > > search started
>> > > > Path from one to two: (one)-->(two) timedelta=875 ms
>> > > > Environment is about to shut down the neo4j database
>> > > > Environment shutting down complete 2,971 ms
>> > > >
>> > > > and the exact same thing when using berkeleydb (not using
>> transactions
>> > > > though):
>> > > > just created 1,000,000 rels, took=29,418 ms
>> > > > Path from one to two: true 0 ms
>> > > > Path from one to two: true 0 ms
>> > > > Path from one to two: true 0 ms
>> > > > class org.bdb.BerkEnv$1 shutting down complete 0 ms  (most was 10 ms
>> > here)
>> > > >
>> > > > and when cold:
>> > > > Path from one to two: true 10 ms
>> > > > Path from one to two: true 0 ms
>> > > > Path from one to two: true 0 ms
>> > > > class org.bdb.BerkEnv$1 shutting down complete 0 ms
>> > > > (the most was 10ms on first(sometimes 0ms, 10ms, 0ms), least was 0ms
>> on
>> > > > all)
>> > > >
>> > > > in bdb, this is actually how bdb does the search (I don't need to
>> make
>> > sure
>> > > > I iterate or something similar, on the node `two` which has least
>> > incoming
>> > > > rels), in fact I execute a search on both nodes, one after the other
>> > and
>> > > > make sure one->two and two<-one (since I am using two databases),
>> > anyway,
>> > > > by
>> > > > using db.getSearchBoth(...) the search is done *that *fast. I hope
>> to
>> > see
>> > > > this kind of fastness with neo4j too xD (unless I'm doing it wrong)
>> > > >
>> > > > if I put neo4j to find shortest path on incoming on `two` then:
>> > > > search started
>> > > > Path from one to two: (two)<--(one) timedelta=20 ms
>> > > > search started
>> > > > Path from one to two: (two)<--(one) timedelta=0 ms
>> > > > search started
>> > > > Path from one to two: (two)<--(one) timedelta=0 ms
>> > > > Environment is about to shut down the neo4j database
>> > > > Environment shutting down complete 2,590 ms
>> > > > (most was 20ms, least was 20ms on first)
>> > > >
>> > > > TestLinkage.java (both progs) were used from here:
>> > > >
>> > > >
>> >
>> https://github.com/13th-floor/neo4john/commit/a9f4b274de1d6c9ec9f1ea4a338b5c42325f19a4
>> > > >
>> > > > --------------
>> > > >
>> > > > Here's a lame benchmark for neo4j when I added another node `three`
>> > which
>> > > > is
>> > > > the first rel `one`->`three`, then added 1mil `one`->(random nodes)
>> ,
>> > and
>> > > > then(as before) `one`->`two`
>> > > > `three` has no other relationships
>> > > > Also, creating nodes 10k per transaction until 1mil are reached.
>> > > >
>> > > > first time creating the relationships...
>> > > > just created 1,000,000 rels, took=28,889 ms  (btw my cpu is limited
>> to
>> > 77%
>> > > > of its normal speed; usually!)
>> > > > closed transaction
>> > > > search started
>> > > > (one)-->(three) 3,234 ms
>> > > > search started
>> > > > (one)-->(two) 661 ms
>> > > > search started
>> > > > (three)<--(one) 10 ms
>> > > > search started
>> > > > (two)<--(one) 0 ms
>> > > > search started
>> > > > (one)-->(three) 631 ms
>> > > > search started
>> > > > (one)-->(two) 913 ms
>> > > > search started
>> > > > (three)<--(one) 0 ms
>> > > > search started
>> > > > (two)<--(one) 0 ms
>> > > > search started
>> > > > (one)-->(three) 610 ms
>> > > > search started
>> > > > (one)-->(two) 773 ms
>> > > > search started
>> > > > (three)<--(one) 0 ms
>> > > > search started
>> > > > (two)<--(one) 0 ms
>> > > > Environment is about to shut down the neo4j database
>> > > > Environment shutting down complete 3,801 ms
>> > > >
>> > > > and on cold run:
>> > > > search started
>> > > > (one)-->(three) 8,454 ms
>> > > > search started
>> > > > (one)-->(two) 885 ms
>> > > > search started
>> > > > (three)<--(one) 0 ms
>> > > > search started
>> > > > (two)<--(one) 0 ms
>> > > > search started
>> > > > (one)-->(three) 752 ms
>> > > > search started
>> > > > (one)-->(two) 755 ms
>> > > > search started
>> > > > (three)<--(one) 0 ms
>> > > > search started
>> > > > (two)<--(one) 0 ms
>> > > > search started
>> > > > (one)-->(three) 804 ms
>> > > > search started
>> > > > (one)-->(two) 621 ms
>> > > > search started
>> > > > (three)<--(one) 0 ms
>> > > > search started
>> > > > (two)<--(one) 0 ms
>> > > > Environment is about to shut down the neo4j database
>> > > > Environment shutting down complete 3,141 ms
>> > > > =====
>> > > > and same thing in berkeleydb (no transactions tho):
>> > > > first time creating the relationships...
>> > > > just created 1,000,000 rels, took=29,658 ms
>> > > > Path from one to three: true 10 ms
>> > > > Path from one to two: true 0 ms
>> > > > Path from three to one: false 0 ms
>> > > > Path from two to one: false 0 ms
>> > > > Path from one to three: true 0 ms
>> > > > Path from one to two: true 0 ms
>> > > > Path from three to one: false 0 ms
>> > > > Path from two to one: false 0 ms
>> > > > Path from one to three: true 0 ms
>> > > > Path from one to two: true 0 ms
>> > > > Path from three to one: false 0 ms
>> > > > Path from two to one: false 0 ms
>> > > > class org.bdb.BerkEnv$1 shutting down complete 152 ms
>> > > >
>> > > > and on cold:
>> > > > Path from one to three: true 10 ms
>> > > > Path from one to two: true 10 ms
>> > > > Path from three to one: false 0 ms
>> > > > Path from two to one: false 0 ms
>> > > > Path from one to three: true 0 ms
>> > > > Path from one to two: true 0 ms
>> > > > Path from three to one: false 0 ms
>> > > > Path from two to one: false 0 ms
>> > > > Path from one to three: true 0 ms
>> > > > Path from one to two: true 0 ms
>> > > > Path from three to one: false 0 ms
>> > > > Path from two to one: false 0 ms
>> > > > class org.bdb.BerkEnv$1 shutting down complete 0 ms
>> > > >
>> > > >
>> > > > these progs are on:
>> > > >
>> > > >
>> >
>> https://github.com/13th-floor/neo4john/commit/d2fd5b27dcde5560e6ff980fe9320aedc4421ab7
>> > > >
>> > > > Cheerios,
>> > > > John.
>> > > >
>> > > > Song of the day: Bic Runga - She Left On A Monday
>> > > > PS: asserts were enable ie. vm arg "-ea" (no quotes)
>> > > >
>> > > > On Sat, Jul 23, 2011 at 4:31 PM, Mattias Persson
>> > > > <matt...@neotechnology.com>wrote:
>> > > >
>> > > > > Hi John,
>> > > > >
>> > > > > the algorithm is written to dodge these kinds of pitfalls. Maybe
>> > there's
>> > > > > some issue with the implementation, but in principal it should
>> make
>> > no
>> > > > > difference. I'll look at it when I get the time (I wrote that
>> > > > > implementation).
>> > > > >
>> > > > > 2011/7/23 John cyuczieekc <cyuczie...@gmail.com>
>> > > > >
>> > > > > > Hey guys, me bugging you again :)
>> > > > > >
>> > > > > > (This whole thing is kind of based on the lack of being able to
>> get
>> > the
>> > > > > > number of relationships a node has)
>> > > > > >
>> > > > > >  If I have two nodes, and the first one has 1 million outgoing
>> > > > > > relationships
>> > > > > > of the type X to 1 million unique/different nodes,
>> > > > > > and the second node has 10 incoming relationships of type X
>> (same
>> > type)
>> > > > > of
>> > > > > > which one is from the first node,
>> > > > > > then using GraphAlgoFactory.shortestPath  (or suggest a better
>> > way?)
>> > > > > >  How can I tell neo4j to iterate the search on the second node's
>> > > > incoming
>> > > > > > rels simply because it has 10 relationships instead of 1
>> million,
>> > in
>> > > > > order
>> > > > > > to check if each relationship is in the form of
>> > firstNode-->secondNode
>> > > > ?
>> > > > > >
>> > > > > > For the case when first node has 100,000 relationships and
>> second
>> > node
>> > > > > has
>> > > > > > 10,
>> > > > > > it takes *1.7 seconds* for shortestPath to find the only one
>> link
>> > > > between
>> > > > > > them using:
>> > > > > >
>> > > > > > final PathFinder<Path> finder = GraphAlgoFactory.shortestPath(
>> > > > > > Traversal.expanderForTypes( rel, Direction.OUTGOING  ), 1 );
>> > > > > > final Path foundPath = finder.findSinglePath( *one, two* );
>> > > > > >
>> > > > > > I can put Direction.*BOTH *and get the same amount of time
>> > > > > > *Path from one to two: (one)-->(two) timedelta=1,862,726,634 ns*
>> > > > > >
>> > > > > > *BUT*, get this: if I swap the nodes:
>> > > > > > finder.findSinglePath(* two, one*);
>> > > > > > and i use either Direction.INCOMING or Direction.*BOTH  *(which
>> > makes
>> > > > > sense
>> > > > > > for the second node ,right) then I get *20ms* the time until it
>> > > > > finishes...
>> > > > > > *Path from one to two: (two)<--(one) timedelta=20,830,111 ns*
>> > > > > >
>> > > > > > (both cases are without data being priorly cached)
>> > > > > >
>> > > > > > I was expecting it to act like this: (but only when using
>> > > > Direction.BOTH)
>> > > > > >  see which node has the least number of relationships and
>> iterate
>> > on
>> > > > > those,
>> > > > > > but this would work if findSinglePath would be made for depth 1
>> > (aka
>> > > > > > particular case), but as I read "Tries to find a single path
>> > between
>> > > > > > startand
>> > > > > > end nodes." then it makes sense to me why it works like it
>> does...
>> > that
>> > > > > is,
>> > > > > > iterate on relationships from start node, rather than from end
>> > node...
>> > > > > but
>> > > > > > I'm not sure if it would *not *make sense to iterate on the end
>> > node
>> > > > > > instead
>> > > > > > of start node, when knowing that end node has less
>> relationships,
>> > for
>> > > > > make
>> > > > > > the search faster (well at least if depth is one) - I didn't
>> look
>> > into
>> > > > > how
>> > > > > > neo4j actually does stuff yet :D
>> > > > > >
>> > > > > > anyway, it's fairly clear to me that I could make a simple
>> wrapper
>> > > > method
>> > > > > > to
>> > > > > > make this kind of search faster, *IF* I had the ability to know
>> how
>> > > > many
>> > > > > > relationships each node has, so I can call findSinglePath  with
>> the
>> > > > first
>> > > > > > param being the node with the least relationship count :) But as
>> I
>> > > > > > understood it, it's not possible to find how many rels a node
>> > has...
>> > > > > gimme
>> > > > > > feat! :)) [by not possible I mean, without having to iterate
>> thru
>> > all
>> > > > and
>> > > > > > count them, which would make the use case here obsolete]
>> > > > > >
>> > > > > > PS: clearly all the text I wrote here would benefit from being
>> > > > > represented
>> > > > > > by a graph, just think about all those grouping with autohiding
>> the
>> > ie.
>> > > > > > "[]"
>> > > > > > and all kinds of stuff... heh
>> > > > > > _______________________________________________
>> > > > > > Neo4j mailing list
>> > > > > > User@lists.neo4j.org
>> > > > > > https://lists.neo4j.org/mailman/listinfo/user
>> > > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Mattias Persson, [matt...@neotechnology.com]
>> > > > > Hacker, Neo Technology
>> > > > > www.neotechnology.com
>> > > > > _______________________________________________
>> > > > > Neo4j mailing list
>> > > > > User@lists.neo4j.org
>> > > > > https://lists.neo4j.org/mailman/listinfo/user
>> > > > >
>> > > > _______________________________________________
>> > > > Neo4j mailing list
>> > > > User@lists.neo4j.org
>> > > > https://lists.neo4j.org/mailman/listinfo/user
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Mattias Persson, [matt...@neotechnology.com]
>> > > Hacker, Neo Technology
>> > > www.neotechnology.com
>> > > _______________________________________________
>> > > Neo4j mailing list
>> > > User@lists.neo4j.org
>> > > https://lists.neo4j.org/mailman/listinfo/user
>> >
>> > _______________________________________________
>> > Neo4j mailing list
>> > User@lists.neo4j.org
>> > https://lists.neo4j.org/mailman/listinfo/user
>> >
>>
>>
>>
>> --
>> Mattias Persson, [matt...@neotechnology.com]
>> Hacker, Neo Technology
>> www.neotechnology.com
>> _______________________________________________
>> Neo4j mailing list
>> User@lists.neo4j.org
>> https://lists.neo4j.org/mailman/listinfo/user
>>
>
>
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Reply via email to