Re: [Neo4j] Re: Returning items regarding the degree of separations between users.

Michael Azerhad Fri, 01 Aug 2014 05:01:30 -0700

I tried one of the queries you suggested: 

MATCH (loggedUser:Person{id: 123})-[:KNOWS]-(p1:Person)
OPTIONAL MATCH (p1)-[:SELLS]->(c1:Car)
WITH p1, collect(c1)[0..{limit}] as cars
OPTIONAL MATCH (p1)-[:KNOWS]-(p2:Person)
OPTIONAL MATCH (p2)-[:SELLS]->(c2:Car:Degree2)
WITH p2, case when length(cars) < {limit} then cars + 
collect(c2)[0..({limit}-length(cars))] else cars end  as cars
OPTIONAL MATCH (p2)-[:KNOWS]-(p3:Person)
OPTIONAL MATCH (p3)-[:SELLS]->(c3:Car:Degree3)
WITH p3, case when length(cars) < {limit} then cars + 
collect(c3)[0..({limit}-length(cars))] else cars end as cars
OPTIONAL MATCH (p3)-[:KNOWS]-(p4:Person)
OPTIONAL MATCH (p4)-[:SELLS]->(c4:Car:Degree4)
RETURN case when length(cars) < {limit} then cars + 
collect(c4)[0..({limit}-length(cars))] else cars end as cars
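
For reference, here is what each CASE step in the query above is meant to compute, written as a small Java sketch. The names LimitedAppend and appendUpTo are only illustrative and not part of any Neo4j API. It is plain list handling, assuming the cars of each degree arrive as a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LimitedAppend {
    // Mirrors: case when length(cars) < limit
    //          then cars + collect(c)[0..(limit - length(cars))] else cars end
    static <T> List<T> appendUpTo(List<T> cars, List<T> next, int limit) {
        if (cars.size() >= limit) {
            return cars; // already full, the next degree is ignored
        }
        int room = limit - cars.size();
        List<T> result = new ArrayList<>(cars);
        result.addAll(next.subList(0, Math.min(room, next.size())));
        return result;
    }

    public static void main(String[] args) {
        // Degree 1 starts from an empty list, like collect(c1)[0..limit].
        List<String> cars = appendUpTo(new ArrayList<String>(), Arrays.asList("c1a", "c1b"), 3);
        cars = appendUpTo(cars, Arrays.asList("c2a", "c2b"), 3);
        cars = appendUpTo(cars, Arrays.asList("c3a"), 3);
        System.out.println(cars); // [c1a, c1b, c2a]
    }
}
```

Note that this only truncates after a whole degree has been collected, just like the query: every car of the current degree is gathered first and then sliced.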


It takes more than 2 seconds to execute :s
When I opened each car node as I did before (the approach I described 
earlier in the thread), the query took 150 ms. 
I only have 51 cars and 15 persons in the graph... 
Could the collect function be hurting performance? 

Thanks :)

Michael



On Friday, August 1, 2014 10:56:04 AM UTC+2, Michael Hunger wrote:
>
> You can sort on the client or use this at the end:
>
> unwind cars as car
> return car
> order by car.name asc
>
> Michael
>
>
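
The "sort on the client" option could look roughly like this in Java. It is only a sketch: the Car class is a hypothetical client-side holder for what the query returns, and ClientSideSort, sortByName and page are made-up names. The page method is an assumption added to show SKIP/LIMIT-style paging once the whole list is sorted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ClientSideSort {
    // Hypothetical client-side view of a car returned by the query.
    static final class Car {
        final String name;
        Car(String name) { this.name = name; }
    }

    // Same effect as "order by car.name asc", applied once the whole list is known.
    static List<Car> sortByName(List<Car> cars) {
        List<Car> sorted = new ArrayList<>(cars);
        sorted.sort(Comparator.comparing((Car car) -> car.name));
        return sorted;
    }

    // Returns one page of an already sorted list (client-side SKIP / LIMIT).
    // Assumes skip and limit are non-negative.
    static List<Car> page(List<Car> sorted, int skip, int limit) {
        int from = Math.min(skip, sorted.size());
        int to = Math.min(from + limit, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }

    public static void main(String[] args) {
        List<Car> cars = new ArrayList<>();
        cars.add(new Car("Porsche"));
        cars.add(new Car("Ferrari"));
        cars.add(new Car("Audi"));
        for (Car car : page(sortByName(cars), 0, 2)) {
            System.out.println(car.name); // Audi, then Ferrari
        }
    }
}
```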
> On Fri, Aug 1, 2014 at 9:41 AM, Michael Azerhad <michael...@gmail.com> wrote:
>
>> Hello Michael,
>>
>> Thanks for this really great, detailed answer! I like it :) 
>>
>> However, I have a question: 
>>
>> I really prefer the last approach (at the very bottom), which uses Cypher 
>> expressions to lazily limit the results.  
>> I wonder whether it's also possible to order the whole final collection by 
>> car name (not only limit it). 
>> As far as I know, ORDER BY cannot be applied directly to the collect 
>> aggregate function.
>> We would have to use WITH before collecting, but here the collection is 
>> built incrementally, so I am forced to order the collection itself. 
>> Obviously, I can't build an ordered collection incrementally, since the 
>> order has to be applied to the whole result at once. 
>> Is there a trick I'm missing to achieve global ordering? 
>>
>> Thanks a lot again :),
>>
>> Michael
>>
>> On 1 Aug 2014, at 07:03, Michael Hunger <michael...@neotechnology.com> wrote:
>>
>> I'm not sure I'd use Cypher for those data volumes.
>>
>> I think in this case some imperative code filling a set of cars (i.e. a 
>> server extension) might be more sensible, 
>> using a label for the Car-Degree. (Alternatively, you could use 
>> SELL1, SELL2, ... rel-types for the degrees and then check whether it has 
>> SELL(n..4) as sell relationships; that would probably be fastest.)
>>
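>> import java.util.HashSet;
>> import java.util.Set;
>> import org.neo4j.graphdb.Direction;
>> import org.neo4j.graphdb.DynamicLabel;
>> import org.neo4j.graphdb.Label;
>> import org.neo4j.graphdb.Node;
>> import org.neo4j.graphdb.Relationship;
>>
>> // KNOWS and SELLS are assumed to be RelationshipType constants defined elsewhere.
>> Set<Node> getCars(Node person, int level, int limit) {
>>     Label degree = DynamicLabel.label("Degree" + level);
>>     Set<Node> cars = new HashSet<>();
>>     for (Relationship knows : person.getRelationships(KNOWS)) {
>>          Node friend = knows.getOtherNode(person);
>>          for (Relationship sells : friend.getRelationships(SELLS, Direction.OUTGOING)) {
>>               Node car = sells.getEndNode();
>>               if (car.hasLabel(degree)) {
>>                    cars.add(car);
>>                    // stop as soon as the limit is reached
>>                    if (cars.size() >= limit) return cars;
>>               }
>>          }
>>     }
>>     return cars;
>> }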
>>
>> Make sure to do real-sized load tests.
>>
>> Something I thought could work is incrementally building up the data.
>>
>> MATCH (loggedUser:Person{id: 123})-[:KNOWS]-(p1:Person)
>> OPTIONAL MATCH (p1)-[:SELLS]->(c1:Car)
>> WITH p1, collect(c1) as cars
>> OPTIONAL MATCH (p1)-[:KNOWS]-(p2:Person)
>> OPTIONAL MATCH (p2)-[:SELLS]->(c2:Car:Degree2)
>> WITH p2, cars + collect(c2) as cars
>> OPTIONAL MATCH (p2)-[:KNOWS]-(p3:Person)
>> OPTIONAL MATCH (p3)-[:SELLS]->(c3:Car:Degree3)
>> WITH p3, cars + collect(c3) as cars
>> OPTIONAL MATCH (p3)-[:KNOWS]-(p4:Person)
>> OPTIONAL MATCH (p4)-[:SELLS]->(c4:Car:Degree4)
>> RETURN cars + collect(c4) as cars
>>
>> If you want to limit the cars, it would probably be more complicated.
>>
>> Something like this:
>>
>> MATCH (loggedUser:Person{id: 123})-[:KNOWS]-(p1:Person)
>> OPTIONAL MATCH (p1)-[:SELLS]->(c1:Car)
>> WITH p1, collect(c1)[0..{limit}] as cars
>> OPTIONAL MATCH (p1)-[:KNOWS]-(p2:Person)
>> OPTIONAL MATCH (p2)-[:SELLS]->(c2:Car:Degree2)
>> WITH p2, case when length(cars) < {limit} then cars + 
>> collect(c2)[0..({limit}-length(cars))] else cars end  as cars
>> OPTIONAL MATCH (p2)-[:KNOWS]-(p3:Person)
>> OPTIONAL MATCH (p3)-[:SELLS]->(c3:Car:Degree3)
>> WITH p3, case when length(cars) < {limit} then cars + 
>> collect(c3)[0..({limit}-length(cars))] else cars end as cars
>> OPTIONAL MATCH (p3)-[:KNOWS]-(p4:Person)
>> OPTIONAL MATCH (p4)-[:SELLS]->(c4:Car:Degree4)
>> RETURN case when length(cars) < {limit} then cars + 
>> collect(c4)[0..({limit}-length(cars))] else cars end as cars 
>>
>> It might even be more sensible to do the matches as expressions and 
>> collect only as many as you need.
>>
>> Like this:
>>
>> MATCH (loggedUser:Person{id: 123})-[:KNOWS]-(p1:Person)
>> // iterate over all paths in the collection, extracting only the last node
>> // but only taking the first 0..{limit} ones lazily from that collection
>> WITH p1, [path in (p1)-[:SELLS]->(:Car) | last(nodes(path))][0..{limit}] AS cars
>> ...
>>
>>
>> On Fri, Aug 1, 2014 at 4:28 AM, Michael Azerhad <michael...@gmail.com> wrote:
>>
>>> A fix to my query above: I forgot to specify the logged-in user node:
>>>
>>>
>>> MATCH (d:Degree {id: 1})<-[:TARGET_TO]-(c:Car)<-[:SELLS]-(p:Person)
>>>       -[:KNOWS]-(loggedUser:Person{id: 123})
>>> RETURN c
>>> UNION
>>> MATCH (d:Degree {id: 2})<-[:TARGET_TO]-(c:Car)<-[:SELLS]-(p:Person)
>>>       -[:KNOWS*..2]-(loggedUser:Person{id: 123})
>>> RETURN c
>>> UNION
>>> MATCH (d:Degree {id: 3})<-[:TARGET_TO]-(c:Car)<-[:SELLS]-(p:Person)
>>>       -[:KNOWS*..3]-(loggedUser:Person{id: 123})
>>> RETURN c
>>> UNION
>>> MATCH (d:Degree {id: 4})<-[:TARGET_TO]-(c:Car)<-[:SELLS]-(p:Person)
>>>       -[:KNOWS*..4]-(loggedUser:Person{id: 123})
>>> RETURN c
>>>
>>> It's important :)
>>>
>>> On Friday, August 1, 2014 4:24:41 AM UTC+2, Michael Azerhad wrote:
>>>>
>>>> Hi,
>>>>
>>>> I have been thinking about how to handle the following scenario 
>>>> optimally with Cypher and Neo4j 2.X.X:
>>>>
>>>> Let's assume this classic pattern of people knowing each other:
>>>>
>>>> (a:Person)-[:KNOWS]-(b:Person) 
>>>>
>>>>
>>>> Each person can sell a car and choose who can see the sale by 
>>>> specifying a maximum degree of separation.
>>>> Example:  
>>>>
>>>> Person A wants to sell a car.
>>>> He wants only people at most 2 degrees away to "see" his sale 
>>>> announcement (friends of friends at most, including his direct friends).
>>>>
>>>> Therefore, when Person B logs in and clicks on "list all the sales", 
>>>> he should see only the sales that concern him (according to the degree of 
>>>> separation previously specified by the seller).
>>>>
>>>> So if Person B is 1 or 2 degrees away from Person A (2 being the degree 
>>>> that Person A chose in this example; he could just as well have chosen 
>>>> 4), he can see the announcement.
>>>> Otherwise, he can't see it. 
>>>>
>>>> What would be the optimal Cypher query to retrieve, all at once, all the 
>>>> sale announcements that concern the logged-in user? 
>>>>
>>>> At first, I handled the case with this strategy: 
>>>> _ Storing the expected degree of visibility in the Car node => Ferrari 
>>>> (id:..., degreeFilter: 2)
>>>> _ A Cypher query that traverses every car, reads this number and compares 
>>>> it to the length of the path returned by the Cypher *shortestPath* 
>>>> function applied to Person(A)-[:KNOWS*..4]-Person(B)   (4 being the 
>>>> maximum degree possible, set by any seller).
>>>> If the length is greater than the degreeFilter, then the user is 
>>>> *not* able to see the announcement.
>>>>
>>>> Main drawback of this strategy: I have to open EVERY car node to check 
>>>> its degreeFilter property...
>>>> With 160 000 000 car nodes, I can easily imagine the impact on 
>>>> query performance.
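
To make this first strategy concrete, here is a rough in-memory Java sketch, with no Neo4j involved. It assumes the KNOWS relationships are available as an adjacency map, which is a simplification, and DegreeFilterSketch, degrees and visible are made-up names. A breadth-first search gives the shortest KNOWS distance from the viewer to everyone within the maximum depth, and a car is shown when its seller's distance is at most the car's degreeFilter.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DegreeFilterSketch {
    // Breadth-first search over an undirected KNOWS graph, at most maxDepth hops.
    // Returns every reachable person mapped to the shortest distance from start.
    static Map<String, Integer> degrees(Map<String, List<String>> knows, String start, int maxDepth) {
        Map<String, Integer> dist = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        dist.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String person = queue.remove();
            int d = dist.get(person);
            if (d == maxDepth) continue; // do not expand beyond the maximum degree
            for (String friend : knows.getOrDefault(person, Collections.<String>emptyList())) {
                if (!dist.containsKey(friend)) {
                    dist.put(friend, d + 1);
                    queue.add(friend);
                }
            }
        }
        return dist;
    }

    // Assumption: distance 0 (the seller looking at their own car) also passes.
    static boolean visible(Map<String, Integer> dist, String seller, int degreeFilter) {
        Integer d = dist.get(seller);
        return d != null && d <= degreeFilter;
    }

    public static void main(String[] args) {
        // Chain A - B - C - D, each KNOWS edge listed in both directions.
        Map<String, List<String>> knows = new HashMap<>();
        knows.put("A", Arrays.asList("B"));
        knows.put("B", Arrays.asList("A", "C"));
        knows.put("C", Arrays.asList("B", "D"));
        knows.put("D", Arrays.asList("C"));
        // A sells a car with degreeFilter 2.
        System.out.println(visible(degrees(knows, "C", 4), "A", 2)); // true, C is 2 hops from A
        System.out.println(visible(degrees(knows, "D", 4), "A", 2)); // false, D is 3 hops from A
    }
}
```

The search runs once per viewer and only touches people within 4 hops, whereas the strategy above still has to read degreeFilter on every car.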
>>>>
>>>> So I thought a little about alternative strategies, and I am seriously 
>>>> considering this one.
>>>> Since I cap the degreeFilter at 4 (a small, 
>>>> finite number), why not extract the degreeFilter property into a "Degree" 
>>>> node, and make 4 queries with UNION like this:
>>>>
>>>> MATCH (d:Degree {id: 1})<-[:TARGET_TO]-(c:Car)<-[:SELLS]-(p:Person)
>>>>       -[:KNOWS]-(p2:Person)
>>>> RETURN c
>>>> UNION
>>>> MATCH (d:Degree {id: 2})<-[:TARGET_TO]-(c:Car)<-[:SELLS]-(p:Person)
>>>>       -[:KNOWS*..2]-(p2:Person)
>>>> RETURN c
>>>> UNION
>>>> MATCH (d:Degree {id: 3})<-[:TARGET_TO]-(c:Car)<-[:SELLS]-(p:Person)
>>>>       -[:KNOWS*..3]-(p2:Person)
>>>> RETURN c
>>>> UNION
>>>> MATCH (d:Degree {id: 4})<-[:TARGET_TO]-(c:Car)<-[:SELLS]-(p:Person)
>>>>       -[:KNOWS*..4]-(p2:Person)
>>>> RETURN c
>>>>
>>>>
>>>> I use UNION rather than UNION ALL so that the result contains no 
>>>> duplicates.
>>>>
>>>> However, I just read that post-processing the whole combined result 
>>>> set is not currently supported in Neo4j 2.X.X.
>>>>
>>>> Indeed, *what if I would like to paginate the sale announcements?*
>>>> The simplest way would be to just add SKIP/LIMIT 10 (it's an 
>>>> example, not the real syntax) at the end of those 4 queries... 
>>>> But it would only apply to the last one, not to the whole result...
>>>>
>>>> So to sum up: 
>>>>
>>>>
>>>>    1. Is my UNION query optimal, or does a better alternative 
>>>>    exist for this specific scenario? 
>>>>    2. If UNION is the best solution, how can I handle some kind of 
>>>>    post-processing, like ordering all the combined announcements by car 
>>>>    name?
>>>>
>>>> Thanks a lot for any answers :)  
>>>> I have spent some time on this since it's interesting, and I really want 
>>>> to find an optimal query that can deal with millions of sales :)
>>>>
>>>>
>>>> Michael
>>>>
>>>>  -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "Neo4j" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to neo4j+un...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
