Re: [Neo4j] Cypher Aggregation functions - specifically SUM()
Now that we've figured all that out and determined that it's not built into Cypher yet, what is the best practice for doing this with the available tools? Effectively, what I'm looking for is the heaviest branch of a tree, right?

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Cypher-Aggregation-functions-specifically-SUM-tp3450203p3453168.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
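[Editorial sketch] Since the thread concludes that Cypher cannot do this yet, one option is to compute the heaviest branch in application code. The following is only a minimal sketch under stated assumptions: `Post`, `HeaviestBranch` and `Branch` are hypothetical in-memory stand-ins (not Neo4j classes), and "votes" is assumed to be an integer property on every node. A depth-first walk returns, for each node, the root-to-leaf branch below it with the highest vote total.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory stand-in for a post/reply node with a "votes" property. */
class Post {
    final String name;
    final int votes;
    final List<Post> replies = new ArrayList<Post>();

    Post(String name, int votes, Post... replies) {
        this.name = name;
        this.votes = votes;
        this.replies.addAll(Arrays.asList(replies));
    }
}

class HeaviestBranch {
    /** Result holder: the node names along a branch and the sum of their votes. */
    static final class Branch {
        final List<String> names;
        final int total;

        Branch(List<String> names, int total) {
            this.names = names;
            this.total = total;
        }
    }

    /** Depth-first walk returning the root-to-leaf branch with the highest vote sum. */
    static Branch heaviest(Post node) {
        Branch best = null;
        for (Post reply : node.replies) {
            Branch candidate = heaviest(reply);
            if (best == null || candidate.total > best.total) {
                best = candidate;
            }
        }
        List<String> names = new ArrayList<String>();
        names.add(node.name);
        int total = node.votes;
        if (best != null) {
            // Extend this node with the best branch found among its replies.
            names.addAll(best.names);
            total += best.total;
        }
        return new Branch(names, total);
    }

    public static void main(String[] args) {
        Post root = new Post("root", 1,
                new Post("a", 5, new Post("a1", 2)),
                new Post("b", 3, new Post("b1", 7)));
        Branch branch = heaviest(root);
        System.out.println(branch.names + " total=" + branch.total);
    }
}
```

With a real graph, the same recursion could run over whatever traversal API is available; the sketch says nothing about how the nodes are loaded.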
Re: [Neo4j] Cypher Aggregation functions - specifically SUM()
Hi,

Thanks for considering my input and getting back to me on this issue. I'm glad to hear that this kind of functionality is being thought out, because its addition to Cypher will significantly enhance its usefulness as a robust graph traversal/query language.

Andres Taylor wrote:
> 1. Cypher needs a way to turn an iterable of nodes to an iterable of
> numeric values. Something like Scala collection's map method. It could
> look something like this: RETURN MAP(x in NODES(p) : x.votes)

It's funny you use that notation, because I tried several different forms of it after I saw its use in the predicate functions. However, I'm not sure I see how that notation deals with the aggregation issue. It would seem to still need an aggregation applied to the elements of the list, like:

    RETURN SUM(MAP(x in NODES(p) : x.votes))

The alternative would be to allow nesting when the result set is comprised of nodes. As in, if you could use:

    START n = node(START i = node(0) MATCH p = i--() RETURN NODES(p)) RETURN SUM(n.votes)

It's cumbersome, but being able to nest query start conditions might be more useful generally. It resolves the ambiguity you highlight below by forcing the end user to choose explicitly. I see two problems with this: you can't perform relationship property aggregation the same way, and it's almost certainly not as efficient as it otherwise could be, because you've broken up what should be two interleaved processes (caching totals and aggregating at the next traversal step) into two serially dependent ones. The SUM(MAP( : _._)) expression makes a lot more sense.

Andres Taylor wrote:
> 2. Aggregate functions need to be able to work on iterables, and not just
> on multiple subgraphs. The problem here is how to make it obvious which one
> you are trying to use, e.g.
> RETURN foo.bar, COUNT(NODES(path))
> Does that mean aggregate on foo.bar and return the number of paths, or
> does it mean that you want to know the number of nodes in path?

I'm a little confused by what you mean. Isn't there always only one path returned per row when you provide additional columns like "foo.bar"? I mean, "RETURN foo.bar, path" will always produce one node (or its property, in this case) and the one path traversed to reach that node per row. So by default it would have to mean "return the number of nodes in path" (for this return row).

Andres Taylor wrote:
> If/when this is done, your query would look something like this:
>
> RETURN SUM(MAP(x in NODES(path) : x.votes))

I should read more carefully before I start typing. :-) Yes. This makes the most sense to me.

On Tue, Oct 25, 2011 at 12:02 PM, n_aschbacher wrote:
> Hi,
>
> You're correct in one sense. If I remove the path, or other columns, from
> the RETURN statement then I can get a single SUM value back for all the
> properties in the entire tree below my starting node.
>
> My problem is that I want to return multiple rows, a row for each path
> through the graph, with the SUM of the properties of the nodes traversed so
> far on that single path.
>
> The idea is that I want to know which branch in the tree of posts and
> replies has the highest total vote count.
>
> Removing other columns from the RETURN statement as you suggest will only
> give me the SUM of votes for the whole tree, not per branch.
>
> Cheers!
>
> --
> View this message in context:
> http://neo4j-community-discussions.438527.n3.nabble.com/Cypher-Aggregation-functions-specifically-SUM-tp3450203p3450996.html

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Cypher-Aggregation-functions-specifically-SUM-tp3450203p3453162.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
Re: [Neo4j] Urgent: 1.4.2 Github tag does not seem to match the 1.4.2 distro JAR
Ok, phew. Got a little scared there :)

2011/10/26 Rick Bullotta
> Sorry for not reporting back. It was an eclipse issue. All good now!
>
> On Oct 25, 2011, at 7:21 PM, "Mattias Persson" wrote:
>
> > Odd, could you just give a sample of a line that is wrong? just to help me
> > get started looking at this
> >
> > 2011/10/25 Rick Bullotta
> >
> >> When attempting to debug an issue with the index framework, the debugger is
> >> clearly on the wrong source lines, so I suspect there's some type of
> >> mismatch.
> >>
> >> Thoughts?

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
Re: [Neo4j] Default Analyzer in Index Framework
Hi, Mattias.

That's exactly what we did. One interesting note: the query and get methods seemed to work without lowercasing the search term (maybe the analyzer is used to parse the query?), but for native Lucene queries we needed to lowercase them.

All good now! Thanks for the tips.

Rick

On Oct 25, 2011, at 7:08 PM, "Mattias Persson" wrote:

> Hi Rick,
>
> yes you can do that, but not in a super easy way. What you'd have to do
> right now to get it working is to make sure you create an index with a
> special analyzer which converts everything to lower case (both additions and
> queries), effectively making it case insensitive. So create a class like
> this:
>
>     public class LowerCaseAnalyzer extends Analyzer
>     {
>         @Override
>         public TokenStream tokenStream( String fieldName, Reader reader )
>         {
>             return new LowerCaseFilter( Version.LUCENE_31, new KeywordTokenizer( reader ) );
>         }
>     }
>
> and make sure you create your index with a configuration map like:
>
>     Index<Node> index = graphDb.index().forNodes(
>         "myCaseInsensitiveIndex", MapUtil.stringMap( "analyzer", LowerCaseAnalyzer.class.getName() ) );
>
> then this will work:
>
>     index.add( node, "name", "Rick Bullotta" );
>     index.query( "name:\"rick bullotta\"" ); // ==> returns that node.
>
> 2011/10/25 Rick Bullotta
>
>> Anyone able to provide some insights on this?
>>
>> Thanks.
>>
>> From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On
>> Behalf Of Rick Bullotta [rick.bullo...@thingworx.com]
>> Sent: Monday, October 24, 2011 6:16 PM
>> To: Neo4j user discussions
>> Subject: [Neo4j] Default Analyzer in Index Framework
>>
>> When not using fulltext indexing, what Lucene Analyzer class does Neo4J
>> use? It seems that non-fulltext index searches are case sensitive - we'd
>> like to change that behavior.
>>
>> Thanks for any help/guidance/examples!
>>
>> Rick
>
> --
> Mattias Persson, [matt...@neotechnology.com]
> Hacker, Neo Technology
> www.neotechnology.com
Re: [Neo4j] InvalidRecordException with BatchInserter in 1.5.M02
Yes, this has been fixed, sorry for not notifying you about it! The fix will be included in the 1.5 GA release. Thank you very much for your patience and responses.

Best,
Mattias

2011/10/25 Dennis Hendriksen
> Hello Mattias,
>
> I'm curious to know if you were able to reproduce the problem. I realize
> that it must be quite a pain to do so because of the huge amounts of
> data ;-)
>
> Greetings,
> Dennis
>
> On Mon, 2011-10-17 at 10:26 +0200, Mattias Persson wrote:
> > Sorry, false alarm... I got them after some retries.
> >
> > 2011/10/17 Mattias Persson
> > And if you could send the source it would be great (or if it's
> > packaged in the jar file, which it might be but I cannot see
> > it since I cannot open the jar file).
> >
> > 2011/10/17 Mattias Persson
> > Thanks, but unfortunately I cannot open any of these
> > files. I get "unexpected EOF" and such.
> >
> > 2011/10/17 Dennis Hendriksen
> >
> > Hello Mattias,
> >
> > Download and extract the following (> 7.0GB) resource:
> > http://download.wikimedia.org/enwiki/20110803/enwiki-20110803-pages-articles.xml.bz2
> >
> > Download the runnable jar and source files:
> > http://download.kalooga.com/wikiparser.jar
> > http://download.kalooga.com/kalooga-wikiparser.tar.gz (eclipse)
> >
> > Run the application as follows:
> > java -Xmx9g -jar wikiparser.jar /enwiki-20110803-pages-articles.xml /graphdb
> >
> > You might want to add some additional jvm flags to speed up execution:
> > -XX:+DoEscapeAnalysis -XX:+AggressiveOpts -XX:+UseNUMA
> > -XX:+UseCompressedStrings -XX:+OptimizeStringConcat
> > -XX:+UseFastAccessorMethods -XX:+UseBiasedLocking
> >
> > Thank you for taking the time to look into the problem!
> >
> > Greetings,
> > Dennis
> >
> > On Sat, 2011-10-15 at 17:08 +0200, Mattias Persson wrote:
> > > Thanks Dennis for reporting it.
> > >
> > > I would like to run your code to be able to reproduce it locally, then I can
> > > probably fix the bug. Would that be possible?
> > >
> > > Best,
> > > Mattias
> > >
> > > 2011/10/14 Dennis Hendriksen
> > >
> > > > Hi all,
> > > >
> > > > Since upgrading neo4j 1.4.1 to 1.5.M02 I get a InvalidRecordException
> > > > while importing data in a new store using BatchInserter (never seen this
> > > > exception with 1.4.1).
> > > >
> > > > For identical program executions the exceptions occur at different
> > > > moments. The problem only occurs after inserting millions of
> > > > nodes/relations, but occurs in each run. The data is inserted in a
> > > > single thread. Properties are not necessarily added to a node right
> > > > after node creation.
> > > >
> > > > I've included configuration details below, including the exception
> > > > output during three identical runs.
> > > >
> > > > Is this a known issue in 1.5.M02? Anyone encountered the same exception?
> > > > Any suggestions how to tackle this problem?
> > > >
> > > > Greetings,
> > > > Dennis
> > > >
> > > > *** java
> > > > java version "1.6.0_27"
> > > > Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
> > > > Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixe
Re: [Neo4j] Urgent: 1.4.2 Github tag does not seem to match the 1.4.2 distro JAR
Sorry for not reporting back. It was an eclipse issue. All good now!

On Oct 25, 2011, at 7:21 PM, "Mattias Persson" wrote:

> Odd, could you just give a sample of a line that is wrong? just to help me
> get started looking at this
>
> 2011/10/25 Rick Bullotta
>
>> When attempting to debug an issue with the index framework, the debugger is
>> clearly on the wrong source lines, so I suspect there's some type of
>> mismatch.
>>
>> Thoughts?
>
> --
> Mattias Persson, [matt...@neotechnology.com]
> Hacker, Neo Technology
> www.neotechnology.com
Re: [Neo4j] Urgent: 1.4.2 Github tag does not seem to match the 1.4.2 distro JAR
Odd, could you just give a sample of a line that is wrong? Just to help me get started looking at this.

2011/10/25 Rick Bullotta
> When attempting to debug an issue with the index framework, the debugger is
> clearly on the wrong source lines, so I suspect there's some type of
> mismatch.
>
> Thoughts?

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
Re: [Neo4j] Default Analyzer in Index Framework
Hi Rick,

yes you can do that, but not in a super easy way. What you'd have to do right now to get it working is to make sure you create an index with a special analyzer which converts everything to lower case (both additions and queries), effectively making it case insensitive. So create a class like this:

    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.KeywordTokenizer;
    import org.apache.lucene.analysis.LowerCaseFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.util.Version;

    public class LowerCaseAnalyzer extends Analyzer
    {
        @Override
        public TokenStream tokenStream( String fieldName, Reader reader )
        {
            // Keep the whole value as a single token, then lower-case it.
            return new LowerCaseFilter( Version.LUCENE_31, new KeywordTokenizer( reader ) );
        }
    }

and make sure you create your index with a configuration map like:

    Index<Node> index = graphDb.index().forNodes(
        "myCaseInsensitiveIndex", MapUtil.stringMap( "analyzer", LowerCaseAnalyzer.class.getName() ) );

then this will work:

    index.add( node, "name", "Rick Bullotta" );
    index.query( "name:\"rick bullotta\"" ); // ==> returns that node.

2011/10/25 Rick Bullotta
> Anyone able to provide some insights on this?
>
> Thanks.
>
> From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On
> Behalf Of Rick Bullotta [rick.bullo...@thingworx.com]
> Sent: Monday, October 24, 2011 6:16 PM
> To: Neo4j user discussions
> Subject: [Neo4j] Default Analyzer in Index Framework
>
> When not using fulltext indexing, what Lucene Analyzer class does Neo4J
> use? It seems that non-fulltext index searches are case sensitive - we'd
> like to change that behavior.
>
> Thanks for any help/guidance/examples!
>
> Rick

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
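[Editorial sketch] The point of the analyzer approach above is that the same normalization has to run on both the write path and the read path. The toy class below illustrates just that principle with the JDK only; it is not Lucene and not the Neo4j index API. `CaseInsensitiveIndex`, `add` and `get` are hypothetical names chosen for illustration, and node ids are assumed to be plain `long` values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy exact-match index that lower-cases values when adding and when looking up. */
class CaseInsensitiveIndex {
    private final Map<String, List<Long>> entries = new HashMap<String, List<Long>>();

    /** The single normalization step shared by add() and get(). */
    private static String normalize(String key, String value) {
        return key + ":" + value.toLowerCase(Locale.ROOT);
    }

    void add(long nodeId, String key, String value) {
        String normalized = normalize(key, value);
        List<Long> ids = entries.get(normalized);
        if (ids == null) {
            ids = new ArrayList<Long>();
            entries.put(normalized, ids);
        }
        ids.add(nodeId);
    }

    List<Long> get(String key, String value) {
        List<Long> ids = entries.get(normalize(key, value));
        return ids == null ? Collections.<Long>emptyList() : ids;
    }

    public static void main(String[] args) {
        CaseInsensitiveIndex index = new CaseInsensitiveIndex();
        index.add(1L, "name", "Rick Bullotta");
        System.out.println(index.get("name", "rick bullotta"));
    }
}
```

If only one side were normalized, a mixed-case lookup would miss the stored entry, which is the usual way case sensitivity shows up when a query bypasses the analyzer.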
Re: [Neo4j] Spring Data Graph 2.0.0 and Cypher queries using Repositories
Hi Tero,

This has changed because Cypher now supports parameters. SDN used to hand-craft your query strings for you, but now it uses Cypher parameters instead. And Cypher doesn't support parameters in these areas. It definitely should.

Andrés

On Mon, Oct 24, 2011 at 9:41 PM, Tero Paananen wrote:
> I just upgraded to Spring Data Graph for Neo4J 2.0.0.M1 and it
> looks like certain things changed for the worse for my needs.
>
> Just looking for clarification of whether these changes are
> permanent or to be addressed before final release.
>
> I'm using Cypher queries defined in Repository classes, e.g.:
>
>     @Repository
>     public interface CustomRepository extends GraphRepository {
>         @Query(value = "start u = node:user(name = '%foo') match (u)-[:KNOWS*1..%depth]->() return u", type = QueryType.Cypher)
>         Iterable getConnections(@Param("foo") String foo, @Param("depth") Integer depth);
>     }
>
> This used to work just fine in 1.1.0.RELEASE.
>
> In 2.0.0.M01 %foo should be {foo} and %depth produces a syntax
> error regardless of whether I specify it with %depth or {depth}.
> Same with skip and limit instructions:
>
>     .. return u skip %skip limit %limit
>
> used to work just fine, however
>
>     .. return u skip {skip} limit {limit}
>
> no longer works.
>
> I know I could probably replicate this behavior using the
> Neo4JTemplate functionality, but I'm not sure that's actually
> a better way of doing that, considering how convenient it is
> to create queries with the @Query annotation.
>
> Your thoughts? And what would be my best options for alternatives
> at this point?
>
> Thanks!
>
> -TPP
Re: [Neo4j] cypher feature suggestion
Thanks for the suggestion! I took the liberty of opening an issue about it in github: https://github.com/neo4j/community/issues/74

Andrés

On Tue, Oct 25, 2011 at 4:15 PM, F. De Haes wrote:
> Hi,
>
> A feature that would be nice to add to the Cypher language features AVG,
> SUM, etc. would be the ability to use a percentile function.
>
> In Oracle: percentile_cont and percentile_disc
>
> http://en.wikipedia.org/wiki/Percentile
>
> A use case could be to ask for certain percentile (e.g. 90) for a 'time
> elapsed' property on a bunch of selected 'helpdesk call' nodes with status
> 'closed'. The result is the time in which 90% of the closed 'helpdesk call'
> nodes were solved. 10% of the selected nodes have a 'time elapsed' larger
> than the result.
>
> You might know a median, which is actually percentile 50.
>
> Greetings,
> Filip
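[Editorial sketch] To make the request concrete, here is a rough sketch of the two percentile flavours mentioned above, written against plain arrays rather than Cypher. It assumes the commonly documented definitions: the discrete form returns the smallest value whose cumulative share is at least p, and the continuous form interpolates linearly between the two nearest ranks. `Percentiles`, `percentileDisc` and `percentileCont` are hypothetical names, and p is given as a fraction between 0 and 1.

```java
import java.util.Arrays;

class Percentiles {
    /** Discrete percentile: the smallest value whose cumulative share is at least p. */
    static double percentileDisc(double[] values, double p) {
        double[] sorted = sortedCopy(values, p);
        int rank = (int) Math.ceil(p * sorted.length); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Continuous percentile: linear interpolation between the two nearest ranks. */
    static double percentileCont(double[] values, double p) {
        double[] sorted = sortedCopy(values, p);
        double position = p * (sorted.length - 1); // 0-based fractional position
        int lower = (int) Math.floor(position);
        int upper = (int) Math.ceil(position);
        return sorted[lower] + (position - lower) * (sorted[upper] - sorted[lower]);
    }

    /** Validates the input and returns a sorted copy, leaving the caller's array untouched. */
    private static double[] sortedCopy(double[] values, double p) {
        if (values.length == 0 || p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("need at least one value and 0 <= p <= 1");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        double[] elapsedHours = {40, 10, 30, 20};
        System.out.println(percentileDisc(elapsedHours, 0.5));
        System.out.println(percentileCont(elapsedHours, 0.5));
    }
}
```

For the helpdesk use case, the input array would hold the 'time elapsed' values of the closed calls and p would be 0.9; the median is simply p = 0.5.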
Re: [Neo4j] Cypher Aggregation functions - specifically SUM()
Hi there,

Aggregation today only aggregates data over multiple subgraphs. Your query produces one subgraph per path, and that is why you can't do what you want to do.

This is definitely something I want Cypher to support. The way I see this working is two steps:

1. Cypher needs a way to turn an iterable of nodes to an iterable of numeric values. Something like Scala collection's map method. It could look something like this:

   RETURN MAP(x in NODES(p) : x.votes)

2. Aggregate functions need to be able to work on iterables, and not just on multiple subgraphs. The problem here is how to make it obvious which one you are trying to use, e.g.

   RETURN foo.bar, COUNT(NODES(path))

   Does that mean aggregate on foo.bar and return the number of paths, or does it mean that you want to know the number of nodes in path?

If/when this is done, your query would look something like this:

   RETURN SUM(MAP(x in NODES(path) : x.votes))

Any feedback on this is most welcome!

tl;dr; Cypher can't do it today. It's a use case that's very interesting for us to solve.

On Tue, Oct 25, 2011 at 12:02 PM, n_aschbacher wrote:
> Hi,
>
> You're correct in one sense. If I remove the path, or other columns, from
> the RETURN statement then I can get a single SUM value back for all the
> properties in the entire tree below my starting node.
>
> My problem is that I want to return multiple rows, a row for each path
> through the graph, with the SUM of the properties of the nodes traversed so
> far on that single path.
>
> The idea is that I want to know which branch in the tree of posts and
> replies has the highest total vote count.
>
> Removing other columns from the RETURN statement as you suggest will only
> give me the SUM of votes for the whole tree, not per branch.
>
> Cheers!
>
> --
> View this message in context:
> http://neo4j-community-discussions.438527.n3.nabble.com/Cypher-Aggregation-functions-specifically-SUM-tp3450203p3450996.html
> Sent from the Neo4j Community Discussions mailing list archive at
> Nabble.com.
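[Editorial sketch] The ambiguity in step 2 can be spelled out in ordinary code. The sketch below models each result row as a grouping value plus the votes of the nodes on that row's path, then computes both readings of COUNT(NODES(path)) as well as the per-row sum that the proposed SUM(MAP(...)) would produce. All names here (`AggregationReadings`, `Row`, `pathsPerGroup`, `nodesPerPath`, `votesPerPath`) are hypothetical; nothing below is Cypher or a Neo4j API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AggregationReadings {
    /** One result row: a grouping value plus the votes of the nodes on its path. */
    static final class Row {
        final String fooBar;
        final List<Integer> pathVotes;

        Row(String fooBar, Integer... pathVotes) {
            this.fooBar = fooBar;
            this.pathVotes = Arrays.asList(pathVotes);
        }
    }

    /** Reading 1: aggregate across rows, giving the number of paths per foo.bar value. */
    static Map<String, Integer> pathsPerGroup(List<Row> rows) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (Row row : rows) {
            Integer seen = counts.get(row.fooBar);
            counts.put(row.fooBar, seen == null ? 1 : seen + 1);
        }
        return counts;
    }

    /** Reading 2: aggregate inside each row, giving the number of nodes on that row's path. */
    static List<Integer> nodesPerPath(List<Row> rows) {
        List<Integer> sizes = new ArrayList<Integer>();
        for (Row row : rows) {
            sizes.add(row.pathVotes.size());
        }
        return sizes;
    }

    /** Per-row sum, the in-memory analogue of SUM(MAP(x in NODES(path) : x.votes)). */
    static List<Integer> votesPerPath(List<Row> rows) {
        List<Integer> sums = new ArrayList<Integer>();
        for (Row row : rows) {
            int sum = 0;
            for (int votes : row.pathVotes) {
                sum += votes;
            }
            sums.add(sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(new Row("x", 1, 2, 3), new Row("x", 4, 5), new Row("y", 6));
        System.out.println(pathsPerGroup(rows));
        System.out.println(nodesPerPath(rows));
        System.out.println(votesPerPath(rows));
    }
}
```

The first method collapses rows into groups, while the other two keep one output value per row; that difference is exactly what the syntax would need to make visible.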
[Neo4j] Spring Data Graph / Neo4j – Problems with OO Inheritance & Polymorphism relation semantics
Hi to all graphistas,

I have two issues, a major and a minor one:

1) Starting with SDG 1.0/1.1, I have been developing a MetaModel (and a generator, and hopefully later a MetaMetaModel :) and I have been running into walls with regard to @NodeEntity / NodeBacked class inheritance and the related @RelatedXXX feature set. Specifically, when a @NodeEntity class extends another, there is limited support for consistently relating these to other nodes (and retrieving them) based not just on the relationship name, but also on their class (and inheritance/subclass tree). To better describe things, look at this simplistic example. I have

    @NodeEntity
    public class Content {}

    @NodeEntity
    public class ExtContent extends Content {}

and

    @NodeEntity
    public class Person {
        public Set getLikedPersons() { return likedPersons; }

        @RelatedToVia(elementClass = Person.class, type = "LIKES", direction = Direction.OUTGOING)
        private Set likedPersons;

        public Iterable getLikedContent() { return likedContent; }

        @Query ("start n=node({self}) match (n) -[:LIKES]-(content) return content")
        @RelatedToVia(elementClass = Content.class, type = "LIKES", direction = Direction.OUTGOING)
        private Iterable likedContent;
    }

As you can see, I want Person objects/nodes to 'LIKES' other Person objects, or Content objects (and those below them in the inheritance tree, such as ExtContent):

    @Test
    public void oo_testing() {
        Person p1 = new Person().persist();
        Person p2 = new Person().persist();
        Content c1 = new Content().persist();
        ExtContent c2 = new ExtContent().persist();

        p1.relateTo(c1, "LIKES");
        p1.relateTo(c2, "LIKES");
        p1.relateTo(p2, "LIKES");

        for (Content cc : p1.getLikedContent())
            logger.warn("Class : " + cc.getClass().getName() + ", __type__ : " + cc.getPersistentState().getProperty("__type__"));

        assertEquals("LikedContent objects are 2", 2, p1.getLikedContent().size()); // fails, cause it returns all of 'LIKES' related nodes
    }

In version 1.1.RELEASE, retrieving any of the collections fails at runtime (can't recall the exception, but it was of the sort "wrong class found"). In version 2.0.0.M1 and today's 2.0.0.BUILD-SNAPSHOT, getLikedPersons() and getLikedContent() both contain ALL nodes that are related via the 'LIKES' relationship, irrespective of class. So, this is what p1.getLikedContent() contains:

    Class : sdnTests.domain.Content, __type__ : sdnTests.domain.Person
    Class : sdnTests.domain.Content, __type__ : sdnTests.domain.Content
    Class : sdnTests.domain.ExtContent, __type__ : sdnTests.domain.ExtContent

One would expect that, since elementClass defines the class of nodes to fill the collection with, it would be obeyed. Better still, Iterable likedContent should contain all objects/nodes of type Content AND of Content subclasses (e.g. ExtContent), since we do have this type information.

The same lack of OO/polymorphism exists with repositories, as discussed here: http://lists.neo4j.org/pipermail/user/2011-October/012654.html (I haven't tested this with 2.0.0.M01, I'm still on 1.1).

One can overcome these problems by hand and on top of SDN, but I guess it wouldn't be very elegant or efficient. I think SDN should inherently allow these OO/polymorphic features and establish itself as a lean and robust OO + Graph repository.

2) The minor issue concerns the (otherwise brilliant) @Query, due to its constraint of annotating (mainly) Iterable and NOT allowing Set, List etc. (a runtime exception is thrown: org.springframework.data.neo4j.conversion.QueryResultBuilder$1 cannot be cast to java.util.List). This wouldn't be a huge problem, but the JSP/JSTL tag DOES NOT iterate Iterable (!!!), nor can you directly call .iterator() from within JSP, making life hard on both ends.

Regards
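[Editorial sketch] As a rough idea of what "by hand and on top of SDN" could look like: the output above shows that the Java class of a returned object is not reliable (a Person node came back wrapped as Content), so the sketch below filters on the stored type name instead and checks it against the wanted class with `isAssignableFrom`, which also admits subclasses. It returns a `java.util.List`, which is a Collection and can therefore be iterated by JSTL's forEach tag. `StoredTypeFilter` and `keepAssignable` are hypothetical names; the `storedTypeName` function is an assumption standing in for something like reading the `__type__` property, and the demo uses JDK class names purely as stand-ins for domain classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class StoredTypeFilter {
    /**
     * Keeps the items whose stored type name denotes the wanted class or one of its
     * subclasses, and materializes them into a List.
     */
    static <T> List<T> keepAssignable(Iterable<T> source, Class<?> wanted,
                                      Function<T, String> storedTypeName) {
        List<T> kept = new ArrayList<T>();
        for (T item : source) {
            try {
                Class<?> stored = Class.forName(storedTypeName.apply(item));
                if (wanted.isAssignableFrom(stored)) {
                    kept.add(item);
                }
            } catch (ClassNotFoundException unknownType) {
                // A type name that cannot be resolved is treated as a non-match.
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stand-in data: each "item" is simply its own stored type name.
        List<String> related = Arrays.asList(
                "java.lang.Integer", "java.lang.Number", "java.lang.String");
        System.out.println(keepAssignable(related, Number.class, name -> name));
    }
}
```

This filters in memory after everything has been loaded, so it is a workaround for small result sets rather than a substitute for type-aware mapping in the framework.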
Re: [Neo4j] Neo4J 1.4.2 on Ubuntu 11.04
Yes, I managed to solve my problem. Apparently the problem was that I was using a read-only graph database:

    GraphDatabaseService sourceDb = new EmbeddedReadOnlyGraphDatabase("/home/fmagalhaes/GraphDB/CineastDB");

I exchanged this line for this one:

    GraphDatabaseService sourceDb = new EmbeddedGraphDatabase("/home/fmagalhaes/GraphDB/CineastDB");

and my problem disappeared. Thanks.

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Neo4J-1-4-2-on-Ubuntu-11-04-tp3439028p3452739.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
Re: [Neo4j] Last call for passengers - 23-October-2011 / 19:30h / 1st Graph Coding Dojo - Berlin
See you there :-)

Jordi.

On 25/10/2011, at 22:07, Peter Neubauer wrote:

> Thanks Pere for the effort,
>
> hope there are many graphistas showing up!
>
> Cheers,
>
> /peter neubauer
>
> GTalk: neubauer.peter
> Skype peter.neubauer
> Phone +46 704 106975
> LinkedIn http://www.linkedin.com/in/neubauer
> Twitter http://twitter.com/peterneubauer
>
> http://www.neo4j.org - NOSQL for the Enterprise.
> http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
>
> On Tue, Oct 25, 2011 at 2:51 PM, Pere Urbón Bayes wrote:
>> --- SORRY FOR THE CROSS-POSTING ---
>>
>> Hi!
>> this is the last call for passengers with destination the first
>> Graph coding dojo - Berlin.
>>
>> 23 - October - 2011 19:30h
>> co-up.de Adalbertstr. 7-8 10999 Berlin
>>
>> Bring your laptop, and some interest in doing things with graph
>> databases, graph frameworks, etc. We will provide you with the rest
>> (food, drinks, internet, etc.)
>>
>> More information at
>>
>> Event Registration: http://bit.ly/qeHf2b
>> Call for participation: http://bit.ly/ofGzoB
>>
>> Preliminary website: http://bit.ly/uA0Lem
>>
>> See you,
>>
>> --
>> /purbon
>> - @purbon
>> - http://www.purbon.com
Re: [Neo4j] Gremlin help
On Tue, Oct 25, 2011 at 12:35 PM, Peter Neubauer <peter.neuba...@neotechnology.com> wrote:

> Yes,
> that is true. We are still in QA with 1.5 GA, expect it during the
> next few weeks as we are hunting down HA potential issues. Hope it is
> ok to wait for some more days?

Sure no problem. :)

> Cheers,
>
> /peter neubauer
>
> GTalk: neubauer.peter
> Skype peter.neubauer
> Phone +46 704 106975
> LinkedIn http://www.linkedin.com/in/neubauer
> Twitter http://twitter.com/peterneubauer
>
> http://www.neo4j.org - NOSQL for the Enterprise.
> http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
>
> On Tue, Oct 25, 2011 at 2:32 PM, Nuo Yan wrote:
> > Hi Marko,
> >
> > I believe 1.5 milestone release has Gremlin 1.3 and Blueprints 1.0 but
> > before 1.5 stable release I'm going to be using 1.4.x. In 1.4.2 it only has
> > Gremlin 1.2 and doesn't appear to have the setTransactionBufferSize stuff.
> >
> > On Tue, Oct 25, 2011 at 11:52 AM, Marko Rodriguez wrote:
> >
> >> Hi,
> >>
> >> Note that with Blueprints 1.0, you do not have to deal with a commit
> >> manager. You can do:
> >>
> >>     graph.setTransactionBufferSize(50);
> >>
> >> ...and then simply do your traversal. No manager.incrCount() needed. I
> >> believe the latest Neo4j release uses Gremlin 1.3 and Blueprints 1.0. ??
> >> Peter?
> >>
> >> Take care,
> >> Marko.
> >>
> >> http://markorodriguez.com
> >>
> >> On Oct 25, 2011, at 12:43 PM, Nuo Yan wrote:
> >>
> >> > For the record, in case someone else has similar need, I came up with the
> >> > following query that does what I described in the last email below (still on
> >> > gremlin 1.2 so still using Commit Manager):
> >> >
> >> > manager = TransactionalGraphHelper.createCommitManager(g, 50);
> >> > g.v(1).out('foo').transform{[it, it.name, it.outE('bar').count()]}.aggregate().cap.next().groupBy{it[1]}.each{key,value -> value.sort{a,b -> b[2] <=> a[2]}.eachWithIndex{a,i -> if(i > 0) {g.removeVertex(a[0]); manager.incrCounter()}}}
> >> > manager.close();
> >> >
> >> > After going through this I got a lot better understanding in Gremlin. Thanks
> >> > Peter and Marko.
> >> >
> >> > On Sat, Oct 22, 2011 at 6:04 PM, Nuo Yan wrote:
> >> >
> >> >> Thanks very much Marko. I researched the query one step at a time and
> >> >> gained much more knowledge about gremlin.
> >> >>
> >> >> However, I wanted to do something a little bit different, instead of
> >> >> comparing the "name" property of the children nodes to the source node, I
> >> >> wanted to compare among the siblings of the children nodes (only first level
> >> >> under the source node) and if there are duplicates, only keep the one with
> >> >> the biggest degree of "bar" relationship. (The source node doesn't have a
> >> >> "name" property).
> >> >>
> >> >> For example,
> >> >>
> >> >> v(1) --foo--> v(2) name: "abc" --bar--> (15 nodes)
> >> >> v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
> >> >> v(1) --foo--> v(4) name: "xyz" --bar--> (15 nodes)
> >> >> v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)
> >> >>
> >> >> would become:
> >> >>
> >> >> v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
> >> >> v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)
> >> >>
> >> >> So instead of doing
> >> >>
> >> >> g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}
> >> >>
> >> >> I proposed doing:
> >> >>
> >> >> g.v(1).out("foo").transform{[it, it.name, it.out("bar").count]}.aggregate.cap
> >> >>
> >> >> to get an array of first level children nodes, their names, and degree of
> >> >> "bar" edges like [v(2), "abc", 15], [v(3), "abc", 20], [v(4), "xyz", 15],
> >> >> [v(5), "xyz", 25]
> >> >>
> >> >> And then I can sort the array by the name property, and iterate through
> >> >> that array to delete nodes that have a smaller count based on the count
> >> >> value specified in each sub array.
> >> >>
> >> >> But since my gremlin knowledge is still very limited, before digging too
> >> >> much into this proposed solution I want to verify with you that it would
> >> >> work and see if you have better or easier approach to do it (i.e. maybe one
> >> >> simple method that I can make use that I'm not aware of). Thanks very much
> >> >> again.
> >> >>
> >> >> On Sat, Oct 22, 2011 at 9:40 AM, Marko Rodriguez <okramma...@gmail.com> wrote:
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> > Currently I'm doing the following in my own code with multiple requests
> >> >>> > to the standalone neo4j server. I wonder if it's possible to achieve in one
> >> >>> > gremlin query/script so that I can post the gremlin query to the server as 1
> >> >>> > request and done. What I'm trying to achieve is:
> >> >>> >
> >> >>> > Start from one given node (e.g. v1), get all of the nodes connected
> >> >>> > thr
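[Editorial sketch] The deduplication rule discussed in this thread (among the first-level children, keep only the node with the largest "bar" degree per name) can also be expressed without sorting, in a single pass that remembers the current best node for each name. This is only an illustration in plain Java with hypothetical names (`SiblingDedup`, `Child`, `idsToRemove`); it is not Gremlin or Blueprints code, and it assumes the id, name and "bar" degree of every child have already been fetched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SiblingDedup {
    /** Hypothetical snapshot of a child vertex: its id, "name" property and "bar" out-degree. */
    static final class Child {
        final long id;
        final String name;
        final int barDegree;

        Child(long id, String name, int barDegree) {
            this.id = id;
            this.name = name;
            this.barDegree = barDegree;
        }
    }

    /** Returns the ids to delete: every child that is not the highest-degree one for its name. */
    static List<Long> idsToRemove(List<Child> children) {
        Map<String, Child> keep = new LinkedHashMap<String, Child>();
        List<Long> remove = new ArrayList<Long>();
        for (Child child : children) {
            Child current = keep.get(child.name);
            if (current == null) {
                keep.put(child.name, child);          // first child seen with this name
            } else if (child.barDegree > current.barDegree) {
                remove.add(current.id);               // new child wins, previous best goes
                keep.put(child.name, child);
            } else {
                remove.add(child.id);                 // ties keep the child seen first
            }
        }
        return remove;
    }

    public static void main(String[] args) {
        List<Child> children = Arrays.asList(
                new Child(2, "abc", 15), new Child(3, "abc", 20),
                new Child(4, "xyz", 15), new Child(5, "xyz", 25));
        System.out.println(idsToRemove(children));
    }
}
```

The single pass avoids the sort-then-skip-first step of the Gremlin one-liner, at the cost of an explicit map; which is preferable depends on how many children share a name.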
Re: [Neo4j] Gremlin help
Yes, that is true. We are still in QA with 1.5 GA; expect it during the next few weeks, as we are hunting down potential HA issues. Hope it is OK to wait a few more days?

Cheers,

/peter neubauer

GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - NOSQL for the Enterprise.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.

On Tue, Oct 25, 2011 at 2:32 PM, Nuo Yan wrote:
> Hi Marko,
>
> I believe 1.5 milestone release has Gremlin 1.3 and Blueprints 1.0 but
> before 1.5 stable release I'm going to be using 1.4.x. In 1.4.2 it only has
> Gremlin 1.2 and doesn't appear to have the setTransactionBufferSize stuff.
Re: [Neo4j] Gremlin help
Hi Marko,

I believe the 1.5 milestone release has Gremlin 1.3 and Blueprints 1.0, but until the 1.5 stable release I'm going to be using 1.4.x. 1.4.2 only has Gremlin 1.2 and doesn't appear to have the setTransactionBufferSize stuff.

On Tue, Oct 25, 2011 at 11:52 AM, Marko Rodriguez wrote:
> Hi,
>
> Note that with Blueprints 1.0, you do not have to deal with a commit
> manager. You can do:
>
>    graph.setTransactionBufferSize(50);
>
> ...and then simply do your traversal. No manager.incrCounter() needed. I
> believe the latest Neo4j release uses Gremlin 1.3 and Blueprints 1.0. ??
> Peter?
>
> Take care,
> Marko.
>
> http://markorodriguez.com
Re: [Neo4j] Gremlin help
Cool. Keep it coming Nuo!

Cheers,

/peter neubauer

On Tue, Oct 25, 2011 at 1:43 PM, Nuo Yan wrote:
> For the record, in case someone else has similar need, I came up with the
> following query that does what I described in the last email below (still on
> gremlin 1.2 so still using Commit Manager):
>
> manager = TransactionalGraphHelper.createCommitManager(g, 50);
> g.v(1).out('foo').transform{[it, it.name,
> it.outE('bar').count()]}.aggregate().cap.next().groupBy{it[1]}.each{key,value
> -> value.sort{a,b -> b[2] <=> a[2]}.eachWithIndex{a,i -> if(i > 0)
> {g.removeVertex(a[0]); manager.incrCounter()}}}
> manager.close();
>
> After going through this I got a lot better understanding in Gremlin. Thanks
> Peter and Marko.
Re: [Neo4j] Gremlin help
Hi,

Note that with Blueprints 1.0, you do not have to deal with a commit manager. You can do:

   graph.setTransactionBufferSize(50);

...and then simply do your traversal. No manager.incrCounter() needed. I believe the latest Neo4j release uses Gremlin 1.3 and Blueprints 1.0. ?? Peter?

Take care,
Marko.

http://markorodriguez.com

On Oct 25, 2011, at 12:43 PM, Nuo Yan wrote:
> For the record, in case someone else has similar need, I came up with the
> following query that does what I described in the last email below (still on
> gremlin 1.2 so still using Commit Manager):
>
> manager = TransactionalGraphHelper.createCommitManager(g, 50);
> g.v(1).out('foo').transform{[it, it.name,
> it.outE('bar').count()]}.aggregate().cap.next().groupBy{it[1]}.each{key,value
> -> value.sort{a,b -> b[2] <=> a[2]}.eachWithIndex{a,i -> if(i > 0)
> {g.removeVertex(a[0]); manager.incrCounter()}}}
> manager.close();
>
> After going through this I got a lot better understanding in Gremlin. Thanks
> Peter and Marko.
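
As a rough illustration of what a transaction buffer size of 50 means, here is a plain-Python model. It is not the Blueprints API; the CommitBuffer class and its methods are hypothetical names made up for this sketch. It only shows the counting idea: commit once every N mutations, then flush the remainder at the end.

```python
class CommitBuffer:
    """Toy stand-in for a transaction buffer (hypothetical, not Blueprints)."""

    def __init__(self, size):
        if size < 1:
            raise ValueError("size must be at least 1")
        self.size = size
        self.pending = 0   # mutations recorded since the last commit
        self.commits = 0   # number of commits issued so far

    def mutate(self):
        """Record one mutation and commit when the buffer is full."""
        self.pending += 1
        if self.pending >= self.size:
            self._commit()

    def close(self):
        """Flush whatever is left, similar in spirit to manager.close()."""
        if self.pending:
            self._commit()

    def _commit(self):
        self.commits += 1
        self.pending = 0


buffer = CommitBuffer(50)
for _ in range(120):       # 120 hypothetical vertex removals
    buffer.mutate()
buffer.close()
print(buffer.commits)      # 3: after 50, after 100, and the final flush of 20
```

With a manual commit manager the caller has to invoke the counter after every mutation; a buffer size set on the graph moves that bookkeeping behind the mutation calls themselves.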
Re: [Neo4j] Gremlin help
For the record, in case someone else has a similar need, I came up with the following query that does what I described in the last email below (still on Gremlin 1.2, so still using Commit Manager):

manager = TransactionalGraphHelper.createCommitManager(g, 50);
g.v(1).out('foo').transform{[it, it.name, it.outE('bar').count()]}.aggregate().cap.next().groupBy{it[1]}.each{key,value -> value.sort{a,b -> b[2] <=> a[2]}.eachWithIndex{a,i -> if(i > 0) {g.removeVertex(a[0]); manager.incrCounter()}}}
manager.close();

After going through this I got a much better understanding of Gremlin. Thanks Peter and Marko.

On Sat, Oct 22, 2011 at 6:04 PM, Nuo Yan wrote:

> Thanks very much Marko. I researched the query one step at a time and
> gained much more knowledge about gremlin.
>
> However, I wanted to do something a little bit different. Instead of
> comparing the "name" property of the children nodes to the source node, I
> wanted to compare among the siblings of the children nodes (only first level
> under the source node) and, if there are duplicates, only keep the one with
> the biggest degree of "bar" relationship. (The source node doesn't have a
> "name" property.)
>
> For example,
>
> v(1) --foo--> v(2) name: "abc" --bar--> (15 nodes)
> v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
> v(1) --foo--> v(4) name: "xyz" --bar--> (15 nodes)
> v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)
>
> would become:
>
> v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
> v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)
>
> So instead of doing
>
> g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}
>
> I proposed doing:
>
> g.v(1).out("foo").transform{[it, it.name, it.out("bar").count]}.aggregate.cap
>
> to get an array of first level children nodes, their names, and degree of
> "bar" edges like [v(2), "abc", 15], [v(3), "abc", 20], [v(4), "xyz", 15],
> [v(5), "xyz", 25]
>
> And then I can sort the array by the name property, and iterate through
> that array to delete nodes that have a smaller count based on the count
> value specified in each sub array.
>
> But since my gremlin knowledge is still very limited, before digging too
> much into this proposed solution I want to verify with you that it would
> work and see if you have a better or easier approach (i.e. maybe one
> simple method that I'm not aware of). Thanks very much again.
>
>
> On Sat, Oct 22, 2011 at 9:40 AM, Marko Rodriguez wrote:
>
>> Hi,
>>
>> > Currently I'm doing the following in my own code with multiple requests
>> > to the standalone neo4j server. I wonder if it's possible to achieve in one
>> > gremlin query/script so that I can post the gremlin query to the server as 1
>> > request and done. What I'm trying to achieve is:
>> >
>> > Start from one given node (e.g. v1), get all of the nodes connected
>> > through a given type of relationship (e.g. relationship "foo"), within all
>> > of these nodes, see if their "name" property has the same value, and if so,
>> > delete the node (and the "foo" relationship connected to it) with smaller
>> > outgoing degree (on a specific type of relationship, say, "bar"). If there
>> > are more than two nodes with the same "name" property, only keep the one
>> > with biggest outgoing degree (on type "bar").
>>
>> The query below is to warm you up. It will delete all vertices with the same
>> property value as the source vertex that are 'foo' related to the source vertex.
>> Given that you are mutating the graph, you will want to deal with
>> transaction buffers so you don't do one transaction per mutation:
>>    https://github.com/tinkerpop/blueprints/wiki/Graph-Transactions
>>
>> g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.sideEffect{g.removeVertex(it)}
>>
>> -
>>
>> To do the stuff with the smaller counts, etc., you can do:
>>
>> g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.transform{[it, it.outE('bar').count()]}.filter{it[1] > 0}.aggregate.cap.next().sort{a,b -> b[1] <=> a[1]}.eachWithIndex{a,i -> if(i > 0) g.removeVertex(a[0])}
>>
>> There you go! One big fatty Gremlin query to solve your problem.
>>
>> I would recommend going through each step and seeing what it returns so
>> you understand what is going on. Again, given that you are mutating the
>> graph, be sure to be wise about transactions.
>>
>> Enjoy!,
>> Marko.
>>
>> http://markorodriguez.com

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
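
For readers who want to check the grouping logic without a running graph, the sibling de-duplication can be modelled in plain Python. This is only a sketch of the group, sort and drop-the-rest steps, not Gremlin; the row tuples and the vertices_to_remove helper are hypothetical and simply mirror the [vertex, name, count] triples from the example in this thread.

```python
from collections import defaultdict

# Hypothetical rows in the shape produced by the transform step:
# (vertex id, name, number of outgoing 'bar' edges)
children = [(2, "abc", 15), (3, "abc", 20), (4, "xyz", 15), (5, "xyz", 25)]


def vertices_to_remove(rows):
    """Return ids of all siblings except the highest-degree one per name."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row)            # group siblings by name
    doomed = []
    for siblings in groups.values():
        # Highest 'bar' degree first; sorted() is stable, so on a tie the
        # sibling listed first is the one that is kept.
        ranked = sorted(siblings, key=lambda r: r[2], reverse=True)
        doomed.extend(r[0] for r in ranked[1:])
    return doomed


print(sorted(vertices_to_remove(children)))   # [2, 4]
```

The ids printed are the ones the real query would pass to removeVertex, leaving v(3) and v(5) in place.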
Re: [Neo4j] Neo4jRestNet Update
Peter,

In this case we really do want all that data, so that paged enumerator actually handles it quite nicely.

I'm not sure why you're saying that 'JSON needs to be built before sending over the request'. JSON is almost always written forward-only, and HTTP has supported streaming for a long time now. In fact, the Twitter Streaming APIs are just that - streaming JSON.

-- Tatham

-----Original Message-----
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Peter Neubauer
Sent: Tuesday, 25 October 2011 2:21 AM
To: Neo4j user discussions
Subject: Re: [Neo4j] Neo4jRestNet Update

Tatham,

yes, serialization is an issue in that the JSON needs to be built before sending over the request. What you could do is to trim the query down by not pulling out the nodes, but just the properties you need (including node IDs), which will dramatically change the amount of data transferred, see e.g.
http://docs.neo4j.org/chunked/snapshot/cypher-plugin.html#rest-api-send-queries-with-parameters
or
http://docs.neo4j.org/chunked/snapshot/gremlin-plugin.html#rest-api-group-count

Would that work?

Cheers,

/peter neubauer

On Mon, Oct 24, 2011 at 7:35 AM, Tatham Oddie wrote:
> More info:
>
> That's all transparent ... consumers of our client just enumerate the result
> like normal and see no difference. Under the covers, this gets split up into
> pages of 100 nodes loaded on-demand.
>
> Added it because we wanted to pull out 38k nodes from a query and the
> REST plugin was exploding the Java heap space trying to append the
> string. :) (Um, streaming anyone? ;))
>
>
> -- Tatham
>
>
> -----Original Message-----
> From: user-boun...@lists.neo4j.org
> [mailto:user-boun...@lists.neo4j.org] On Behalf Of Romiko Derbynew
> Sent: Monday, 24 October 2011 10:46 PM
> To: Neo4j user discussions
> Subject: Re: [Neo4j] Neo4jRestNet Update
>
> Tatham has also added support for paging queries, so queries returning more
> than 100 results are retrieved via an enumerator, 100 results at a time. This
> optimises the heap usage as well, by leveraging the Groovy take and drop
> functions.
>
> Cheers.
>
> -----Original Message-----
> From: user-boun...@lists.neo4j.org
> [mailto:user-boun...@lists.neo4j.org] On Behalf Of Tatham Oddie
> Sent: Friday, 21 October 2011 4:37 PM
> To: Neo4j user discussions
> Subject: Re: [Neo4j] Neo4jRestNet Update
>
> Hi Kan,
>
> FYI - we added parameterized Gremlin queries in our implementation and have
> seen a nice memory heap improvement on the Java side as a result.
>
> That is ... instead of:
>
>     g.v(123).outE[[label:'FOO']]
>
> we send:
>
>     {
>         query: 'g.v(p0).outE[[label:p1]]',
>         params: {
>             p0: 123,
>             p1: 'FOO'
>         }
>     }
>
> This allows the query to be cached by neo4j and then just called with
> different parameters later.
>
>
> -- Tatham
>
>
> -----Original Message-----
> From: user-boun...@lists.neo4j.org
> [mailto:user-boun...@lists.neo4j.org] On Behalf Of KanTube
> Sent: Thursday, 20 October 2011 9:36 AM
> To: user@lists.neo4j.org
> Subject: [Neo4j] Neo4jRestNet Update
>
> fwiw...
>
> I updated my .Net wrapper
> https://github.com/SepiaGroup/Neo4jRestNet/
>
> You can now use LINQ on the filter and sort commands, for example:
>
>     Node MyNode = Node.GetNode(123);
>     GremlinScript script = new GremlinScript(MyNode)
>         .Out("Like")
>         .Filter(it => it.GetProperty("MyProp").ToLowerCase() == "TestValue" ||
>                       it.GetProperty("AnotherProp").Contains("SomeValue"));
>
>     IEnumerable ReturnNodes = Gremlin.Post(script);
>
>
> You can also return a DataTable:
>
>     GremlinScript script = new GremlinScript();
>     script.NewTable("t")
>         .NodeIndexLookup(new Dictionary() { { "FirstName", "Jack" }, { "LastName", "Shaw" } })
>         .Filter(it => it.GetProperty("UID") == "jshaw")
>         .As("UserNode")
>         .OutE("Like")
>         .As("LikeRel")
>         .InV()
>         .As("FriendNode")
>         .Table("t", "UserNode", "LikeRel", "FriendNode")
>         .Append("{{it}}{{it.getProperty('{0}')}}{{it.getProperty('{1}')}} >> -1; t", "Date", "FirstName");
>
>     DataTable tbl = Gremlin.GetTable(script);
>
>
> The above table example will submit the following Gremlin script to Neo4j:
>
>     t = new Table();
>     g.V[['FirstName':'Jack','LastName':'Shaw']]
>         .filter{it.getProperty('UID') == 'jshaw'}
>         .as('UserNode')
>         .outE('Likes')
>         .as('LikeRel')
>         .inV()
>         .as('FriendNode')
>         .table(t, ['UserNode','LikeRel','FriendNode']){it}{it.getProperty('Date')}{it.getProperty('FirstName')} >> -1; t");
>
> And the returned table will have column names of "UserNode",
> "LikeRel", "FriendNode" with typed value
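
The two ideas discussed in this thread, sending the script separately from its parameters and loading results in pages of 100, can be sketched in plain Python. This is a minimal illustration, not the Neo4jRestNet client or the Neo4j REST API: gremlin_payload, paged and the fake in-memory result set are hypothetical names, and the "query"/"params" keys simply follow the example quoted above.

```python
import json


def gremlin_payload(script, **params):
    """Build a request body that keeps the script text and its values apart."""
    return json.dumps({"query": script, "params": params}, sort_keys=True)


def paged(fetch_page, page_size=100):
    """Yield results one by one, calling fetch_page(skip, take) once per page."""
    skip = 0
    while True:
        page = fetch_page(skip, page_size)
        yield from page
        if len(page) < page_size:   # a short page means nothing is left
            return
        skip += page_size


# Hypothetical server side: 250 results held in memory.
results = list(range(250))
calls = []


def fake_fetch(skip, take):
    calls.append((skip, take))
    return results[skip:skip + take]


body = gremlin_payload("g.v(p0).outE[[label:p1]]", p0=123, p1="FOO")
print(body)
print(len(list(paged(fake_fetch))), len(calls))   # 250 results in 3 requests
```

Because the script text stays identical across calls, a server can cache the compiled query and only the small params object changes; the generator keeps at most one page in memory on the client side.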
Re: [Neo4j] Neo4jRestNet Cypher Plugin
This looks very cool Kan! Looking forward to getting some feedback from the great .NET guys around here! When that is stabilizing, maybe the README could include a couple of examples - I could help out there.

Also, does anyone have automatic installation scripts for pulling down and starting up Neo4j, maybe via NuGet?

Cheers,

/peter neubauer

On Mon, Oct 24, 2011 at 1:23 PM, KanTube wrote:
> I have pushed a code update that implements the Cypher plugin
>
> https://github.com/SepiaGroup/Neo4jRestNet
>
> The interface supports all of the START, MATCH, WHERE and RETURN clauses
> (except the Type function on the Where clause; coming soon). Below are some
> examples. The last two examples show an alternate syntax that can be used
> with any clause. I have not implemented Order by, Skip, Limit or
> functions yet.
>
> Romiko/Tatham, if you can review the syntax and make suggestions I will try
> to implement them, otherwise you should be able to incorporate this into
> your code with minimal modifications. Also you may want to look at how I
> parse the Expression tree. I have not had time to review the update you
> just posted for paging queries but that sounds very interesting.
>
> // Basic Cypher query
> CypherQuery c1 = new CypherQuery();
>
> c1.Start(s => s.Node("A", 0));
> c1.Return(r => r.Node("A"));
>
> DataTable tbl = Cypher.Post(c1);
>
> // c1.ToString() = "START A=node(0) RETURN A"
>
>
> // Cypher with Match clause
> CypherQuery c2 = new CypherQuery();
>
> c2.Start(s => s.Node("A", 0));
> c2.Match(m => m.Node("A").To("r", "Likes").Node("B"));
> c2.Return(r => r.Node("A").Relationship("r").Node("B"));
>
> tbl = Cypher.Post(c2);
>
> // c2.ToString() = "START A=node(0) MATCH (A) -[r:Likes]-> (B) RETURN A, r, B"
>
>
> // Cypher with multi start and return optional property
> CypherQuery c3 = new CypherQuery();
> c3.Start(s => s.Node("A", 0, 1));
> c3.Match(m => m.Node("A").Any("r", "Likes").Node("C"));
> c3.Return(r => r.Node("C").Node("C").Property("Name?"));
>
> tbl = Cypher.Post(c3);
>
> // c3.ToString() = "START A=node(0,1) MATCH (A) -[r:Likes]- (C) RETURN C, C.Name?"
>
> // Multi Start
> CypherQuery c4 = new CypherQuery();
> c4.Start(s => s.Node("A", 0).Node("B", 1));
> c4.Return(r => r.Node("A").Node("B"));
>
> tbl = Cypher.Post(c4);
>
> // c4.ToString() = "START A=node(0), B=node(1) RETURN A, B"
>
> // Cypher with Where clause
> CypherQuery c5 = new CypherQuery();
> c5.Start(s => s.Node("A", 0, 1));
> c5.Where(w => w.Node("A").Property("Age?") < 30 &&
>               w.Node("A").Property("Name?") == "Tobias" ||
>               !(w.Node("A").Property("Name?") == "Tobias"));
> c5.Return(r => r.Node("A"));
>
> tbl = Cypher.Post(c5.ToString());
>
> // c5.ToString() = "START A=node(0,1) WHERE A.Age? < 30 and A.Name? = 'Tobias' or not(A.Name? = 'Tobias') RETURN A"
>
> // Alt syntax
> CypherQuery c6 = new CypherQuery();
> c6.Start(s => {
>     s.Node("A", 0);
>     s.Node("B", 1);
>     return s;
> });
>
> c6.Return(r => {
>     r.Node("A");
>     r.Node("B");
>     return r;
> });
>
> tbl = Cypher.Post(c6);
>
> // c6.ToString() = "START A=node(0), B=node(1) RETURN A, B"
>
> // Alt syntax
> CypherQuery c7 = new CypherQuery();
> c7.Start(s => s.Node("A", 0));
> c7.Start(s => s.Node("B", 1));
>
> c7.Return(r => r.Node("A"));
> c7.Return(r => r.Node("B"));
>
> tbl = Cypher.Post(c7);
> // c7.ToString() = "START A=node(0), B=node(1) RETURN A, B"
>
>
>
> --
> View this message in context:
> http://neo4j-community-discussions.438527.n3.nabble.com/Neo4jRestNet-Cypher-Plugin-tp3449054p3449054.html
> Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
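
To make the "accumulate clauses, then render a string" idea behind such a fluent interface concrete, here is a very small Python model. It is not the Neo4jRestNet API and does not parse expression trees; the class and its start/returns methods are hypothetical and cover only START and RETURN, which is enough to reproduce the c4 and c7 strings quoted above.

```python
class CypherQuery:
    """Tiny illustrative builder (hypothetical; not the Neo4jRestNet API)."""

    def __init__(self):
        self._starts = []    # e.g. "A=node(0)"
        self._returns = []   # e.g. "A"

    def start(self, name, *node_ids):
        """Add one start point; repeated calls append further start points."""
        ids = ",".join(str(i) for i in node_ids)
        self._starts.append(f"{name}=node({ids})")
        return self          # returning self is what allows chaining

    def returns(self, *names):
        """Add one or more identifiers to the RETURN clause."""
        self._returns.extend(names)
        return self

    def __str__(self):
        return f"START {', '.join(self._starts)} RETURN {', '.join(self._returns)}"


chained = CypherQuery().start("A", 0).start("B", 1).returns("A", "B")
print(chained)    # START A=node(0), B=node(1) RETURN A, B

stepwise = CypherQuery()
stepwise.start("A", 0)
stepwise.start("B", 1)
stepwise.returns("A")
stepwise.returns("B")
print(stepwise)   # START A=node(0), B=node(1) RETURN A, B
```

Since every call only appends to a list, the chained form and the statement-by-statement form render the same query, which is the property the two alternate syntaxes in the quoted message rely on.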
[Neo4j] cypher feature suggestion
Hi,

A nice addition to Cypher's aggregation functions (AVG, SUM, etc.) would be a percentile function. Oracle offers percentile_cont and percentile_disc for this.

http://en.wikipedia.org/wiki/Percentile

A use case would be to ask for a certain percentile (e.g. the 90th) of a 'time elapsed' property on a set of selected 'helpdesk call' nodes with status 'closed'. The result is the time within which 90% of the closed 'helpdesk call' nodes were solved; the remaining 10% of the selected nodes have a 'time elapsed' larger than the result.

The median, which you may already know, is simply the 50th percentile.

Greetings,
Filip

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
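
To show what the two variants would compute, here is a small Python sketch. It follows the usual definitions as I understand them (the discrete form returns an actual sample value, the continuous form interpolates linearly between neighbours); the function names echo the Oracle ones, but the code and the sample 'time elapsed' values are only an illustration, not a proposal for Cypher syntax.

```python
import math


def percentile_disc(values, percent):
    """Smallest sample value with at least `percent` % of the values at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Integer ceiling of percent * n / 100, converted to a 0-based index.
    rank = -(-percent * len(ordered) // 100)
    return ordered[max(rank - 1, 0)]


def percentile_cont(values, percent):
    """Linear interpolation at position percent/100 * (n - 1) in the sorted values."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    position = percent * (len(ordered) - 1) / 100
    lower, upper = math.floor(position), math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical 'time elapsed' values (hours) of ten closed helpdesk calls.
elapsed_hours = [1, 2, 2, 3, 4, 5, 8, 13, 21, 34]

print(percentile_disc(elapsed_hours, 90))             # 21
print(round(percentile_cont(elapsed_hours, 90), 2))   # 22.3
print(percentile_cont(elapsed_hours, 50))             # 4.5 (the median)
```

In this sample, nine of the ten calls were closed within 21 hours, which is the figure the helpdesk use case is after.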
[Neo4j] Urgent: 1.4.2 Github tag does not seem to match the 1.4.2 distro JAR
When attempting to debug an issue with the index framework, the debugger is clearly on the wrong source lines, so I suspect there's some type of mismatch. Thoughts?
Re: [Neo4j] Default Analyzer in Index Framework
Anyone able to provide some insights on this? Thanks. From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On Behalf Of Rick Bullotta [rick.bullo...@thingworx.com] Sent: Monday, October 24, 2011 6:16 PM To: Neo4j user discussions Subject: [Neo4j] Default Analyzer in Index Framework When not using fulltext indexing, what Lucene Analyzer class does Neo4J use? It seems that non-fulltext index searches are case sensitive - we'd like to change that behavior. Thanks for any help/guidance/examples! Rick
Re: [Neo4j] Cypher Aggregation functions - specifically SUM()
Hi, You're correct in one sense. If I remove the path, or other columns, from the RETURN statement then I can get a single SUM value back for all the properties in the entire tree below my starting node. My problem is that I want to return multiple rows, a row for each path through the graph, with the SUM of the properties of the nodes traversed so far on that single path. The idea is that I want to know which branch in the tree of posts and replies has the highest total vote count. Removing other columns from the RETURN statement as you suggest will only give me the SUM of votes for the whole tree, not per branch. Cheers! -- View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Cypher-Aggregation-functions-specifically-SUM-tp3450203p3450996.html
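The per-path result described above (one row per path from the starting node, each with the running sum of votes) can be sketched outside Cypher as a plain tree walk. The Python below is only an illustration under stated assumptions: the in-memory dictionaries `votes` and `replies`, and the function names `path_totals` and `heaviest_branch`, are hypothetical stand-ins for data fetched from the graph, not part of Neo4j or Cypher.

```python
def path_totals(votes, replies, root):
    """Return one (path, total) row for every path that starts at root.

    votes   -- hypothetical dict: node id -> vote count
    replies -- hypothetical dict: node id -> list of child node ids
    """
    rows = []
    # Each stack entry carries the path so far and its running vote total,
    # so the sum is accumulated while traversing rather than afterwards.
    stack = [([root], votes[root])]
    while stack:
        path, total = stack.pop()
        rows.append((path, total))
        for child in replies.get(path[-1], []):
            stack.append((path + [child], total + votes[child]))
    return rows


def heaviest_branch(votes, replies, root):
    """Return the (path, total) row with the highest vote total."""
    return max(path_totals(votes, replies, root), key=lambda row: row[1])


# Made-up tree of posts and replies: 0 is the original post.
votes = {0: 3, 1: 5, 2: 1, 3: 7, 4: 2}
replies = {0: [1, 2], 1: [3], 2: [4]}

for path, total in path_totals(votes, replies, 0):
    print(path, total)
print(heaviest_branch(votes, replies, 0))
```

Carrying the running total along with each path is what keeps the traversal and the aggregation interleaved instead of running as two serial passes; with non-negative vote counts, the heaviest row always ends at a leaf.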