Re: [Neo4j] neo4j spatial bounding box vs. lat/lon
Hi Boris,

I can see the new update method here:
https://github.com/neo4j/neo4j-spatial/blob/master/src/main/java/org/neo4j/gis/spatial/server/plugin/SpatialPlugin.java#L138

And the commit for it is here:
https://github.com/neo4j/neo4j-spatial/commit/22eaf91957a6265ef1e6923b5da572b75383b83e

Hope that helps. Let me know if this works. The REST method is entirely untested, but it does wrap code that is tested, so I'm relatively optimistic :-)

Regards, Craig

On Wed, Jul 6, 2011 at 1:51 AM, Boris Kizelshteyn bo...@popcha.com wrote:

Hi Craig,

This is awesome! Where is the update method? I can't find the code on github.

Thanks!

On Sat, Jul 2, 2011 at 6:00 PM, Craig Taverner cr...@amanzi.com wrote:

As I understand it, Andreas is working on the much more complex problem of updating OSM geometries. That is more complex because it involves restructuring the connected graph. The case Boris has is much simpler: just modifying the WKT or WKB in the editable layer. In the Java API this is simply a call to the GeometryEncoder.encodeGeometry() method, which will modify the geometry in place (i.e. replace the old geometry with a new one). However, I do not think it is that simple on the REST interface. I can check, but I think we will need a new method for updating geometries. Internally it is trivial to code.

So I just added a quick method, called updateGeometryFromWKT, which requires the geometry (in WKT), the existing geometry node-id, and the layer. Give it a try.

On Sat, Jul 2, 2011 at 5:10 PM, Peter Neubauer neubauer.pe...@gmail.com wrote:

Actually, Andreas Wilhelm is working right now on updating geometries.

On Jul 2, 2011 5:00 PM, Boris Kizelshteyn bo...@popcha.com wrote:

Wow that's great! I'll try it out asap.

This leads to my next question: how do I update the geometry in a layer, rather than add a new one? What I am thinking of doing is having a multipoint geometry associated with each of my user nodes, which will represent their location history. My plan is to add the geometry to a world layer and then associate the returned node with the user. How do I then add new points to that connector node? Can I just edit the WKT and assume the index will update? Or do you have a better suggestion for doing this? I would rather avoid having each point be a separate node: I am tracking GPS data and getting lots of coordinates, so it would be many thousands of nodes per user.

Many thanks!

On Sat, Jul 2, 2011 at 6:48 AM, Craig Taverner cr...@amanzi.com wrote:

Hi Boris,

Ah! You are using the REST API. That changes a lot, since Neo4j Spatial is only recently exposed in REST and we do not expose most of the capabilities I have discussed in this thread, or indeed in my other answer today.

I did recently add some REST methods that might work for you, specifically addEditableLayer, which makes a WKB layer, and addGeometryWKTToLayer, for adding any kind of Geometry (including LineString) to the layer. However, these were only added recently, and I have no experience using them myself, so consider this very much prototype code.

From your other question today, can I assume you are having trouble making sense of the data coming back? So we need a better way to return the results in WKT instead of WKB? One option would be to enhance the addEditableLayer method to allow the creation of WKT layers instead of WKB layers, so the internal representation is more internet friendly.

I've just added support for setting the format to WKT for the internal representation of the editable layer in the REST interface. This is untested (outside of my usual unit tests, that is), and is only in the trunk of neo4j-spatial, but you are welcome to try it out and see what happens.

Regards, Craig

On Fri, Jul 1, 2011 at 5:29 PM, Boris Kizelshteyn bo...@popcha.com wrote:

Hi Craig,

Thanks so much for this reply. It is very insightful. Is it possible for me to implement the LineString geometries and lookups using REST?

Many thanks!

On Wed, Jun 8, 2011 at 4:58 PM, Craig Taverner cr...@amanzi.com wrote:

OK. I understand much better what you want now. Your person nodes are not geographic objects; they are persons that can be at many positions and indeed move around. However, the 'path' that they take is a geographic object and can be placed on the map and analysed geographically. So the question I have is how do you store the path the person takes? Is this a bunch of position nodes connected back to that person? Or perhaps a chain of
Re: [Neo4j] neo4j + RDF + SPARQL or non-RDF?
I'd say that you can often use nodes and relationships without URIs, although you may want some concept of IDs other than the internal ids of nodes and relationships. Data stored in neo4j can often be seen as triple-like statements:

    (personA)--[KNOWS]--(personB)

but that's just the simplest form. For example:

    (personA)--[KNOWS]--(personB)--[MARRIED_TO]--(personC) | [LIVES_IN] | v (Sweden)

You can traverse that as one graph, whereas each node-relationship-node could in this setting be viewed as one triple in RDF terms. I think RDF needlessly limits graph capabilities to triples, and neo4j (and fellow property graphs) do not.

Also check out the Cypher query language: http://docs.neo4j.org/chunked/1.4.M06/cypher-query-lang.html

2011/7/3 noppanit noppani...@gmail.com

Hi Folks,

I'm very interested in RDF and SPARQL, but I'm a total newbie. However, neo4j can do the same thing without having to use RDF or SPARQL to traverse the graph. Would it be better to use RDF to store the graph with URIs, or can I just ignore that and use pure nodes and relationships to store the data in a triple-like structure? And what do the neo4j guys think about RDF and the direction of neo4j with the semantic web? I think neo4j is the perfect tool for the semantic web.

Cheers, Toy.

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
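As an illustration of the point above that each node-relationship-node step can be read as one triple while the whole thing is still traversed as one graph, here is a minimal Python sketch. It is not Neo4j or RDF API code: TRIPLES, build_adjacency and reachable are names made up for this example, and attaching LIVES_IN to personB is an assumption made only for the sketch.

```python
from collections import defaultdict

# Each edge of the example graph written as one (subject, predicate, object)
# statement. Which person LIVES_IN Sweden is an assumption for this sketch.
TRIPLES = [
    ("personA", "KNOWS", "personB"),
    ("personB", "MARRIED_TO", "personC"),
    ("personB", "LIVES_IN", "Sweden"),
]


def build_adjacency(triples):
    """Index the statements by subject so they can be walked as one graph."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        adjacency[subject].append((predicate, obj))
    return adjacency


def reachable(adjacency, start):
    """Return every node reachable from start by following outgoing edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _predicate, neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen
```

Calling reachable(build_adjacency(TRIPLES), "personA") walks across all three statements in one go, which is the kind of multi-hop question a list of isolated triples does not answer by itself.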
Re: [Neo4j] Possibility of extending Zookeeper features to App Server
The role of Zookeeper in neo4j HA is just master election and instance discovery. It is not involved in the actual communication between instances. Knowing that, are your questions still valid?

2011/7/1 Brendan Cheng ccp...@gmail.com

Hi,

I'm very interested in your HA architecture and wonder whether it is possible to extend the Zookeeper features to cover the jobs of an app server, so that we can have a much simpler architecture. The jobs of the app server include user authentication, an encryption service for communication, etc.

From your experience, is this going to work, or what is a preferred architecture?

Regards, Brendan

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
[Neo4j] Start script of 1.4.M05/1.4.M06 fails with older versions of bash + FIX
Hi everyone,

a rather old Linux installation on our build server led us to find out that the new start script introduced in M05 (?) does not work with all versions of bash. We got:

    cruise:/virtual/hudson/hudson_home/jobs/graphdb/workspace# /opt/neo4j/bin/neo4j start
    /opt/neo4j/bin/neo4j: line 37: syntax error in conditional expression: unexpected token `('
    /opt/neo4j/bin/neo4j: line 37: syntax error near `^(['
    /opt/neo4j/bin/neo4j: line 37: ` if [[ ${line} =~ ^([^#\s][^=]+)=(.+)$ ]]; then'

On a system with this info:

    ext-xecruise52-1:/opt/neo4j/bin# cat /proc/version
    Linux version 2.6.28.7-ibm-x3650 (root@obc-fai42-1) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Thu Feb 26 13:50:31 CET 2009

Here is the quick fix I just found (no patch, since I don't want to suggest I know it works on other system versions...). Enclose the regexps on line 37ff in quotes, like so:

    if [[ ${line} =~ "^([^#\s][^=]+)=(.+)$" ]]; then
      key=`echo ${BASH_REMATCH[1]} | sed 's/\./_/g'`
      value=${BASH_REMATCH[2]}
      if [[ ${key} =~ "^(.*)_([0-9]+)$" ]]; then

Cheers, Stephan
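For reference, the parsing that line 37ff of the start script performs can be sketched in Python. This is only an illustration of what the two patterns are meant to match, assuming `\s` is intended to mean "whitespace"; it is not the script itself, and parse_line, LINE_RE and INDEXED_KEY_RE are names invented for this sketch.

```python
import re

# Same pattern as line 37 of the start script, written for Python's re module.
LINE_RE = re.compile(r"^([^#\s][^=]+)=(.+)$")
# Pattern applied to keys that end in a numeric suffix, e.g. some_key_1.
INDEXED_KEY_RE = re.compile(r"^(.*)_([0-9]+)$")


def parse_line(line):
    """Return (key, value) for a key=value line, or None for comments/blanks."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    # Mirrors the sed call in the script: every dot in the key becomes "_".
    key = match.group(1).replace(".", "_")
    return key, match.group(2)
```

A line such as "some.key.1=value" (a made-up key) would come back as ("some_key_1", "value"), and INDEXED_KEY_RE then splits that key into its base name and its numeric suffix. Lines starting with "#" are skipped.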
[Neo4j] REST Batch: failed to insert empty array as a node property
When POSTing something like the following to the batch path...

    [{"id":1, "method":"POST", "to":"/node", "body":{"user_properties":[]} }]

...the server fails with...

    "exception" : "java.lang.RuntimeException",
    "stacktrace" : [
      "org.neo4j.server.rest.web.BatchOperationService.performJob(BatchOperationService.java:137)",
      "org.neo4j.server.rest.web.BatchOperationService.performBatchOperations(BatchOperationService.java:83)",
      "sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)",
      "sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)",
      "sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)",
      "java.lang.reflect.Method.invoke(Unknown Source)"
    etc.

Is this a bug, or have I missed something in the Neo4j docs?
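Until the cause is confirmed, one possible client-side workaround is to leave empty arrays out of the property map before building the batch job. The Python sketch below assumes that the empty array is what triggers the failure, which this report does not establish; strip_empty_arrays and batch_create_node are hypothetical helper names, and the job fields (id, method, to, body) are copied from the request shown above.

```python
import json


def strip_empty_arrays(properties):
    """Return a copy of a property map without empty-list values."""
    return {key: value for key, value in properties.items() if value != []}


def batch_create_node(job_id, properties):
    """Build one batch job that creates a node, shaped like the request above."""
    return {
        "id": job_id,
        "method": "POST",
        "to": "/node",
        "body": strip_empty_arrays(properties),
    }


# The request body that would be POSTed to the batch path.
payload = json.dumps([batch_create_node(1, {"user_properties": []})])
```

The trade-off is that a missing property and an empty array are no longer distinguishable on the server side, so readers of the node have to treat "absent" as "empty".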
[Neo4j] Performance issue on nodes with lots of relationships
I have a graph with roughly 10M nodes. Some of these nodes are highly connected to other nodes. For example, I may have a single node with 1M+ relationships. A good analogy is a population that has a lives-in relationship to a state.

Now the problem... Both Neoclipse and neo4j-shell are terribly slow when working with these nodes. In the shell I would expect a `cd node-id` to be very fast, much like selecting via a rowid in a standard DB. Instead, I usually see a delay of several seconds. Doing an `ls` takes so long that I usually have to just kill the process. In fact, `ls` never outputs anything, which is odd since I would expect it to stream the output as it finds it.

I have very similar performance issues with Neoclipse. I am using Neo4j 1.3 embedded on Ubuntu 10.04 with 4GB of RAM.

Disclaimer: I am new to Neo4j.

Thanks, Andrew
[Neo4j] Lifecycle of a Graph ?
Can anyone explain to me the life cycle of a graph with Neo4j Spring Data Graph? I want to load multiple copies of a persisted graph in memory, but somehow the second instance returns null. I am not sure if I am doing something wrong or if this is related to the graph lifecycle.

Any comments or suggestions, please.

Karan
Re: [Neo4j] Lifecycle of a Graph ?
Could you explain how you load it into memory, and what you do that returns null?

The graph itself has no lifecycle. Graph entities are attached to the nodes and relationships when you load them (via repositories, Cypher or directly, or when you navigate along relationships), and they allow write-through in transactions and read-through all the time. They become detached when you leave the tx and start changing properties. Newly created entities are also detached.

Cheers, Michael

On 06.07.2011 at 16:20, V wrote:

Can anyone explain to me the life cycle of a graph with Neo4j Spring Data Graph? I want to load multiple copies of a persisted graph in memory, but somehow the second instance returns null. I am not sure if I am doing something wrong or if this is related to the graph lifecycle.

Any comments or suggestions, please.

Karan
Re: [Neo4j] Performance issue on nodes with lots of relationships
Andrew,

could you please also try to access the graph via the latest milestone, 1.4.M06, to see if things have improved?

Does this behaviour affect only the supernodes, or every node in your graph (e.g. when you access, cd, or ls a person node)?

We've been discussing some changes to the initial loading/caching that might improve performance on heavily connected (super-)nodes. If our changes and tests are successful, these changes will be integrated in early 1.5 milestones.

Cheers, Michael

On 06.07.2011 at 16:15, Andrew White wrote:

I have a graph with roughly 10M nodes. Some of these nodes are highly connected to other nodes. For example, I may have a single node with 1M+ relationships. A good analogy is a population that has a lives-in relationship to a state.

Now the problem... Both Neoclipse and neo4j-shell are terribly slow when working with these nodes. In the shell I would expect a `cd node-id` to be very fast, much like selecting via a rowid in a standard DB. Instead, I usually see a delay of several seconds. Doing an `ls` takes so long that I usually have to just kill the process. In fact, `ls` never outputs anything, which is odd since I would expect it to stream the output as it finds it.

I have very similar performance issues with Neoclipse. I am using Neo4j 1.3 embedded on Ubuntu 10.04 with 4GB of RAM.

Disclaimer: I am new to Neo4j.

Thanks, Andrew
Re: [Neo4j] Performance issue on nodes with lots of relationships
Hi, Michael.

Are you thinking maybe of lazily loading relationships in 1.5? That might be a huge boost.

Rick
Re: [Neo4j] Performance issue on nodes with lots of relationships
Andrew,

if you upgrade to 1.4.M06, your shell should be able to run Cypher to count the relationships of a node without returning them:

    start n=(1) match (n)-[r]-(x) return count(r)

or something along these lines. Try that several times to see if cold caches are initially slowing things down.

With `ls` and in Neoclipse, the output and visualization will be slow for that amount of data.

Cheers,

/peter neubauer
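The "try that several times" advice can be turned into a small timing harness. The Python sketch below is generic and does not talk to Neo4j itself: time_repeated is a name made up here, and the operation passed in is assumed to be whatever callable runs the count query in your setup.

```python
import time


def time_repeated(operation, runs=5):
    """Call operation() several times and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        started = time.perf_counter()
        operation()
        durations.append(time.perf_counter() - started)
    return durations
```

If the first duration is much larger than the rest, a cold cache is a likely suspect; if every run is equally slow, the cost is probably paid on each access rather than only on the first one.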
Re: [Neo4j] Performance issue on nodes with lots of relationships
This is consistently slow. I made a graph which just goes off of the root reference node (0) and I am seeing the following...

    (0)$ cd 1    about 1 minute
    (1)$ cd 0    instant
    (0)$ cd 1    about 1 minute

It's almost like it is scanning the entire relationship list before actually looking up the next node.

Of note, I have found the following when running Neoclipse...

    WARNING: [/path/to/neo4j-db/neostore.relationshipstore.db] Unable to memory map

And I see this in the logs...

    neostore.nodestore.db.mapped_memory=20M
    neostore.propertystore.db.arrays.mapped_memory=130M
    neostore.propertystore.db.index.keys.mapped_memory=1M
    neostore.propertystore.db.index.mapped_memory=1M
    neostore.propertystore.db.mapped_memory=90M
    neostore.propertystore.db.strings.mapped_memory=130M
    neostore.relationshipstore.db.mapped_memory=100M

Am I missing something obvious? Even without memory maps, I would expect this to be somewhat faster, since reading 156MB (the size of my neostore.relationshipstore.db file) of relationship data should be very fast.

Also, is there any way to do a pre-warm-up so that the first hit isn't so slow? I would hate for my first user in PROD to get hammered because a cache wasn't warmed up.

Thanks, Andrew

On 07/06/2011 09:24 AM, Rick Bullotta wrote:

Hi, Andrew.

In general, this scenario (1 million+ relationships on a node) can be slow, but usually only the first time you access the node. If you're only accessing the node once in a session, then yes, it will seem sluggish.

The Neoclipse issue is probably a combination of two issues: the first is lazily loading the node information the first time, and the second is the visual rendering of all those relationships.

Rick
Re: [Neo4j] neo4j spatial bounding box vs. lat/lon
Ah ha ... the reason I couldn't find it is that there is a typo: the method is named udpateGeometryFromWKT (the p and d are switched) :)

However, I rebuilt it and still do not see the method in the REST extensions after moving everything from /target/dependency to plugins. Any thoughts?

Thanks!
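For the location-history idea discussed in this thread (one MULTIPOINT per user, updated in place through the new update method), the client still has to produce the new WKT string itself. The Python sketch below shows one way to append a point to an existing MULTIPOINT. append_point and update_request are hypothetical helpers, and the parameter names in update_request are assumptions derived from the description "the geometry (in WKT), the existing geometry node-id, and the layer"; check the plugin source for the real names.

```python
import re

MULTIPOINT_RE = re.compile(r"\s*MULTIPOINT\s*\((.*)\)\s*", re.S)


def append_point(multipoint_wkt, lon, lat):
    """Return a new MULTIPOINT WKT string with (lon, lat) appended."""
    match = MULTIPOINT_RE.fullmatch(multipoint_wkt)
    if match is None:
        raise ValueError("not a MULTIPOINT WKT string")
    body = match.group(1).strip()
    if not body:
        return f"MULTIPOINT (({lon} {lat}))"
    # Keep whichever point style the input already uses: "(x y)" or bare "x y".
    point = f"({lon} {lat})" if body.startswith("(") else f"{lon} {lat}"
    return f"MULTIPOINT ({body}, {point})"


def update_request(layer, geometry_node_id, wkt):
    """Assemble the arguments for the update call (names are assumptions)."""
    return {"geometry": wkt, "geometryNodeId": geometry_node_id, "layer": layer}
```

Note that "MULTIPOINT EMPTY" is not handled and raises ValueError. Rewriting the whole string on every GPS fix also means the WKT grows without bound, so a real tracker would probably cap or batch the history.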
Re: [Neo4j] Data Federation
Hi John,

> But if I try to do a distributed join, aren't I hit with having to transfer more data over the wire?

Yes, you're right - that's one penalty of having a graph distributed. Each time you hit a relationship that crosses a machine boundary, the latency is way higher than if you were traversing locally within an instance.

> I am not sure if we need auto sharding. My data is already in place in legacy systems.

OK, then that's better - you've already sharded. But if your data is already housed in legacy systems, you'd have to export it into Neo4j, since Neo4j is a database and not a triple store API layer.

> I am no expert by any means, but my understanding is that Map-reduce is for data that is not interconnected. That is, you run each map completely independently on each shard.

Map reduce can take its data from anywhere (in theory). But map-reduce is a batch oriented programming pattern (with Hadoop being a popular implementation of that pattern), whereas neo4j is a database - a box of tricks that allows you to store and retrieve highly connected data efficiently.

But now I've puzzled myself - I get the sense that you might well do your processing in Hadoop rather than export data into Neo4j and then process it as separate graphs.

Jim
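The latency penalty for relationships that cross machines can be made concrete with a back-of-the-envelope model. The Python sketch below is only arithmetic: traversal_cost is a made-up name, and the per-hop latencies are parameters you would have to measure yourself, not figures from this thread.

```python
def traversal_cost(local_hops, remote_hops, local_latency, remote_latency):
    """Total cost of a traversal as the sum of per-hop latencies.

    All latencies share one unit (for example microseconds); the values
    are inputs to the model, not measurements.
    """
    return local_hops * local_latency + remote_hops * remote_latency
```

With placeholder values where a remote hop costs 1000 times a local one, a ten-hop traversal with a single remote hop, traversal_cost(9, 1, 1, 1000), is dominated almost entirely by that one hop. This is why the way the data is partitioned matters more than the total number of hops.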
Re: [Neo4j] Start script of 1.4.M05/1.4.M06 fails with older versions of bash + FIX
Thanks for that, Stephan. I've dropped it into our QA backlog for the 1.4 GA release.

Jim

On 6 Jul 2011, at 12:06, Stephan Hagemann wrote:

Hi everyone,

a rather old Linux installation on our build server led us to find out that the new start script introduced in M05 (?) does not work with all versions of bash. We got:

    cruise:/virtual/hudson/hudson_home/jobs/graphdb/workspace# /opt/neo4j/bin/neo4j start
    /opt/neo4j/bin/neo4j: line 37: syntax error in conditional expression: unexpected token `('
    /opt/neo4j/bin/neo4j: line 37: syntax error near `^(['
    /opt/neo4j/bin/neo4j: line 37: ` if [[ ${line} =~ ^([^#\s][^=]+)=(.+)$ ]]; then'

On a system with this info:

    ext-xecruise52-1:/opt/neo4j/bin# cat /proc/version
    Linux version 2.6.28.7-ibm-x3650 (root@obc-fai42-1) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Thu Feb 26 13:50:31 CET 2009

Here is the quick fix I just found (no patch, since I don't want to suggest I know it works on other system versions...). Enclose the regexps on line 37ff in quotes, like so:

    if [[ ${line} =~ "^([^#\s][^=]+)=(.+)$" ]]; then
      key=`echo ${BASH_REMATCH[1]} | sed 's/\./_/g'`
      value=${BASH_REMATCH[2]}
      if [[ ${key} =~ "^(.*)_([0-9]+)$" ]]; then

Cheers, Stephan
Re: [Neo4j] Performance issue on nodes with lots of relationships
Hi Rick,

> Are you thinking maybe of lazily loading relationships in 1.5? That might be a huge boost.

Added to the backlog to be discussed for inclusion in 1.5.

Jim
Re: [Neo4j] Performance issue on nodes with lots of relationships
Andrew,

can you by chance share your graph-db or perhaps your generator script? Then we could evaluate that and see where the performance hit occurs.

Neo4j-shell checks the connectedness of the graph so that you can't get lost while navigating. Could you try `cd -a 1`? (This does absolute jumps without checking connectedness.)

Are those logs you showed from Neoclipse as well, or from messages.log in the graph-db directory? The "unable to memory map" warning does not sound good; that shouldn't be a problem on Ubuntu.

Cheers, Michael
Re: [Neo4j] Performance issue on nodes with lots of relationships
Logs are attached. I am using the Sun 64-bit HotSpot JVM (see logs).

For this particular graph I simply have a single root reference node (0) and millions of nodes, each with a 1:1 relationship with the root. For all intents and purposes, this version of the graph is like a flat table with all elements sharing the same parent. This is the simplest graph I could construct that will eventually represent a subgraph in a more complex system.

Some file sizes for the db store are...

43M   neostore.nodestore.db
424M  neostore.propertystore.db
193M  neostore.propertystore.db.arrays
1.1K  neostore.propertystore.db.index
1.1K  neostore.propertystore.db.index.keys
238M  neostore.propertystore.db.strings
156M  neostore.relationshipstore.db
10    neostore.relationshiptypestore.db
129   neostore.relationshiptypestore.db.names

Andrew

On 07/06/2011 12:03 PM, Michael Hunger wrote:

Ok, then it is the connectedness check, which really does traverse all the relationships between the current and the target node.

Could you share the whole messages.log file from that graph store? Which JVM are you running?

If you can't share the db, could you please describe the structure of the graph: which category of nodes has what number of (types of) relationships to which others? Also, is it your node 0 that has the many rels, or the node with the id 1?

Cheers
Michael

On 06.07.2011 at 18:48, Andrew White wrote:

When using `cd -a` it is indeed very fast. As to the logs, those were from messages.log.

Sharing the graph-db would be tough, considering I am generating this graph from several GB of data and my local upload is very limited.

Any hints on the memory map issue are welcomed too.

Thanks for all of your help so far. I am going to try/reply to the other recommendations in other e-mails soonish.

Andrew

On 07/06/2011 11:32 AM, Michael Hunger wrote:

Andrew, can you by chance share your graph-db or perhaps your generator script? Then we could evaluate that and see where the performance hit occurs.
[Neo4j] Setting up a Cluster and querying
Hi there,

I am quite a newbie with neo4j and I hope somebody can help me.

I want to set up a cluster with 6 servers and a few coordinators (can a server be a coordinator at the same time?). In theory, setting up this cluster is more or less clear to me.

The big question for me is: how do I query this cluster? I don't want to communicate with a single server all the time, but with whichever server has the lowest load at that moment. I hope you know what I mean.

Regards,
Christian
Re: [Neo4j] Performance issue on nodes with lots of relationships
2011/7/6 Jim Webber j...@neotechnology.com:

Hi Rick,

Are you thinking maybe of lazily loading relationships in 1.5? That might be a huge boost.

Added to the backlog to be discussed for inclusion in 1.5.

Neo4j _is_ lazily loading relationships, and has done so since before 1.0. Maybe there's some issue with the shell only.

Jim

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
Re: [Neo4j] Performance issue on nodes with lots of relationships
Just noticed that the shell's `ls` command reads all relationships before displaying any of them... I'll fix this tomorrow.

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
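
For readers following the thread, the difference between reading everything before printing and streaming results as they are found can be sketched in plain Java. This is not the neo4j-shell code: `relationshipIds`, `listBuffered`, `listStreaming` and `readBeforeFirstOutput` are hypothetical names, and the lazy relationship source is simulated with a simple counter.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

final class LsSketch {
    /** Counts how many items the lazy source has handed out so far. */
    static int produced = 0;

    /** Hypothetical lazy source standing in for a node's relationships. */
    static Iterator<Integer> relationshipIds(final int total) {
        return new Iterator<Integer>() {
            private int next = 0;
            public boolean hasNext() { return next < total; }
            public Integer next() { produced++; return next++; }
        };
    }

    /** Buffered listing: nothing is emitted until the whole source is read. */
    static void listBuffered(Iterator<Integer> source, Consumer<Integer> out) {
        List<Integer> all = new ArrayList<>();
        while (source.hasNext()) all.add(source.next());
        for (Integer id : all) out.accept(id);
    }

    /** Streaming listing: each item is emitted as soon as it is read. */
    static void listStreaming(Iterator<Integer> source, Consumer<Integer> out) {
        while (source.hasNext()) out.accept(source.next());
    }

    /** Returns how many items had been read when the first one was emitted. */
    static int readBeforeFirstOutput(boolean streaming, int total) {
        produced = 0;
        final int[] seenAtFirst = {-1};
        Consumer<Integer> out = id -> {
            if (seenAtFirst[0] < 0) seenAtFirst[0] = produced;
        };
        if (streaming) listStreaming(relationshipIds(total), out);
        else listBuffered(relationshipIds(total), out);
        return seenAtFirst[0];
    }

    public static void main(String[] args) {
        System.out.println("buffered:  " + readBeforeFirstOutput(false, 1000));
        System.out.println("streaming: " + readBeforeFirstOutput(true, 1000));
    }
}
```

With a buffered listing, the user sees nothing until every item has been read, which matches the "`ls` never outputs anything" symptom reported earlier in the thread; a streaming listing produces its first line after a single read.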
Re: [Neo4j] Unable to upgrade neostore
I did not. If this is what is required then you have answered my question. Thanks.

-Paul

-Original Message-
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Adriano Henrique de Almeida
Sent: Tuesday, July 05, 2011 10:59 PM
To: Neo4j user discussions
Subject: Re: [Neo4j] Unable to upgrade neostore

Paul,

Did you try upgrading to 1.2, then to 1.3, and then to 1.4, rather than going straight from 1.1 to 1.4?

Regards

2011/7/5 Paul A. Jackson paul.jack...@pb.com:

I have a neo4j 1.1 graph that I tried opening with 1.4M5. I had a configuration that contained allow_store_upgrade=true:

[15] = {java.util.HashMap$Entry@12374} allow_store_upgrade - true
  key: java.lang.String = {java.lang.String@12376} allow_store_upgrade
  value: java.lang.String = {java.lang.String@12380} true

And I get this exception:

jvm 1| Caused by: org.neo4j.graphdb.TransactionFailureException: Could not create data source [nioneodb], see nested exception for cause of error
jvm 1|   at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:153)
jvm 1|   at org.neo4j.kernel.GraphDbInstance.start(GraphDbInstance.java:111)
jvm 1|   at org.neo4j.kernel.EmbeddedGraphDbImpl.init(EmbeddedGraphDbImpl.java:189)
jvm 1|   at org.neo4j.kernel.EmbeddedGraphDatabase.init(EmbeddedGraphDatabase.java:79)
jvm 1|   at com.g1.dcg.graph.neo4j.NeoGraph.init(NeoGraph.java:118)
jvm 1|   ... 12 more
jvm 1| Caused by: org.neo4j.kernel.impl.nioneo.store.IllegalStoreVersionException: Store version [NeoStore v0.9.5]. Please make sure you are not running old Neo4j kernel on a store that has been created by newer version of Neo4j.
jvm 1|   at org.neo4j.kernel.impl.nioneo.store.NeoStore.versionFound(NeoStore.java:431)
jvm 1|   at org.neo4j.kernel.impl.nioneo.store.AbstractStore.loadStorage(AbstractStore.java:147)
jvm 1|   at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.init(CommonAbstractStore.java:170)
jvm 1|   at org.neo4j.kernel.impl.nioneo.store.AbstractStore.init(AbstractStore.java:120)
jvm 1|   at org.neo4j.kernel.impl.nioneo.store.NeoStore.init(NeoStore.java:65)
jvm 1|   at org.neo4j.kernel.impl.nioneo.xa.NeoStoreXaDataSource.init(NeoStoreXaDataSource.java:132)
jvm 1|   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
jvm 1|   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
jvm 1|   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
jvm 1|   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
jvm 1|   at org.neo4j.kernel.impl.transaction.XaDataSourceManager.create(XaDataSourceManager.java:75)
jvm 1|   at org.neo4j.kernel.impl.transaction.TxModule.registerDataSource(TxModule.java:147)
jvm 1|   ... 16 more

My main question is whether this is supported or whether I am doing something wrong. I don't really need to support the upgrade of version 1.1 databases, but I want to make sure my code is correct so that I will be able to support upgrades in the future.

Thanks.
Paul Jackson

--
Adriano Almeida
Caelum | Ensino e Inovação
www.caelum.com.br
[Neo4j] Neo4jPHP
Hey all,

I've been working on another PHP client for Neo4j. I think it's ready for some real-life testing, and I'm interested to see what you all think.

GitHub: https://github.com/jadell/Neo4jPHP
Download: https://github.com/jadell/Neo4jPHP/tarball/0.0.1-beta

Features:
- Developed against the Neo4j 1.4 milestone releases
- Simple, object-oriented API
- Almost complete REST API coverage
- Indexing of nodes and relationships, including exact match and query support
- Cypher queries (thanks to Jacob Hansson)
- Traversal support, including paged traversals
- Lazy-loading of node and relationship data

Hopefully coming soon:
- Client-side caching
- Batch operations

There are some usage examples included.

It's a beta release, so please be gentle (on me, that is; be as rough as you want with the code). If anyone finds any bugs or has feature requests, please use the GitHub issues page at https://github.com/jadell/Neo4jPHP/issues

Thanks and enjoy!

-- Josh Adell
Re: [Neo4j] Setting up a Cluster and querying
Hi Christian,

Please see http://docs.neo4j.org/chunked/1.4.M06/ha.html for info on Neo4j HA. You can run a coordinator and a Neo4j server on the same machines. That's a common setup.

As for how to query it, answering that requires some more explanation about how Neo4j can be run. Neo4j can be used in two deployment modes: embedded in a Java process, or as a stand-alone server. The server, however, internally runs an embedded instance. See http://docs.neo4j.org/chunked/1.4.M06/deployment-scenarios.html for more information on this.

In an HA environment, a stand-alone server would be accessed over HTTP via the REST API[1]. You can also write custom extensions[2] in order to deploy Java code on the server, so that you can build your own domain-specific query API.

If you're not using the stand-alone server, but instead using embedded Neo4j in e.g. a web application deployed on Tomcat, then the API you expose from your webapp is completely up to you. Internally it then uses an embedded Neo4j instance, where you have full access to the Java API.

In addition to these options, you can also use our new query language, Cypher[3]. You can try it out from the web administration interface of the stand-alone server.

When setting up a Neo4j HA cluster, you typically also configure a load balancer in front of the cluster. The load balancer can use any method it desires to distribute the requests to the machines in the cluster. The load balancer is, however, not included in the Neo4j distribution; it is something the user needs to provide. You could look into the Apache HTTP Server or HAProxy.

Hope that answers some of your questions.

David

[1] http://docs.neo4j.org/chunked/1.4.M06/rest-api.html
[2] http://docs.neo4j.org/chunked/1.4.M06/server-plugins.html, http://docs.neo4j.org/chunked/1.4.M06/server-unmanaged-extensions.html
[3] http://docs.neo4j.org/chunked/1.4.M06/cypher-query-lang.html

--
David Montag david.mon...@neotechnology.com
Neo Technology, www.neotechnology.com
Cell: 650.556.4411
Skype: ddmontag
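
To illustrate the "lowest load" idea from the question, here is a minimal client-side sketch of least-loaded selection, assuming load is approximated by the number of requests currently in flight. Everything in it is hypothetical (the `LeastLoadedPicker` class, the `neo1.example` style hostnames); it is not part of Neo4j, and a dedicated load balancer such as the ones named in the reply above remains the suggested route.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class LeastLoadedPicker {
    /** One backend: a base URL plus a count of requests currently in flight. */
    static final class Backend {
        final String baseUrl;
        final AtomicInteger inFlight = new AtomicInteger();
        Backend(String baseUrl) { this.baseUrl = baseUrl; }
    }

    private final List<Backend> backends = new ArrayList<>();

    LeastLoadedPicker(List<String> baseUrls) {
        if (baseUrls.isEmpty()) throw new IllegalArgumentException("no backends");
        for (String url : baseUrls) backends.add(new Backend(url));
    }

    /** Picks the backend with the fewest in-flight requests and counts one more on it. */
    synchronized Backend acquire() {
        Backend best = backends.get(0);
        for (Backend b : backends) {
            if (b.inFlight.get() < best.inFlight.get()) best = b;
        }
        best.inFlight.incrementAndGet();
        return best;
    }

    /** Call when the request has completed, successfully or not. */
    void release(Backend b) {
        b.inFlight.decrementAndGet();
    }

    public static void main(String[] args) {
        // Hypothetical hostnames; 7474 is the stand-alone server's default HTTP port.
        LeastLoadedPicker picker = new LeastLoadedPicker(Arrays.asList(
            "http://neo1.example:7474",
            "http://neo2.example:7474",
            "http://neo3.example:7474"));
        Backend first = picker.acquire();
        Backend second = picker.acquire();
        picker.release(first);
        Backend third = picker.acquire();
        System.out.println(first.baseUrl + " " + second.baseUrl + " " + third.baseUrl);
    }
}
```

In-flight request count is only a rough proxy for load, and it is tracked per client here, so several clients would each see a partial picture; that is one reason a shared load balancer in front of the cluster is usually preferable.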
Re: [Neo4j] Indexed relationships
Pushed SortedTree to Git after adding a unit test and doing some debugging.

TODO:
- Add API for indexed relationships using SortedTree as the implementation.
- Make SortedTree thread safe.

With regard to the latter issue, I am considering the following solution: acquire a lock (by deleting a non-existent property) on the node that points to the root of the tree at the start of AddNode, RemoveNode and Delete.

No other node in the SortedTree is really stable. Even the root node may be moved down, turning another node into the new root node, and after a couple of remove actions the original root node may even be deleted. Locking the node that points to the root node prevents all other threads/transactions from updating the tree.

This may seem restrictive, but a single new entry or a single removal may in fact have an impact on much of the tree, due to balancing. More selective locking would require a pre-balancing tree walk to determine the affected subtrees, lock them, and, once every affected subtree is locked, perform the actual balancing.

Please let me know if there are any objections to locking the node pointing to the tree as a solution to make SortedTree thread safe.

Niels

Date: Tue, 5 Jul 2011 08:27:57 +0200
From: neubauer.pe...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Indexed relationships

Great work Nils! /peter

On Jul 4, 2011 11:39 PM, Niels Hoogeveen pd_aficion...@hotmail.com wrote:

Made some more changes to the SortedTree implementation. Previously SortedTree would throw an exception if a duplicate entry was being added. I changed SortedTree to allow a key to point to more than one node, unless the SortedTree is created as a unique index, in which case an exception is raised when an attempt is made to add a node to an existing key entry.

A SortedTree, once defined as unique, cannot be changed to a non-unique index or vice versa.

SortedTrees now have a name, which is stored in a property of the TREE_ROOT relationship and in the KEY_VALUE relationship (a new relationship that points from the SortedTree to the Node inserted in the SortedTree). The name of a SortedTree cannot be changed.

SortedTrees now store the class of the Comparator, so a SortedTree, once created, cannot be used with a different Comparator.

SortedTree is now an Iterable, making it possible to use it in a foreach loop.

Since there are as yet no unit tests for SortedTree, I will create those first before pushing my changes to Git. Preliminary results so far are good. I integrated the changes in my own application and it seems to work fine.

Todo:
- Decide on an API for indexed relationships. (Community input still welcome.)
- Write unit tests.
- Make SortedTree thread safe. (Community help still welcome.)

Niels

From: pd_aficion...@hotmail.com
To: user@lists.neo4j.org
Date: Mon, 4 Jul 2011 15:49:45 +0200
Subject: Re: [Neo4j] Indexed relationships

I forgot to add another recurrent issue that can be solved with indexed relationships: guaranteed uniqueness constraints.

From: pd_aficion...@hotmail.com
To: user@lists.neo4j.org
Date: Mon, 4 Jul 2011 01:55:08 +0200
Subject: [Neo4j] Indexed relationships

In the thread "[Neo4j] traversing densely populated nodes" we discussed the problems arising when large numbers of relationships are added to the same node.

Over the weekend, I have worked on a solution for the dense-relationship nodes using SortedTree in the neo-graph-collections component. After some minor tweaks to the implementation of SortedTree, I have managed to get a workable solution, where two nodes are not directly linked by a relationship, but by means of a BTree (entirely stored in the graph).

Before continuing this work, I'd like to have a discussion about features, since what we have now is not just a solution for the densely populated node issue, but is actually a full-fledged indexed relationship, which makes it suitable for other purposes too.

An indexed relationship can, for example, be used to maintain a sorted set of relationships in the graph that is not necessarily huge, but large enough to make sorting in memory too expensive an operation, or in situations where only one out of a large number of relationships is actually traversed in most cases.

There are probably more use cases for in-graph indexed relationships, so I'd like to know what features are desirable and what API Neo4j users would appreciate.

P.S. I still think it would be good to consider, if technically possible, partitioning the relationship store per relationship type and per direction. The indexed relationship solution works, but is of course slower than a direct relationship, both with respect to insert time and traversal time. If dense relationships are never traversed going out of the dense node, the extra structure maintained by the BTree is only an extra burden.

P.P.S. If there are people
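
The coarse-grained locking proposed at the top of this message can be sketched outside the graph. The following is only an analogy under stated assumptions: a `ReentrantLock` stands in for the lock on the node pointing to the tree root, a `java.util.TreeMap` stands in for the in-graph SortedTree, and `CoarseLockedSortedIndex` with all of its methods is a hypothetical name, not the neo-graph-collections API. It also models the unique versus non-unique behaviour described in the quoted July 4 message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantLock;

final class CoarseLockedSortedIndex<K extends Comparable<K>, V> {
    // Stands in for the lock on the node that points to the tree's root node.
    private final ReentrantLock rootPointerLock = new ReentrantLock();
    // Stands in for the in-graph SortedTree; TreeMap is not thread safe on its own.
    private final TreeMap<K, List<V>> tree = new TreeMap<>();
    private final boolean unique;

    CoarseLockedSortedIndex(boolean unique) {
        this.unique = unique;
    }

    /** Adds a value under a key; a unique index rejects a second value per key. */
    void add(K key, V value) {
        rootPointerLock.lock();
        try {
            List<V> values = tree.get(key);
            if (values == null) {
                values = new ArrayList<>();
                tree.put(key, values);
            } else if (unique) {
                throw new IllegalStateException("duplicate key: " + key);
            }
            values.add(value);
        } finally {
            rootPointerLock.unlock();
        }
    }

    /** Removes one value; drops the key entirely once it has no values left. */
    boolean remove(K key, V value) {
        rootPointerLock.lock();
        try {
            List<V> values = tree.get(key);
            if (values == null || !values.remove(value)) {
                return false;
            }
            if (values.isEmpty()) {
                tree.remove(key);
            }
            return true;
        } finally {
            rootPointerLock.unlock();
        }
    }

    /** Total number of stored values across all keys. */
    int size() {
        rootPointerLock.lock();
        try {
            int n = 0;
            for (List<V> values : tree.values()) n += values.size();
            return n;
        } finally {
            rootPointerLock.unlock();
        }
    }

    /** Demo helper: several threads insert concurrently; returns the final size. */
    static int concurrentAddDemo(int threads, int perThread) {
        final CoarseLockedSortedIndex<Integer, String> index =
            new CoarseLockedSortedIndex<>(false);
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) index.add(i, "t" + id);
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return index.size();
    }

    public static void main(String[] args) {
        System.out.println(concurrentAddDemo(4, 1000));
    }
}
```

One difference worth keeping in mind: the sketch releases its lock when each method returns, whereas a lock taken on a node inside a transaction would presumably be held until that transaction finishes, so writers to the same tree would serialize for longer than this analogy suggests.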
Re: [Neo4j] Performance issue on nodes with lots of relationships
I am on a standard filesystem (ext4). I haven't seen the issue again today, so I wonder if it was a fluke.

Andrew

On 07/06/2011 12:29 PM, Paul Bandler wrote:

I experienced that on Solaris when I'd placed the db on a filesystem that didn't support memory-mapped I/O, such as NFS.

On 6 Jul 2011, at 17:48, Andrew White li...@andrewewhite.net wrote:

Any hints on the memory map issue are welcomed too.
[Neo4j] t= new Table() in Gremlin
Greetings!

I am using stable 1.3, and when I issue t = new Table() in the gremlin shell I get:

gremlin> t = new Table();
==> startup failed:
==> groovysh_evaluate: 26: unable to resolve class Table
==> @ line 26, column 5.
==> t = new Table();

What am I doing wrong?

Thanks!
Re: [Neo4j] t= new Table() in Gremlin
Boris,

I think 1.3 uses a much older version of Gremlin, one that doesn't have the Table result type yet.

Please pull 1.4.M06 and try it there to see if it works in the current version.

Cheers
Michael
Re: [Neo4j] Performance issue on nodes with lots of relationships
I just tested with 1.4.M06 and performance seems about the same. Also, only the supernodes are affected; the child nodes are very fast.

On 07/06/2011 09:31 AM, Michael Hunger wrote:

Andrew,

could you please also try to access the graph via the latest milestone, 1.4.M06, to see if things have improved?

Does this behaviour affect only the supernodes, or every node in your graph (e.g. when you access, `cd` or `ls` a person node)?

We've been discussing some changes to the initial loading/caching that might improve performance on heavily connected (super-)nodes. If our changes and tests are successful, these changes will be integrated in the early 1.5 milestones.

Cheers
Michael
Re: [Neo4j] Performance issue on nodes with lots of relationships
Here are some interesting stats to consider. First, I split my nodes into two groups, one node with 1.4M children and the other with 3.4M children. While I do see some cache warm-up improvements, the traversal doesn't seem to scale linearly; i.e. the larger supernode has 2.4x more children but takes 17x longer to traverse.

neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 1468486  |
+----------+
1 rows, 25724 ms

neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 1468486  |
+----------+
1 rows, 19763 ms

neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 3472174  |
+----------+
1 rows, 565448 ms

neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 3472174  |
+----------+
1 rows, 337975 ms

Any ideas on this?

Andrew

On 07/06/2011 09:55 AM, Peter Neubauer wrote:

Andrew,

if you upgrade to 1.4.M06, your shell should be able to run Cypher to count the relationships of a node without returning them, something along these lines:

start n=(1) match (n)-[r]-(x) return count(r)

Try that several times to see whether cold caches are slowing things down initially.

In `ls` and Neoclipse, the output and visualization will be slow for that amount of data.

Cheers,
/peter neubauer
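
The "2.4x more children, 17x longer" comparison at the top of this message can be checked directly from the counts and timings it reports. The sketch below only does that arithmetic; the class and method names are hypothetical, and the second (warm) run of each query is used as the basis for the 17x figure.

```java
final class SupernodeTimings {
    /** Plain ratio of two measurements. */
    static double ratio(double larger, double smaller) {
        return larger / smaller;
    }

    /** Average microseconds spent per relationship counted. */
    static double microsPerRelationship(long millis, long relationships) {
        return millis * 1000.0 / relationships;
    }

    public static void main(String[] args) {
        // Figures copied from the shell output in the message.
        long relsSmall = 1468486L, relsLarge = 3472174L;
        long coldSmallMs = 25724L, warmSmallMs = 19763L;
        long coldLargeMs = 565448L, warmLargeMs = 337975L;

        System.out.printf("size ratio:        %.2f%n", ratio(relsLarge, relsSmall));
        System.out.printf("warm time ratio:   %.2f%n", ratio(warmLargeMs, warmSmallMs));
        System.out.printf("cold time ratio:   %.2f%n", ratio(coldLargeMs, coldSmallMs));
        System.out.printf("us/rel, node 1:    %.2f%n",
            microsPerRelationship(warmSmallMs, relsSmall));
        System.out.printf("us/rel, node 2:    %.2f%n",
            microsPerRelationship(warmLargeMs, relsLarge));
    }
}
```

Worked by hand, the size ratio is about 2.36 and the warm time ratio about 17.1, so the average cost per relationship itself rises roughly sevenfold (from about 13 to about 97 microseconds). That confirms the non-linear scaling reported; it does not by itself say what causes it.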
Re: [Neo4j] Performance issue on nodes with lots of relationships
Hi Andrew,

How big is your configured Java heap? It could be that all the nodes and relationships don't fit into the cache.

David

--
David Montag david.mon...@neotechnology.com
Neo Technology, www.neotechnology.com
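
For anyone wanting to answer the heap-size question from inside the JVM that embeds the database, a minimal sketch using only the standard `Runtime` API follows. The `HeapInfo` class name is hypothetical; the maximum reported corresponds to the JVM's `-Xmx` setting (or its default when that flag is not given).

```java
final class HeapInfo {
    private static final long MIB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    /** Heap currently in use (allocated minus free), in MiB. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB):  " + maxHeapMiB());
        System.out.println("used heap (MiB): " + usedHeapMiB());
    }
}
```

Running this (or logging the same two values) in the process that opens the graph shows how much room the object cache can have at most, which can then be compared against the store file sizes reported earlier in the thread.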
Re: [Neo4j] t= new Table() in Gremlin
Hi,

To further clarify, Table is provided in Gremlin 1.1+. To check the version of Gremlin, do GremlinTokens.VERSION.

Good luck,
Marko.

http://markorodriguez.com