Re: [Neo4j] Problem compiling neo4j-rdf-sail
Dennis, I upgraded the parent pom to

  <parent>
    <groupId>org.neo4j</groupId>
    <artifactId>parent-pom</artifactId>
    <version>14</version>
  </parent>

which is deployed at http://m2.neo4j.org/org/neo4j/parent-pom/14/ , so this should fix your problem. You will notice that the RDF components are running against Neo4j 1.1. If you like, feel free to upgrade things, and let us know how it works out for you!

Cheers,
/peter neubauer
GTalk: neubauer.peter  Skype: peter.neubauer  Phone: +46 704 106975
LinkedIn: http://www.linkedin.com/in/neubauer  Twitter: http://twitter.com/peterneubauer
http://www.neo4j.org - Your high performance graph database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Wed, Dec 1, 2010 at 2:21 AM, Schmidt, Dennis dennis.schm...@sap.com wrote:

Hi all,

When I try to compile the latest version of neo4j-rdf-sail, to use Neo4j together with Sesame, I get an error message from Maven:

[INFO] Scanning for projects...
[ERROR] The build could not read 1 project - [Help 1]
[ERROR]
[ERROR] The project org.neo4j:neo4j-rdf-sail:0.6-SNAPSHOT (C:\Downloads\neo4j-rdf-sail\pom.xml) has 1 error
[ERROR] Non-resolvable parent POM: Failure to find org.neo4j:parent-pom:pom:7-SNAPSHOT in http://repo.aduna-software.org/maven2/releases was cached in the local repository, resolution will not be reattempted until the update interval of aduna-repo has elapsed or updates are forced and 'parent.relativePath' points at wrong local POM @ line 4, column 11 - [Help 2]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
[ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException

I get a similar error message with the 0.5 tag as well.
As I understand it from the error message and the referenced Maven help documents, I'm obviously missing some external pom file that's not included in the SVN for neo4j-rdf-sail itself. But how do I fix that? I hope somebody can give me some clarification; I have to add that I am not really familiar with Maven. What confused me a bit was that removing the <parent>...</parent> part of the pom made Maven compile something. However, the final tests failed, and also no final *.jar file was created (which I'd expect as the outcome of the whole build process, right?)

Cheers,
Dennis

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Get All Nodes in Rest API
Max, I think in the long run people are going to switch between servers, especially if there is some kind of sharding scheme involved.

Regarding the best way to express the deletion of nodes in REST, what is everyone's opinion? I personally like deletion via the self string. What are the implications on logging you are mentioning?

Cheers,
/peter neubauer

On Wed, Dec 1, 2010 at 10:23 AM, Max De Marzi Jr. maxdema...@gmail.com wrote:

I started working on phase 2 for neography, but things got ugly when I got to node.delete. The node has to know where to delete itself from. OK, we can get that from the self string, but what about the logging options? If we want those, we have to keep the Neography::Rest object on every node and relationship. That's kind of ugly. How likely is it for people to be switching back and forth between Neo4j servers? What's a good use case? Because if we can get away with just setting the server options once in Neography::Config and forgetting about it, that would be much easier.

On Tue, Nov 30, 2010 at 1:16 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote:

Good point, add an issue in trac under the rest component and we will add it!
/peter
- From my cellphone, please excuse typos and brevity...

On Nov 30, 2010 8:05 PM, Max De Marzi Jr. maxdema...@gmail.com wrote:

Peter, is there a command to get all nodes from the Rest API? If not, will one be added? If not to that, what's the right way to handle it? On every new node creation, link to it from the root node with some relationship type?

Thanks,
Max
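Deletion via the self string means a client only needs to parse the URL the server handed back, rather than carrying a separate server reference on every node. A minimal sketch of that idea; the URL format shown is illustrative, and `delete_url_parts` is a hypothetical helper, not part of neography:

```python
from urllib.parse import urlparse

def delete_url_parts(self_url):
    """Split a node's "self" URL into the server base and the resource
    path, so a client knows which server to issue DELETE against."""
    parts = urlparse(self_url)
    base = "%s://%s" % (parts.scheme, parts.netloc)
    return base, parts.path

# Example with an illustrative self URL:
base, path = delete_url_parts("http://localhost:7474/db/data/node/42")
print(base)  # http://localhost:7474
print(path)  # /db/data/node/42
```

With this, a client could keep a single configured connection per server base and route each node's DELETE through it.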
Re: [Neo4j] strange performance
My guess would be that the long pause times are because you start traversing parts of the graph that are not cached at all. The memory mapped buffers are assigned 1800M, but the db is much larger than that. The solution would be to install more memory or switch to an SSD. You could also try a lower heap size and switch to cache_type=weak, freeing up some memory for the memory mapped buffers.

Regards,
Johan

On Tue, Nov 30, 2010 at 11:43 AM, Martin Grimmer martin.grim...@unister.de wrote:

Hi, here is the extra output:

Physical mem: 3961MB
Heap size: 1809MB
store_dir=/neo4j_database/
rebuild_idgenerators_fast=true
neostore.propertystore.db.index.keys.mapped_memory=1M
logical_log=/neo4j_database//nioneo_logical.log
neostore.propertystore.db.strings.mapped_memory=69M
neostore.propertystore.db.arrays.mapped_memory=1M
neo_store=/neo4j_database//neostore
neostore.relationshipstore.db.mapped_memory=1372M
neostore.propertystore.db.index.mapped_memory=1M
create=true
neostore.propertystore.db.mapped_memory=275M
dump_configuration=true
neostore.nodestore.db.mapped_memory=91M
dir=/neo4j_database//lucene-fulltext

On 23.11.2010 11:07, Johan Svensson wrote:

Hi, could you add the following configuration parameter: dump_configuration=true and send the output printed to standard out when starting up? Other useful information would be some thread dumps while executing a query that takes a long time (send a kill -3 signal to the process).

Regards,
Johan

On Tue, Nov 23, 2010 at 10:53 AM, Martin Grimmer martin.grim...@unister.de wrote:

Hello, while running my benchmark I did the following:
- added -verbose:gc to see GC usage
- ran sar -u 1 100

Here are the results on a 4 core CPU:

CPU   %user  %nice  %system  %iowait  %steal  %idle
all   0,75   0,00   0,62     18,81    0,00    79,82

Scaled to one core: the CPU does practically nothing to compute the queries. Only 0.75% * 4 = 3% CPU usage for a single core, and 18,81% * 4 = 75,24% I/O wait for a single core; the rest is divided into system and idle.

The GC output:
...
[GC 806876K->326533K(1852416K), 0.2419270 secs]
... many queries ...
[GC 873349K->423494K(1837760K), 0.3257520 secs]
... many queries ...
[GC 956678K->502630K(1643648K), 0.3619280 secs]
... many queries ...
[GC 839654K->551462K(1686720K), 0.3088770 secs]
...

So it's not the GC.

Thanks to all of you,
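Johan's suggestion (shrink the heap, switch to the weak cache, and give the freed memory to the memory-mapped buffers) could look roughly like this in the store configuration; the numbers below are illustrative only, not a tuned recommendation for this machine:

```properties
# Neo4j store configuration (illustrative values)
cache_type=weak
neostore.relationshipstore.db.mapped_memory=1500M
neostore.propertystore.db.mapped_memory=300M

# JVM side: lower the heap (e.g. -Xmx1024m instead of ~1800M) so the
# OS page cache and mapped buffers get more of the 3961MB physical RAM
```

The trade-off is that a weak-reference cache lets the JVM reclaim cached nodes and relationships under memory pressure, which helps when the working set does not fit in the heap anyway.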
Re: [Neo4j] Status of visualization options
Just wanted to thank everyone who answered my question. I didn't find an exact answer, but I am able to narrow down the options to look at. Fakod's latest post about Neo4j with GWT+JIT seems interesting. I am not running a REST server, but I can work from the ideas. I'm also a Scala developer, and GWT doesn't work with Scala (out of the box).

Cheers,
Ivan

On Sun, 21 Nov 2010 17:57:22 +0100, Ivan Brusic i...@brusic.com wrote:

Hello all, I have successfully imported a large part of my data set into neo4j and have done some basic traversals. I want to start visualizing a portion of the graph, or of a traversal, and was overwhelmed by the number of options listed at http://wiki.neo4j.org/content/Visualization_options_for_graphs . It is unclear to me which products have actually been successfully integrated with neo4j, or which ones still have ongoing development. My preference is for simplicity over flexibility, and to work at the code level via a library for the generation portion. The actual viewer can be a separate standalone application. The visualizations themselves will be very simple. I am not an Eclipse user, so I have not looked into Neoclipse. I have worked with Graphviz files before, so generating a basic .dot file seems ideal. There was some discussion about JUNG lately, so perhaps that is a better alternative. Unfortunately, I do not have the time to explore all of the options.

Cheers,
Ivan

--
Message: 3
Date: Sun, 21 Nov 2010 15:10:53 +0100
From: Mattias Persson matt...@neotechnology.com
Subject: Re: [Neo4j] Status of visualization options
To: Neo4j user discussions user@lists.neo4j.org

Neoclipse is a standalone application (although it's based on the Eclipse platform), so you don't have to know anything about Eclipse in order to run and use it. I also think the JUNG support is through Gremlin only.

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com

--
Message: 4
Date: Sun, 21 Nov 2010 09:57:12 -0700
From: venkat takumatla vxmar...@ualr.edu
Subject: Re: [Neo4j] Status of visualization options
To: Neo4j user discussions user@lists.neo4j.org

I am developing a testing tool for my work, and I have implemented a basic visualization library which can be embedded into any Java application or can work as a standalone application. It can load a network from gdf files or from neo4j database files. I am attaching a screenshot of the tool; your feedback can help in improving the library. The current layout used is a force-based layout algorithm.

Venkat
[Neo4j] Image of Neo4j system architecture
Hi, I am searching for an image of the Neo4j system architecture for my university studies. Is there any? I searched the neo4j wiki and the web for that. It would be nice to have an image like this.

Thanks for your help,
Tobias Gröger
Re: [Neo4j] Get All Nodes in Rest API
On Wed, Dec 1, 2010 at 03:29, Peter Neubauer peter.neuba...@neotechnology.com wrote:

Is there a command to get all nodes from the Rest API?

+1. I would also like to know this :)

--
Javier de la Rosa
http://versae.es
[Neo4j] bin/neo4j script proposal
I am just testing the current milestone 1.2.M04 for Unix, and I have an issue with the definition of $DIST_ARCH in lines 233 and 240. Going by http://en.wikipedia.org/wiki/Uname , and as far as I understood the code of uname, the `uname -m` switch tends to produce less confusing output than -p does. On my machine, built with coreutils-8.7 from http://ftp.gnu.org/gnu/coreutils/, `uname -p` produces the output Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz.

I think there are two options:
1. add some logic that will translate Intel Core to something like i686, or
2. switch the statements in bin/neo4j:233 and bin/neo4j:240, since `-m` seems to have a higher chance of returning the right thing.

Regards,
Mao PU
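The proposed -m-before-p ordering can be illustrated cross-platform with Python's standard platform module, whose machine() and processor() functions report roughly what uname -m and uname -p do. This is only a sketch of the fallback logic, not the actual bin/neo4j script:

```python
import platform

def detect_arch():
    """Prefer the machine hardware name (like uname -m); fall back to
    the processor name (like uname -p) only when -m yields nothing,
    mirroring the proposed order in bin/neo4j."""
    # lowercase and strip spaces, as the script's tr pipeline does
    arch = platform.machine().lower().replace(" ", "")
    if not arch:
        arch = platform.processor().lower().replace(" ", "")
    return arch or "unknown"

print(detect_arch())  # e.g. i686 or x86_64, depending on the machine
```

On boxes like the one above, machine() gives a canonical token such as i686, while processor() may return a marketing string or be empty, which is exactly why -m makes the better first choice.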
Re: [Neo4j] Get All Nodes in Rest API
Hi there, Martin Chlupac already added this issue under https://trac.neo4j.org/ticket/249, so feel free to dig in and implement it. Otherwise, I hope we get to it in the next iteration. There are some implications here, since this method may potentially return a lot of data. But maybe that is not primarily a concern of the server, but rather the client's responsibility to be careful?

Cheers,
/peter neubauer

On Wed, Dec 1, 2010 at 5:45 PM, Javier de la Rosa ver...@gmail.com wrote:

On Wed, Dec 1, 2010 at 03:29, Peter Neubauer peter.neuba...@neotechnology.com wrote: Is there a command to get all nodes from the Rest API?

+1. I would also like to know this :)

--
Javier de la Rosa
http://versae.es
Re: [Neo4j] Image of Neo4j system architecture
Tobias, good question. Right now there is a lot of documentation work going on, see https://svn.neo4j.org/documentation/manual/trunk/ ; this would fit nicely into the overview or introduction of it. We will see if we can get something in for the initial release with Milestone 1.2.M05! What kind of granularity would you like to see?

Cheers,
/peter neubauer

2010/12/1 Tobias Gröger tobias_groe...@yahoo.de:

Hi, I am searching for an image of the Neo4j system architecture for my university studies. Is there any? I searched the neo4j wiki and the web for that. It would be nice to have an image like this.

Thanks for your help,
Tobias Gröger
Re: [Neo4j] [SPAM] Cats and Dogs, living together
These topics are fairly integral to what we do in ThingWorx, so I can share some feedback:

In our world, what goes where isn't always determined by us: there are zillions of legacy data stores that might need to be integrated. They could be relational, proprietary, or accessed via some type of API or service invocation. Thus, our application needs to be elastic enough to leverage those sources as well. In our case, we've chosen to create an abstraction layer for datasets, services, and events that allows us to manage this complexity and to create heterogeneous views/services/applications from that complexity.

In terms of data we can control, we've chosen (for now) to put it all into Neo. This includes our modeling/metamodel (data, scripts/logic, visualizations, services, domain data types, etc.) as well as the data we collect (which comes in two main forms: activity streams and tables). We've implemented an in-memory data transformation engine that allows us to do SQL-like things (filter, sort, aggregate, join, etc.) on data from any of the aforementioned sources, as well as on data from our own domain objects (which use the same dataset abstraction that we apply to external data).

In terms of transactions, at this point we have not yet gone as far as implementing hybrid transactions that wrap both external (JDBC) transactions and Neo transactions. However, we have abstracted the way things get invoked such that it would be easy to place a single transaction wrapper around anything that might potentially manipulate data (in fact, it's implemented today, but only for Neo transactions).

I'm not sure what you mean in terms of message queues for data distribution, but we use queues in two main ways within ThingWorx. First, we use writer queues to manage the writing of stream entries and data table entries into Neo, since these tend to be very high frequency/high volume writes and we didn't want to have to create a separate transaction for each of them. We use a set of workers that flush writes after X seconds have elapsed or when Y records are waiting to be written. There are persistence helpers that know how to persist the various types of domain objects that get queued up. The other place we use queues is for the distribution of events. These could come from internal or external sources, as a result of data mutation, user interaction, service invocation, a timer, etc. We use queues as a means of regulating the flow/loading and to manage distribution/subscriptions.

In terms of data migration strategies, that's an area where we're currently doing some exploration. We already have some basic stuff to take structures from RDBMS tables and turn them into their equivalent structures (and metamodel structures) in our platform, and therefore in Neo, but we haven't really done much with it yet, nor have we done much with things like indexes and constraints. Just simple data for now. What we are also exploring is using Neo to index data that might reside in external tables. The searchable view of the data would reside in Neo, and we would maintain a reference back to the original source (table/row/unique identifier in that row) for when we need to retrieve the original data. Sort of a spidering/crawling approach for now, though we would like it to also be event driven at some point.

In terms of features that could help in the context of the questions you've asked, a few things come to mind:

- The ability to enlist/contain a Neo transaction in other transactions (and vice versa, I suppose)
- Richer data typing beyond the primitives that Neo stores today (DateTime and Location being a few interesting and common ones). Ideally this could be extensible. Currently, we use the domain object's metadata to help with this, which works OK
- Special treatment for storing/retrieving large strings or blobs (perhaps even at the expense of performance on these activities, but indirectly improving performance on node/relationship/property reads/writes due to reduced memory consumption)
- Support for structured storage (e.g. a property that represents a structure rather than a primitive). Using stuff like serialization is too fragile and platform/language-specific, but perhaps with some type of minimalist metamodeling/schemas this could be accomplished fairly easily (or some type of generic persistence model that knew how to deal with JSON objects, XML documents, native Java objects, Maps/Sets, etc.). This is all stuff that we've had to write on our own
- Support for the idea of node types (similar to relationship types). Currently, we stamp each node with a String property that indicates its type. Strings are not the most efficient way to do it, as we all know.
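The writer-queue policy described above, flushing after X seconds have elapsed or once Y records are pending, can be sketched roughly as follows. All class, parameter, and threshold names here are made up for illustration; the actual ThingWorx implementation is not public:

```python
import time

class WriterQueue:
    """Batch incoming writes and flush them in one chunk, either when
    max_records are pending or max_age seconds have passed, so each
    record does not need its own transaction."""

    def __init__(self, flush_fn, max_records=100, max_age=5.0):
        self.flush_fn = flush_fn      # e.g. persists a batch inside one Neo transaction
        self.max_records = max_records
        self.max_age = max_age
        self.pending = []
        self.last_flush = time.monotonic()

    def add(self, record):
        self.pending.append(record)
        too_many = len(self.pending) >= self.max_records
        too_old = time.monotonic() - self.last_flush >= self.max_age
        if too_many or too_old:
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_fn(self.pending)  # hand the whole batch to the persister
            self.pending = []
        self.last_flush = time.monotonic()

# Usage: 7 records with a batch size of 3 yields batches of 3, 3, and 1.
batches = []
q = WriterQueue(batches.append, max_records=3, max_age=60.0)
for i in range(7):
    q.add(i)
q.flush()  # final flush drains the remainder
print([len(b) for b in batches])  # [3, 3, 1]
```

A real version would run flushes on worker threads and add a periodic timer so the age bound holds even when no new records arrive, but the count-or-age trigger is the core of the pattern.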
Re: [Neo4j] [SPAM] Cats and Dogs, living together
Thanks for the great feedback, Rick! I created enhancement requests for your suggestions at https://trac.neo4j.org/ticket/292 - 288 so we don't drop them!

Cheers,
/peter neubauer

On Wed, Dec 1, 2010 at 9:22 PM, rick.bullo...@burningskysoftware.com wrote:

These topics are fairly integral to what we do in ThingWorx, so I can share some feedback: ...
Re: [Neo4j] bin/neo4j script proposal
Mao, thanks for reporting this. On my OSX,

  uname -m 2>/dev/null | tr [A-Z] [a-z] | tr -d ' '

and

  uname -p 2>/dev/null | tr [A-Z] [a-z] | tr -d ' '

both produce i386. I guess since the main case seems to work, switching is safe. We will test tomorrow, but I am committing the update to accommodate your feedback, switching the lines:

Index: src/main/distribution/shell-scripts/bin/neo4j
===================================================================
--- src/main/distribution/shell-scripts/bin/neo4j (revision 7096)
+++ src/main/distribution/shell-scripts/bin/neo4j (working copy)
@@ -230,14 +230,14 @@
     APP_PLIST=${APP_PLIST_BASE}.plist
 else
     DIST_ARCH=
-    DIST_ARCH=`uname -p 2>/dev/null | tr [A-Z] [a-z] | tr -d ' '`
+    DIST_ARCH=`uname -m 2>/dev/null | tr [A-Z] [a-z] | tr -d ' '`
     if [ X$DIST_ARCH = X ]
     then
         DIST_ARCH=unknown
     fi
     if [ $DIST_ARCH = unknown ]
     then
-        DIST_ARCH=`uname -m 2>/dev/null | tr [A-Z] [a-z] | tr -d ' '`
+        DIST_ARCH=`uname -p 2>/dev/null | tr [A-Z] [a-z] | tr -d ' '`
     fi
     case $DIST_ARCH in
     'athlon' | 'i386' | 'i486' | 'i586' | 'i686')

Committed as of rev. 7412, thanks for sharing!

Cheers,
/peter neubauer

On Wed, Dec 1, 2010 at 7:41 PM, Mao PU mao.c...@googlemail.com wrote:

I am just testing the current milestone 1.2.M04 for Unix, and I have an issue with the definition of $DIST_ARCH in lines 233 and 240. Going by http://en.wikipedia.org/wiki/Uname , and as far as I understood the code of uname, the `uname -m` switch might result in less confusing messages than -p can. On my machine, built with coreutils-8.7 from http://ftp.gnu.org/gnu/coreutils/, `uname -p` produces the output Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz.
Re: [Neo4j] bin/neo4j script proposal
Curiously, on my Linux Acer Aspire laptop, uname -p produces 'unknown' and uname -m produces i686. It seems like for me the -m option is much better. And the command

  cat /proc/cpuinfo | grep name

produces:

  model name : AMD Turion(tm) X2 Ultra Dual-Core Mobile ZM-82
  model name : AMD Turion(tm) X2 Ultra Dual-Core Mobile ZM-82

(two cores)

On Wed, Dec 1, 2010 at 9:33 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote:

Mao, thanks for reporting this. On my OSX, uname -m 2>/dev/null | tr [A-Z] [a-z] | tr -d ' ' and uname -p 2>/dev/null | tr [A-Z] [a-z] | tr -d ' ' both produce i386. I guess since the main case seems to work, switching is safe. ...
Re: [Neo4j] [SPAM] Cats and Dogs, living together
While I don't do anything as fancy as Rick, I do one very specific thing which is a kind of special case of one of his suggestions. We use neo4j as a combined index and statistics tree for data stored in very large flat binary files. We originally used to port all this data into neo4j, but found a few problems with that:

- The total database size was usually many, many times larger than the original data (I mean 10-50 times larger)
- The write performance to the database is the bottleneck during import

Once we started working with extremely large datasets, potentially approaching the database capacity, we decided to improve both performance and scalability by storing only an index in the database, but one containing key statistical results: the ones most likely to be needed by any further analysis. This allows the tool to still perform the analyses required on the entire dataset, but with higher import speeds, higher analysis speeds, and lower database sizes. As in Rick's suggestion, we maintain references to the original files and record offsets, so that drill-down to the original data remains possible. While this is specific to our use case, the principle is probably reusable in other domains.

On Wed, Dec 1, 2010 at 8:59 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote:

Thanks for the great feedback, Rick! I created enhancement requests for your suggestions at https://trac.neo4j.org/ticket/292 - 288 so we don't drop them!

Cheers,
/peter neubauer
On Wed, Dec 1, 2010 at 9:22 PM, rick.bullo...@burningskysoftware.com wrote: These topics are fairly integral to what we do in ThingWorx, so I can share some feedback: In our world, what goes where isn't always determined by us - there are zillions of legacy data stores that might need to be integrated. They could be relational, proprietary, or accessed via some type of API or service invocation. Thus, our application needs to be elastic enough to leverage those sources as well. In our case, we've chosen to create an abstraction layer for datasets, services, and events that allows us to manage this complexity and to create heterogeneous views/services/applications from it. In terms of data we can control, we've chosen (for now) to put it all into Neo. This includes our modeling/metamodel (data, scripts/logic, visualizations, services, domain data types, etc.) as well as the data we collect (which comes in two main forms: activity streams and tables). We've implemented an in-memory data transformation engine that allows us to do SQL-like things (filter, sort, aggregate, join, etc.) on data from any of the aforementioned sources, as well as on data from our own domain objects (which use the same dataset abstraction that we apply to external data). In terms of transactions, at this point we have not yet gone as far as implementing hybrid transactions that wrap both external (JDBC) transactions and Neo transactions. However, we have abstracted the way things get invoked such that it would be easy to place a single transaction wrapper around anything that might potentially manipulate data (in fact, it's implemented today, but only for Neo transactions). I'm not sure what you mean in terms of message queues for data distribution, but we use queues in two main ways within ThingWorx. 
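A "single transaction wrapper around anything that might manipulate data" could be sketched like this (a hypothetical abstraction, not ThingWorx's actual code; the begin/commit/rollback comments mark where a Neo4j transaction, and later a JDBC one, would plug in):

```java
// Sketch: funnel every data-mutating call through one wrapper so the
// underlying transaction type(s) can change without touching callers.
class TxWrapper {
    interface Work<T> { T run() throws Exception; }

    static <T> T inTransaction(Work<T> work) {
        // begin(): start a Neo4j transaction (and later a JDBC one) here
        try {
            T result = work.run();
            // commit(): mark success on all enlisted resources
            return result;
        } catch (Exception e) {
            // rollback(): undo on all enlisted resources before rethrowing
            throw new RuntimeException(e);
        }
    }
}
```

The point of the design is that adding hybrid JDBC+Neo transactions later only changes the wrapper, not the call sites.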
First, we use writer queues to manage the writing of stream entries and data table entries into Neo, since these tend to be very high-frequency/high-volume writes and we didn't want to create a separate transaction for each of them. We use a set of workers that flush writes after X seconds have elapsed or when Y records are waiting to be written. There are persistence helpers that know how to persist the various types of domain objects that get queued up. The other place we use them is for distribution of events. These could come from internal or external sources, as a result of data mutation, user interaction, service invocation, a timer, etc. We use queues as a means of regulating the flow/loading and to manage distribution/subscriptions. In terms of data migration strategies, that's an area where we're currently doing some exploration. We already have some basic tooling to take structures from RDBMS tables and turn them into their equivalent structures (and metamodel structures) in our platform, and therefore in Neo, but we haven't really done much with it yet nor have we done much with
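A minimal sketch of such a writer queue, flushing when Y records are pending or X seconds have elapsed (names and thresholds are illustrative; a real version would hand the batch to a persistence helper inside a single Neo4j transaction and take the clock from the system):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: batch writes and flush on a size OR age threshold, so many
// entries share one transaction instead of one transaction each.
class WriterQueue<T> {
    private final int maxPending;    // Y records
    private final long maxAgeMillis; // X seconds, in ms
    private final List<T> pending = new ArrayList<>();
    private long oldestEnqueued = -1;
    int flushCount = 0;              // exposed here just for illustration

    WriterQueue(int maxPending, long maxAgeMillis) {
        this.maxPending = maxPending;
        this.maxAgeMillis = maxAgeMillis;
    }

    // `now` is passed in for testability; a worker thread would use
    // System.currentTimeMillis() and also flush on a periodic tick.
    synchronized void enqueue(T entry, long now) {
        if (pending.isEmpty()) oldestEnqueued = now;
        pending.add(entry);
        if (pending.size() >= maxPending || now - oldestEnqueued >= maxAgeMillis) {
            flush();
        }
    }

    private void flush() {
        // a persistence helper would write `pending` in one transaction here
        pending.clear();
        flushCount++;
    }
}
```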
Re: [Neo4j] Problem compiling neo4j-rdf-sail
Thanks Peter, updating that version part helped. However, to make it compile properly I need to have both repositories in the pom: the one you mentioned and the old one (http://repo.aduna-software.org/maven2/releases), because otherwise I always got an error regarding the download of some Sesame files. So now it IS compiling and finally running the tests. But I still can't find a *.jar file (only *.class files), and two of the RmiSailTest tests fail. The stack trace in both cases (org.neo4j.rdf.sail.rmi.RmiSailTest.[testBasic, testFulltextSearch]) is:
java.io.IOException: Cannot overwrite: C:\Downloads\neo4j-rdfsail\target\var\fulltext\_0.cfs
 at org.apache.lucene.store.FSDirectory.initOutput(FSDirectory.java:362)
 at org.apache.lucene.store.SimpleFSDirectory.createOutput(SimpleFSDirectory.java:58)
 at org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java:150)
 at org.apache.lucene.index.DocumentsWriter.createCompoundFile(DocumentsWriter.java:627)
 at org.apache.lucene.index.IndexWriter.doFlushInternal(IndexWriter.java:4355)
 at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:4209)
 at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:4200)
 at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:2195)
 at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2158)
 at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2122)
 at org.neo4j.rdf.fulltext.SimpleFulltextIndex.safeClose(SimpleFulltextIndex.java:261)
 at org.neo4j.rdf.fulltext.SimpleFulltextIndex$IndexingThread.flushEntries(SimpleFulltextIndex.java:808)
 at org.neo4j.rdf.fulltext.SimpleFulltextIndex$IndexingThread.run(SimpleFulltextIndex.java:761)
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: simplefsl...@c:\Downloads\neo4j-rdfsail\target\var\fulltext\write.lock
 at org.apache.lucene.store.Lock.obtain(Lock.java:85)
 at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1550)
 at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1079)
 at org.neo4j.rdf.fulltext.SimpleFulltextIndex.getWriter(SimpleFulltextIndex.java:210)
 at org.neo4j.rdf.fulltext.SimpleFulltextIndex.access$1300(SimpleFulltextIndex.java:73)
 at org.neo4j.rdf.fulltext.SimpleFulltextIndex$IndexingThread.ensureWriters(SimpleFulltextIndex.java:795)
 at org.neo4j.rdf.fulltext.SimpleFulltextIndex$IndexingThread.run(SimpleFulltextIndex.java:745)
So my questions now are: a) why can't I find a *.jar file, and b) is it OK (maybe a known issue) that these two tests are failing, or how could I possibly fix that? Cheers, Dennis
Re: [Neo4j] Cats and Dogs, living together
I'm currently prototyping an application that will use both an RDBMS and Neo4j in parallel. The RDBMS currently exists as the backing store for a web application. While it's feasible to move all of the data into Neo4j, that's not politically palatable right now. So I plan to use Neo4j to store dependency relationships and associated data, with references back to the RDBMS. To minimize synchronization issues, I plan to store only entity type and id information in nodes, unless specific information is needed to control traversals. Right now most of the data access is via JPA using Hibernate, through a pretty good DAO-domain abstraction. I plan to extend the domain model to include Neo4j nodes and relationships along the lines of your examples, and to extend the DAOs to include management of the embedded Neo4j instance. My prototype is a Spring app and currently uses aspect-oriented transaction management, but in order to manage the transactions on the two databases, I'll probably have to handle transactions programmatically, probably wrapping one in the other. It would be nice to combine the transaction management somehow. In terms of data migration, although I'm trying to minimize migration right now, I have thought about how I might migrate if I were to use Neo4j exclusively. I'm still not sure what the best approaches are to properties vs nodes for entity attributes, indexes (Lucene) vs type subnodes, etc. One of the hurdles to migrating to Neo4j is the lack of tool support, particularly for general access. This has been acknowledged by Emil and others, so I'm not complaining. But with our current app, if we need to import data or fix general data problems, we can use a SQL workbench to directly access the database rather than write a special capability into the app. That's not always the best approach, but due to the prevalence of SQL it does allow other developers, more familiar with other technologies, to access the DB with their own tools/languages. 
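The "store only entity type and id in the node" idea might look like this sketch (plain Java, with a map standing in for a node's properties and a hypothetical DAO interface for the RDBMS side; property names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a graph node carries only enough to locate the RDBMS row,
// so the relational store stays the system of record and little
// needs to be kept in sync.
class EntityRef {
    // What would go into node.setProperty(...) in a real Neo4j app.
    static Map<String, Object> toNodeProperties(String entityType, long entityId) {
        Map<String, Object> props = new HashMap<>();
        props.put("entityType", entityType); // e.g. "Order"
        props.put("entityId", entityId);     // primary key in the RDBMS
        return props;
    }

    // Stand-in for the JPA/Hibernate DAO layer mentioned above.
    interface Dao { Object findById(String entityType, long entityId); }

    // Drill down from a node reference back to the relational entity.
    static Object resolve(Map<String, Object> nodeProps, Dao dao) {
        return dao.findById((String) nodeProps.get("entityType"),
                            (Long) nodeProps.get("entityId"));
    }
}
```

Properties needed to steer traversals would be added to the map alongside the two reference keys, accepting a little denormalization for those fields only.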
With Neo4j it seems that programmatic access will be required. Granted, I haven't explored the REST server, SPARQL support, or the shell very much; they may offer more generalized access. I'm interested to hear how others are approaching the polyglot-persistence task. As I move forward, I'll share what I learn or have problems with. Thanks, Kalin On Dec 1, 2010, at 10:52 AM, Andreas Kollegger wrote: Would anybody be willing to share experiences with trying to introduce Neo4j into a system with another relational (or other NoSQL) database? We're starting to think about best practices for integration: * Hybrid data modeling: what goes where? * XA transactions * message queues for data distribution * data migration strategies Any problems or feature requests related to living in a multi-storage-platform world are welcome. Cheers, Andreas
Re: [Neo4j] Status of visualization options
I'm prototyping an app using Neo4j. Currently, for testing and familiarity, I can generate a random network of any size. I am using JIT (http://thejit.org) to visualize the graph on a web page via Javascript, basically using the Force Directed example 2 with my JSON data instead of the example data. For small graphs (< 50 nodes, 200 or so relationships) the performance is OK; large graphs take a long time to display, but once displayed the interactivity is fine. I haven't done any troubleshooting or benchmarking to see where the issues are, but it appears to be the JSON ingestion and/or the layout calculations. I'm also planning to look at mxGraph (http://www.mxgraph.com) as it offers a rich API/model and purports to be performant. Anyway, JIT appears to be fairly easy to work with but may have some performance limitations. Cheers, Kalin On Dec 1, 2010, at 8:27 AM, Ivan Brusic wrote: Just wanted to thank everyone who answered my question. I didn't find an exact answer, but I was able to narrow down the options to look at. Fakod's latest post about Neo4j with GWT+JIT seems interesting. I am not running a REST server, but I can work from the ideas. I'm also a Scala developer, and GWT doesn't work with Scala (out of the box). Cheers, Ivan On Sun, 21 Nov 2010 17:57:22 +0100, Ivan Brusic i...@brusic.com wrote: Hello all, I have successfully imported a large part of my data set into Neo4j and have done some basic traversals. I want to start visualizing a portion of the graph or of a traversal, and was overwhelmed by the number of options listed at http://wiki.neo4j.org/content/Visualization_options_for_graphs It is unclear to me which products have actually been successfully integrated with Neo4j, or which still have ongoing development. My preference is for simplicity over flexibility, and to work at the code level via a library for the generation portion. The actual viewer can be a separate standalone application. The visualizations themselves will be very simple. 
I am not an Eclipse user, so I have not looked into Neoclipse. I have worked with Graphviz files before, so generating a basic .dot file seems ideal. There was some discussion about JUNG lately, so perhaps that is a better alternative. Unfortunately I do not have the time to explore all of the options. Cheers, Ivan -- Message: 3 Date: Sun, 21 Nov 2010 15:10:53 +0100 From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo4j] Status of visualization options To: Neo4j user discussions user@lists.neo4j.org 2010/11/21 Ivan Brusic i...@brusic.com Neoclipse is a standalone application (although it's based on the Eclipse platform), so you don't have to know anything about Eclipse in order to run and use it. I also think the JUNG support is through Gremlin only. 
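Generating a basic .dot file really is only a few lines; a rough sketch (plain Java over a simple edge list, since the actual Neo4j traversal calls that would collect the triples are omitted here):

```java
import java.util.List;

// Sketch: emit a Graphviz .dot digraph from (source, target, type)
// triples, e.g. as collected during a Neo4j traversal.
class DotExporter {
    static String toDot(List<String[]> edges) {
        StringBuilder sb = new StringBuilder("digraph g {\n");
        for (String[] e : edges) {
            // e[0] -> e[1], labelled with the relationship type e[2]
            sb.append("  \"").append(e[0]).append("\" -> \"")
              .append(e[1]).append("\" [label=\"").append(e[2]).append("\"];\n");
        }
        return sb.append("}\n").toString();
    }
}
```

The resulting string can be written to a file and rendered with `dot -Tpng graph.dot -o graph.png`, keeping the viewer fully separate from the generating code, as Ivan prefers.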
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com -- Message: 4 Date: Sun, 21 Nov 2010 09:57:12 -0700 From: venkat takumatla vxmar...@ualr.edu Subject: Re: [Neo4j] Status of visualization options To: Neo4j user discussions user@lists.neo4j.org I am developing a testing tool for my work and have implemented a basic visualization library which can be embedded into any Java application or run as a standalone application. It can load a network from GDF files or Neo4j database files. I am attaching a screenshot of the tool. Your feedback can help in improving the library.