Indeed, the implicit ID can cause problems in any use case involving synchronization between Neo4j and some other data source (or replication between two Neo4j instances).
For example, if I import data from an XML file and then rerun the import, I
would expect the data in my Neo4j instance to be overwritten/updated... It
would be really sweet if it were possible to define this ID before the node
is created, as a String for example. But I guess you guys use long for a
good reason.

-atle

On Tue, Jan 19, 2010 at 1:00 PM, Rick Bullotta <
rick.bullo...@burningskysoftware.com> wrote:

> There is really no "natural" way to express complex graphs (something
> more than hierarchical) in something like XML or JSON, but it can be done
> as long as each entity has a unique identification of some kind (e.g.
> GraphML's IDs).
>
> Barring any reason not to, it would seem that GraphML would be the most
> logical place to start. It seems to recognize many of the potential
> complications (e.g. the parse "hints") and is extensible.
>
> One primary disconnect point is the dependency on "ids". Neo nodes and
> relationships have an implicit ID (the long value representing the node
> or relationship), but may or may not have an explicit ID. Thus, as
> mentioned previously, the identity may not be the same on import as it
> was on export, unless an explicit ID is provided for each
> node/relationship. Our current graph model is a mix of both. Some nodes
> have IDs (typically a name), others do not.
>
> Also, GraphML would need to be extended to support the concept of
> relationship types, but this seems to be fairly straightforward using
> custom attributes, elements and/or xlink.
>
> -----Original Message-----
> From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org]
> On Behalf Of Craig Taverner
> Sent: Tuesday, January 19, 2010 4:07 AM
> To: Neo user discussions
> Subject: Re: [Neo] Import/export
>
> I was wondering if the neo-shell or the neo4j.rb in IRB would solve this
> requirement (easily creating or loading some initial graph).
> I have not played much with the shell, but know that it has commands for
> making nodes and relationships. But I think it is best for interactive
> work, and I think it is not ideal for scripting. On the other hand the
> Ruby API provides a command-line/scripting DSL for generating a graph and
> since it is also very easy to read data from a file, it is easy to read
> the file and create the graph in a language not entirely unlike your
> original 'javascript-like' example.
>
> While on that note, I said javascript-like, but if we focus on the 'state
> transfer' example you gave, we're talking about JSON, and I think I might
> get a few votes for suggesting that as a nicer alternative to XML for a
> generic data structure.
>
> So, if you use JSON, the Java API and a JSON library would suffice to
> build the graph. If you are willing to deviate a little from the syntax,
> you could have the format in Ruby and directly executable in neo4j.rb
> (which I think is even cooler :-)
>
> I also personally think both XML and JSON represent implicit tree
> structures, and so any XML or JSON dataset can be loaded as a tree graph
> with generic code (and no need for hashes of node ids or any caches).
> However, things get slightly tricky when we need to translate XML/JSON
> closed graph constructs into the graph, but even that seems achievable
> (with hashes/caches ;-)
>
> On Tue, Jan 19, 2010 at 8:55 AM, David Montag <da...@montag.se> wrote:
>
> > Hi,
> >
> > Having read the replies and thought about it more, I think my initial
> > e-mail had a slightly wrong focus. The technical details that have
> > surfaced so far are interesting, and would definitely be relevant,
> > should an implementation be attempted.
> >
> > However.
> >
> > What I personally would like to know is, do you think there's a need
> > for initial data sets in the first place? Because that is the problem
> > that I initially set out to solve.
> > Then I kind of got ahead of myself and started thinking about the hows,
> > and not the whats and whys. Simply zipping up a couple of pre-populated
> > stores with different graphs would actually solve the problem. Maybe
> > not in the most elegant and/or maintainable way, but still.
> > Export/import is a much broader feature.
> >
> > Opinions? Don't get me wrong, I'm not trying to kill the tech
> > discussion. I'm just trying to solve the actual problem that I ran
> > into. And if you think export/import would be useful too, great! I'd be
> > happy to continue that discussion as well.
> >
> > Also, let me make it clear that this (i.e. initial data sets) isn't
> > something I'm doing as a project for myself. I would expect it to be a
> > community effort, benefiting everyone. So I actually *want* to know if
> > you like the ideas or not, in addition to solutions. With the
> > awesomeness that is the Neo4j community, it shouldn't be a problem. :)
> >
> > -David
> >
> > On Tue, Jan 19, 2010 at 2:32 AM, Rick Bullotta <
> > rick.bullo...@burningskysoftware.com> wrote:
> >
> > > Actually, I think there's one other key "gotcha" to be aware of.
> > >
> > > Rewiring relationships when importing should not assume anything
> > > about the nodeIDs. While the nodeIDs are a useful "unique identifier"
> > > in the export process, on import, you'd want to create a HashMap or
> > > similar structure that you populate with the "old" and "new" node IDs
> > > as you create them in the first pass through (nodes/properties), then
> > > use the "old" nodeIDs referenced in the exported relationships as
> > > your lookup to get the "new" nodeIDs.
> > >
> > > Could be kinda memory intensive for really large graphs (since you'd
> > > have to keep a HashMap entry of Long/Long for each node), but
> > > probably manageable. In the worst case you could keep the translation
> > > table on disk and chunk it in as needed.
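
To make the two-pass rewiring above concrete, here is a rough sketch in
Python. Everything in it is an assumption for illustration: InMemoryGraph,
create_node, create_relationship and import_graph are made-up names standing
in for a real store (this is not the Neo4j API), and a plain dict plays the
role of the HashMap. The stand-in store hands out its own ids starting at
100, so the exported ids 1 and 2 cannot simply be reused.

```python
from itertools import count


class InMemoryGraph:
    """Hypothetical stand-in for a graph store that assigns its own node ids."""

    def __init__(self, first_id=100):
        self._ids = count(first_id)
        self.nodes = {}          # new id -> properties
        self.relationships = []  # (start id, end id, type)

    def create_node(self, properties):
        node_id = next(self._ids)
        self.nodes[node_id] = dict(properties)
        return node_id

    def create_relationship(self, start_id, end_id, rel_type):
        self.relationships.append((start_id, end_id, rel_type))


def import_graph(graph, exported_nodes, exported_rels):
    """Two-pass import that never assumes the exported ids are reused."""
    old_to_new = {}
    # Pass 1: create the nodes and remember which new id replaced each old id.
    for old_id, properties in exported_nodes:
        old_to_new[old_id] = graph.create_node(properties)
    # Pass 2: rewire relationships by translating the old ids they reference.
    for old_start, old_end, rel_type in exported_rels:
        graph.create_relationship(
            old_to_new[old_start], old_to_new[old_end], rel_type
        )
    return old_to_new


# Exported data in the spirit of the "state transfer" example in this thread.
exported_nodes = [(1, {"name": "foo"}), (2, {})]
exported_rels = [(1, 2, "bar")]

graph = InMemoryGraph()
mapping = import_graph(graph, exported_nodes, exported_rels)
print(mapping)              # {1: 100, 2: 101}
print(graph.relationships)  # [(100, 101, 'bar')]
```

For really large graphs, the old_to_new table is the part that would have to
be kept on disk and chunked in, as suggested above.
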
> > >
> > > -----Original Message-----
> > > From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org]
> > > On Behalf Of Rob Challen
> > > Sent: Monday, January 18, 2010 6:25 PM
> > > To: Neo user discussions
> > > Subject: Re: [Neo] Import/export
> > >
> > > RDF seems a good candidate to me.
> > >
> > > Having said that, it might just be pretty easy to write out the graph
> > > in a spreadsheet (nodes and properties in one tab and relationship
> > > triples and properties in another) and import that, as long as you
> > > aren't fussed about maintaining data types.
> > >
> > > Rob.
> > >
> > > On 18/01/2010, Peter Neubauer <neubauer.pe...@gmail.com> wrote:
> > > > Hi David,
> > > > one thing would be to provide example node spaces, maybe even as
> > > > Amazon EC2 AMIs, or downloadable nodespaces.
> > > >
> > > > Regarding XML format, I think GraphML is the most standard thing
> > > > there. Gremlin already has a GraphML importer that can be used to
> > > > import data into Neo4j:
> > > > http://wiki.github.com/tinkerpop/gremlin/graphml-reader-and-writer-library
> > > > Probably not hard to write directly onto Neo4j.
> > > >
> > > > Anyone know of another good binary format?
> > > >
> > > > WDYT?
> > > >
> > > > Cheers,
> > > >
> > > > /peter neubauer
> > > >
> > > > COO and Sales, Neo Technology
> > > >
> > > > GTalk: neubauer.peter
> > > > Skype peter.neubauer
> > > > Phone +46 704 106975
> > > > LinkedIn http://www.linkedin.com/in/neubauer
> > > > Twitter http://twitter.com/peterneubauer
> > > >
> > > > http://www.neo4j.org - Your high performance graph database.
> > > > http://gremlin.tinkerpop.com - PageRank in 2 lines of code.
> > > >
> > > > On Mon, Jan 18, 2010 at 8:37 PM, David Montag <da...@montag.se> wrote:
> > > >> Hi,
> > > >>
> > > >> This weekend I was toying around with Neo4j. I wanted to do some
> > > >> indexing experiments.
> > > >> Unfortunately I found myself without a graph to work with. Sure, I
> > > >> could write some code to generate a graph for me, but it'd be a
> > > >> one-time-thing. I wanted to get going *now*. That got me thinking
> > > >> about import/export functionality.
> > > >>
> > > >> I think a command-line import tool would be useful, accompanied by
> > > >> (and built on) a Java API. Both of them would be tied to a certain
> > > >> representation format. The export can be represented in different
> > > >> ways, where two possible ways are:
> > > >> - State transfer: (node{id:1, name:foo}, node{id:2},
> > > >>   rel{start:1,end:2, type=bar}, ...)
> > > >> - Operation transfer: (id1 = create node, id2 = create node, create
> > > >>   rel id1->id2 type bar, ...)
> > > >>
> > > >> I guess the state transfer feels like the more straightforward one.
> > > >> The diff-style nature of the operation transfer might be useful in
> > > >> other cases.
> > > >>
> > > >> When I first thought of this, the target user was somebody who
> > > >> wanted to get started with a graph, and didn't want to write code
> > > >> to do an import "manually". Maybe the import/export can extend to
> > > >> other use cases, but this was the primary one. A possible workflow
> > > >> could be db exported to file, file published, file downloaded, file
> > > >> imported into db.
> > > >>
> > > >> In the end, it would be great if new users could download sample
> > > >> data sets and import them into a Neo4j instance without writing a
> > > >> single line of code. Which also gets me thinking about a
> > > >> command-line tool to create an empty Neo4j instance to import into.
> > > >> The actual implementations of the tools are trivial. It's the
> > > >> discussion that leads to the implementation that's important.
> > > >>
> > > >> Does this sound like anything that would interest people?
> > > >> If so, (digging into details) what kind of representation do you
> > > >> guys think would be best? I was thinking XML, but a binary format
> > > >> might be better for performance (size/primitives ratio). Maybe
> > > >> both? Because I do like the idea of a human-readable (and editable)
> > > >> format. If you don't think it would be useful I would love to hear
> > > >> why.
> > > >>
> > > >> This is just a brain dump of my thoughts. Surely others have
> > > >> thought of this as well. I'm just getting the discussion started.
> > > >> WDYT?
> > > >>
> > > >> -David
> > > >> _______________________________________________
> > > >> Neo mailing list
> > > >> User@lists.neo4j.org
> > > >> https://lists.neo4j.org/mailman/listinfo/user
> > >
> > > --
> > > Sent from my mobile device