Hi Linan,

trying fast stabs at answers inline before heading home :)

On Thu, Sep 1, 2011 at 3:29 AM, Linan Wang <tali.w...@gmail.com> wrote:
> hi,
> got some questions not found simple answers from the documents. i bet
> some of them are pretty primitive, bear with me please.
>
> 1, what's the general rule for choosing properties or relationship?
> say a User lives in a City, which just contains a simple int id
> value. to find users live in a city, i can do a simple traversal, of
> all user nodes, or find the city node first, then collect all the
> users. seems to me both ways work and share same level of performance.
> (am i right here?)
>

Generally, if a number of properties really denote the same concept (like a city), and you want to avoid duplicating the data and be able to traverse or query it, I would introduce nodes. However, if the node would turn into a supernode (like a city node with 100K relationships), then consider introducing an in-graph indexing structure, or an out-of-graph external index like Lucene, to look up relationships or nodes when you need them, since that will be cheaper.

> 2, does index operation add/remove/modify threadsafe, don't need
> lock/transaction?
>

Yes, but the index framework is transactional, just like the graph. You need a TX for any modifying operation, but not for reads.

> 3, does it simple property writing operations also need to be wrapped
> inside transaction? if so, in the imdb exmaple
> tutor/domain/MovieImpl.java underlyingNode.setProperty is used neither
> within transaction, nor put into a save method, do all setProperty
> works inside a transaction?
>

See Anders' reply and above.

> 4, what's the best practice to do bulk insertion when running (not
> seed initial data)? i read post says that too many insertions within a
> transaction may lead to memory problem? what's the proper mount of
> insertion within a transaction?
>

Yes, transaction data is kept in memory until the commit flushes it to disk, so overly large TXs might cause memory problems. OTOH, small TXs incur a higher IO load.
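To make the trade-off in answer 4 concrete, here is a minimal sketch of committing in fixed-size batches. `BufferedStore` and `BatchedInsert` are hypothetical stand-ins written for this illustration, not Neo4j classes; in a real application the write and commit calls would be replaced by the database's own transaction handling, and a suitable batch size would have to be found by measurement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a transactional store; NOT the Neo4j API.
// It only models "writes are buffered in memory until commit".
class BufferedStore {
    private final List<String> pending = new ArrayList<String>();   // uncommitted, held in memory
    private final List<String> committed = new ArrayList<String>(); // stands in for data on disk
    int commitCount = 0; // number of commits so far (a proxy for IO load)
    int maxPending = 0;  // largest number of uncommitted writes seen (a proxy for memory use)

    void write(String item) {
        pending.add(item);
        maxPending = Math.max(maxPending, pending.size());
    }

    void commit() {
        committed.addAll(pending);
        pending.clear();
        commitCount++;
    }

    List<String> committed() {
        return committed;
    }
}

class BatchedInsert {
    // Inserts all items, committing after every batchSize writes.
    // A larger batchSize means fewer commits but more uncommitted data in memory.
    static void insertAll(BufferedStore store, List<String> items, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        int inBatch = 0;
        for (String item : items) {
            store.write(item);
            inBatch++;
            if (inBatch == batchSize) {
                store.commit();
                inBatch = 0;
            }
        }
        if (inBatch > 0) {
            store.commit(); // commit the final, partial batch
        }
    }
}
```

With 25 items and a batch size of 10, this commits three times and never holds more than 10 uncommitted writes at once; raising the batch size lowers the commit count at the cost of a larger in-memory buffer.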
> 5, is there a suggested max length for string/array property? would it
> be better to put into sql?
>

Well, the String store block size is adjustable (and we are working on even better layouts there), but for big strings like documents, a file system or Key/Value store might be better, and just keeping a reference to the location makes more sense.

> 6, say a facebook user may "likes" thousands of things, and these
> things are sparsly connected. in this case, things should be modeled as nodes or array property?
>

Nodes. Sparse connections are one of the places where Neo4j shines: a fairly balanced graph where supernodes are rare.

> 7, where can i find an example to use domain models with serverplugin?
> i want to put my data in a standalone server and just use the
> serverplugin, unmanaged extension. should i just put the domain models
> into the same serverplugin jar?
>

Yes, I would do that. However, if you are not expecting to return Nodes, Relationships or Properties, an unmanaged extension will give you the full API of REST services. One extension built that way is, for instance, the scripting extension; see https://github.com/neo4j/script-extension

> 8, the warning in the documentation about unmanaged extension is
> scary. what i can see is that people may use bad ways, instead of
> Iterator/IteratorWrappers. any comment on this?
>

Yeah. It's just a warning, no sudden death. With that approach, you are inventing your own API and can do whatever you want, for better or worse.

> 9, i'm not sure if it's trival: find out users who are only 2
> relationships a way (use twitter example: my followees' followers),
> live in same city, group by age and gender. also retrieve all their
> followees. i want to do the traversal in java, where can i find an
> examples?
>

Well, http://docs.neo4j.org/chunked/snapshot/tutorials-java-embedded-traversal.html should get you started?
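Independent of any particular traversal API, the shape of the query in question 9 can be sketched in plain Java. The `User` and `TwoHop` classes below are a hypothetical in-memory model made up for this illustration, not Neo4j types, and the sketch assumes that "only 2 relationships away" means excluding yourself and anyone you already follow directly.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory model for illustration only; not the Neo4j API.
class User {
    final String name;
    final int cityId;
    final int age;
    final String gender;
    final Set<User> followees = new LinkedHashSet<User>(); // users this user follows
    final Set<User> followers = new LinkedHashSet<User>(); // users following this user

    User(String name, int cityId, int age, String gender) {
        this.name = name;
        this.cityId = cityId;
        this.age = age;
        this.gender = gender;
    }

    // Records the relationship in both directions so it can be walked either way.
    void follow(User other) {
        followees.add(other);
        other.followers.add(this);
    }
}

class TwoHop {
    // Finds my followees' followers who live in my city and groups their
    // names by an "age/gender" key. Assumption: I myself and the users I
    // already follow directly are excluded from the result.
    static Map<String, List<String>> sameCityTwoHops(User me) {
        Set<User> found = new LinkedHashSet<User>();
        for (User followee : me.followees) {
            for (User candidate : followee.followers) {
                boolean direct = me.followees.contains(candidate);
                if (candidate != me && !direct && candidate.cityId == me.cityId) {
                    found.add(candidate);
                }
            }
        }
        Map<String, List<String>> groups = new TreeMap<String, List<String>>();
        for (User u : found) {
            String key = u.age + "/" + u.gender;
            List<String> names = groups.get(key);
            if (names == null) {
                names = new ArrayList<String>();
                groups.put(key, names);
            }
            names.add(u.name);
        }
        return groups;
    }
}
```

In this model, the followees of each matched user are then available directly through that user's `followees` set.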
Also, in the next version, the Tinkerpop fluent iterator API (https://github.com/tinkerpop/pipes/wiki/FluentPipeline) will hopefully find its way into the Neo4j release, if QA is OK, and you will have more options for doing this.

> 10, i've had horrible experience in turning jvm options. have neo4j
> been running on Zing JVM, hp nonstop jvm? are they better options?
>

I think there are initial tests running on Zing, but I don't know for sure. If you have access to such a machine, it would be great if you could give feedback. Michael Hunger is doing a lot of these tests for hosting.

Sorry for the delay, hope this helps. Let us know if you have more questions!

/peter