Re: [Neo4j] 10 questions

Peter Neubauer Fri, 02 Sep 2011 07:33:35 -0700

Hi Linan,
trying fast stabs at answers inline before heading home :)

On Thu, Sep 1, 2011 at 3:29 AM, Linan Wang <tali.w...@gmail.com> wrote:


> hi,
> got some questions not found simple answers from the documents. i bet
> some of them are pretty primitive, bear with me  please.
>
> 1, what's the general rule for choosing properties or relationship?
> say a User lives in a City, which just contains a simple int  id
> value. to find users live in a city, i can do a simple traversal, of
> all user nodes, or find the city node first, then collect all the
> users. seems to me both ways work and share same level of performance.
> (am i right here?)
>
Generally, if a number of properties really denote the same concept (like a
city), and you want to avoid duplicating the data and be able to traverse or
query it, I would introduce nodes. However, if such a node would turn into a
supernode (like a city node with 100K relationships), then consider
introducing an in-graph indexing structure, or an external out-of-graph index
like Lucene, to look up relationships or nodes when you need them, since
that will be cheaper.


> 2, does index operation add/remove/modify threadsafe, don't need
> lock/transaction?
>
Yes, index operations are thread-safe, but the index framework is
transactional, just like the graph. You need a TX for any modifying
operation, but not for reads.


> 3, does it simple property writing operations also need to be wrapped
> inside transaction? if so, in the imdb exmaple
> tutor/domain/MovieImpl.java underlyingNode.setProperty is used neither
> within transaction, nor put into a save method, do all setProperty
> works inside a transaction?
>
See Anders' reply and the answer above.


> 4, what's the best practice to do bulk insertion when running (not
> seed initial data)? i read post says that too many insertions within a
> transaction may lead to memory problem? what's the proper mount of
> insertion within a transaction?
>
Yes, transaction data is kept in memory before calling commit and flushing
to disk, so overly large TX might result in memory problems. OTOH small TX
incur higher IO load.
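
One common way to balance the two is to commit after every N writes. Here is a minimal sketch in plain Java. Note that `Tx` and `BatchedWriter` are hypothetical names made up for this illustration, not Neo4j classes, and the right batch size is something you would have to measure against your own heap and data.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical stand-in for a transaction handle; not a Neo4j class. */
interface Tx {
    void commit();
}

final class BatchedWriter {
    /**
     * Applies `write` to every item and commits after each `batchSize` items,
     * so no single transaction holds more than `batchSize` pending writes.
     * Returns the number of commits performed.
     * Rollback on failure is left out to keep the sketch short.
     */
    static <T> int writeInBatches(List<T> items, int batchSize,
                                  Supplier<Tx> beginTx, Consumer<T> write) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int commits = 0;
        int pending = 0;
        Tx tx = null;
        for (T item : items) {
            if (tx == null) {
                tx = beginTx.get();   // open a transaction lazily
            }
            write.accept(item);
            pending++;
            if (pending == batchSize) {
                tx.commit();          // flush this batch
                commits++;
                tx = null;
                pending = 0;
            }
        }
        if (tx != null) {             // commit the final, partial batch
            tx.commit();
            commits++;
        }
        return commits;
    }
}
```

With five items and a batch size of two, this commits three times (2 + 2 + 1). A larger batch size means fewer commits and less IO, at the cost of more uncommitted state held in memory.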


> 5, is there a suggested max length for string/array property? would it
> be better to put into sql?
>
Well, the String store block size is adjustable (and we are working on even
better layouts there), but for big strings like documents, a file system or
Key/Value store might be better; keeping just a reference to the location in
the graph makes more sense.
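
To illustrate the "keep only a reference" idea, here is a rough sketch in plain Java. The class name `ExternalDocument` and the property key `documentPath` are assumptions invented for this example; the returned map stands in for the properties you would set on a node.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

final class ExternalDocument {
    /**
     * Writes the text to a file under `dir` and returns a property map that
     * holds only the file's path, not the text itself.
     */
    static Map<String, Object> store(Path dir, String name, String text) {
        try {
            Files.createDirectories(dir);
            Path file = dir.resolve(name + ".txt");
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
            Map<String, Object> props = new HashMap<>();
            props.put("documentPath", file.toString()); // hypothetical key
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Follows the stored reference and reads the document back. */
    static String load(Map<String, Object> props) {
        try {
            Path file = Paths.get((String) props.get("documentPath"));
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The graph then stores a short string per document, and the large payload lives wherever it is cheapest to keep.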

> 6, say a facebook user may "likes" thousands of things, and these
> things are sparsly connected. in this case, things should be modeled
> as nodes or array property?
>
Nodes. Sparse connections are one of the places where Neo4j shines: a
fairly balanced graph where supernodes are rare.


> 7, where can i find an example to use domain models with serverplugin?
> i want to put my data in a standalone server and just use the
> serverplugin, unmanaged extension. should i just put the domain models
> into the same serverplugin jar?
>
Yes, I would do that. However, if you are not expecting to return Nodes,
Relationships or Properties, an unmanaged extension will give you the full
API of REST services. One extension built that way is, for instance, the
scripting extension; see https://github.com/neo4j/script-extension

> 8, the warning in the documentation about unmanaged extension is
> scary. what i can see is that people may use bad ways, instead of
> Iterator/IteratorWrappers. any comment on this?
>
Yeah. It's just a warning, no sudden death. With that approach, you are
inventing your own API and can do whatever you want, for good and bad.


> 9, i'm not sure if it's trival: find out users who are only 2
> relationships a way (use twitter example: my followees' followers),
> live in same city, group by age and gender. also retrieve all their
> followees. i want to do the traversal in java, where can i find an
> examples?
>
Well,
http://docs.neo4j.org/chunked/snapshot/tutorials-java-embedded-traversal.html
should get you started. Also, the Tinkerpop fluent iterator API
(https://github.com/tinkerpop/pipes/wiki/FluentPipeline) will hopefully find
its way into the next Neo4j release, if QA is OK, which will give you more
options for doing this.
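
To show the shape of the query itself, here is a small sketch in plain Java over an in-memory graph. It is not the Neo4j traversal API; `SocialGraph` and all its methods are hypothetical names for this illustration. It walks from a user to their followees, then to those followees' followers, keeps the ones in the same city, and groups them by age and gender. It assumes every user is added before any `follow` call mentions them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class SocialGraph {
    private static final class User {
        final String city;
        final int age;
        final String gender;

        User(String city, int age, String gender) {
            this.city = city;
            this.age = age;
            this.gender = gender;
        }
    }

    private final Map<String, User> users = new HashMap<>();
    private final Map<String, Set<String>> followees = new HashMap<>(); // outgoing
    private final Map<String, Set<String>> followers = new HashMap<>(); // incoming

    void addUser(String name, String city, int age, String gender) {
        users.put(name, new User(city, age, gender));
    }

    void follow(String from, String to) {
        followees.computeIfAbsent(from, k -> new TreeSet<>()).add(to);
        followers.computeIfAbsent(to, k -> new TreeSet<>()).add(from);
    }

    Set<String> followeesOf(String name) {
        return followees.getOrDefault(name, Collections.emptySet());
    }

    /** Two hops out (my followees' followers), same city, keyed by "age/gender". */
    Map<String, List<String>> peersByAgeAndGender(String me) {
        User self = users.get(me);
        Set<String> peers = new TreeSet<>();
        for (String followee : followeesOf(me)) {
            for (String peer : followers.getOrDefault(followee, Collections.emptySet())) {
                if (!peer.equals(me) && users.get(peer).city.equals(self.city)) {
                    peers.add(peer);
                }
            }
        }
        Map<String, List<String>> groups = new TreeMap<>();
        for (String peer : peers) {
            User u = users.get(peer);
            groups.computeIfAbsent(u.age + "/" + u.gender, k -> new ArrayList<>()).add(peer);
        }
        return groups;
    }
}
```

Retrieving the followees of each result is then one more `followeesOf(name)` lookup per user. In a real traversal the same three steps (expand outgoing, expand incoming, filter) would map onto relationship directions and an evaluator.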


> 10, i've had horrible experience in turning jvm options. have neo4j
> been running on Zing JVM, hp nonstop jvm? are they better options?
>
I think there are initial tests running on Zing, but I don't know for sure.
If you have access to such a machine, it would be great if you could give
feedback. Michael Hunger is doing a lot of these tests for hosting.


Sorry for the delay, hope this helps. Let us know if you have more
questions!

/peter
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user