Per node and relationship.

Btw. this is my personal opinion, not backed by the people who know better :)

I tried to outline the problem below: if you add hundreds of thousands of keys 
for the same node, they are all aggregated in the same document, which holds 
strong references to everything in its field list. As long as the document is 
in scope (in a transaction or in the cache), it won't be garbage-collected.

In Massimo's case there is a single node for which 250 million keys are 
indexed. I hope you agree that this is not the usual use case for a graph-db 
index?
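
To make the difference concrete, here is a minimal sketch in plain Java. It 
does not use Lucene or Neo4j at all: Doc is a hypothetical stand-in that only 
models the field list of a document, and the method names are made up for 
illustration. It contrasts one document that collects every key of a node with 
one small document per node-id/hash pair.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexDocumentSketch {

    /** Hypothetical stand-in for an index document; it only models the field list. */
    static final class Doc {
        final List<String> fields = new ArrayList<String>();

        void add(String key, String value) {
            fields.add(key + "=" + value);
        }
    }

    /** One document for the node: every indexed key lands in the same list. */
    static int fieldsInSingleDoc(int entries) {
        Doc doc = new Doc();
        for (int i = 0; i < entries; i++) {
            doc.add("hash", Integer.toHexString(i));
        }
        // The list stays strongly reachable for as long as doc is referenced.
        return doc.fields.size();
    }

    /** One document per entry: each holds only the node id and one hash. */
    static int maxFieldsWithDocPerEntry(long nodeId, int entries) {
        int max = 0;
        for (int i = 0; i < entries; i++) {
            Doc doc = new Doc();
            doc.add("nodeId", Long.toString(nodeId));
            doc.add("hash", Integer.toHexString(i));
            max = Math.max(max, doc.fields.size());
            // A real indexer would hand doc to its writer here (assumption);
            // after this iteration nothing in this sketch references it any more.
        }
        return max;
    }

    public static void main(String[] args) {
        int entries = 100000;
        System.out.println("single doc fields: " + fieldsInSingleDoc(entries));
        System.out.println("doc per entry, max fields: "
                + maxFieldsWithDocPerEntry(1L, entries));
    }
}
```

In the first variant the field list grows with every key added; in the second 
no document ever holds more than two fields, so each one can be collected as 
soon as it has been handed off.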

Sorry for the confusion,

Cheers

Michael

On 28.06.2011 at 14:05, Rick Bullotta wrote:

> Wow. "Neo4J is optimized for keys to find certain nodes or relationships 
> which normally are not more than a dozen." That's quite a surprise to me, and 
> I hope that is not the case!
> 
> -----Original Message-----
> From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On 
> Behalf Of Michael Hunger
> Sent: Tuesday, June 28, 2011 7:58 AM
> To: Neo4j user discussions
> Subject: Re: [Neo4j] Unexpected error
> 
> Hi Massimo,
> 
> The Lucene document still holds an ArrayList with all of its fields, which 
> grows immensely as you add millions of fields.
> 
> package org.apache.lucene.document;
> public final class Document implements java.io.Serializable {
>  List<Fieldable> fields = new ArrayList<Fieldable>();
> }
> 
> So perhaps the difference is that in Neo4j we have one Document per 
> node/relationship, to which all keys + values are added, and that is what 
> builds up those large documents.
> 
> It is unusual for Neo4j to have a single document that contains that many 
> entries, as you normally just index the keys to find certain nodes or 
> relationships, which are normally no more than a dozen. That's why the Neo4j 
> Lucene index implementation is, and will be, optimized for exactly this case.
> 
> So I'd suggest, as Mattias said, that you create your own datasource and 
> handle the adding there (obviously just adding node-ids + hashes to an 
> arbitrary Lucene index, with one doc per node-id or hash).
> 
> Cheers
> 
> Michael
> 
> On 28.06.2011 at 10:21, Massimo Lusetti wrote:
> 
>> On Mon, Jun 27, 2011 at 11:56 PM, Michael Hunger
>> <michael.hun...@neotechnology.com> wrote:
>> 
>>> Massimo,
>>> 
>>> could you please look into the Lucene Document instance that you add all 
>>> the fields to?
>> 
>> You're right... I add only the NodeId and my own hash... Which fields
>> do you add?
>> 
>>> Does it also contain this ultra-large ArrayList with all the Fields?
>> 
>> As I said, it only contains the Fields for the NodeId and the Hash.
>> 
>>> And which version of lucene did you use for your standalone testing?
>> 
>> I use 3.1.0.
>> 
>> BTW I'm testing resin and even couchdb for the index part, but I don't
>> like it that much because that index phase would be outside the Neo4j
>> transaction, and I want to isolate all those operations in a transaction.
>> Do you have a hint for me on how to "extend" the Neo4j transaction? Is
>> it extensible?
>> 
>> Thanks
>> -- 
>> Massimo
>> http://meridio.blogspot.com
>> _______________________________________________
>> Neo4j mailing list
>> User@lists.neo4j.org
>> https://lists.neo4j.org/mailman/listinfo/user
> 
