Re: [Neo4j] Lucene index commit rate and NoSuchElementException

Michael Hunger Tue, 01 Feb 2011 13:25:45 -0800

What about batch insertion of the nodes and indexing them after the fact?

And I agree with Tobias that a ConcurrentHashMap is a better way to check
claims than using the index for that. Both the index entries and the inserted
nodes only become visible to other threads after the commit (ACID; Tobias,
please correct me if I'm wrong), so it is certainly possible to insert the
same data twice by accident.


Cheers

Michael

On 01.02.2011 at 22:19, Tobias Ivarsson wrote:

> No, it means that you have to synchronize the threads so that they don't
> insert the same data concurrently.
> Perhaps a ConcurrentHashMap<MD5,token> where you would putIfAbsent(md5,new
> Object()) when you start working on a new hash. If the token Object you get
> back is not the same as you put in, you know that another thread is working
> on that md5, which means this thread should move on to another one. When the
> transaction is done you remove the md5 from the Map, to ensure that you
> don't leak memory.
> 
> That's a simple "locking on arbitrary key" implementation. The reason you
> cannot just do synchronized(md5) {...} is of course that your hashes are
> computed, and thus will not be the same object every time, even though they
> are equals().
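
A minimal sketch of the putIfAbsent claim described above, assuming the MD5
hashes are held as hex Strings. The class and method names (KeyClaims,
tryClaim, release, runExclusively) are made up for illustration. Note that
putIfAbsent returns the previous value, or null if there was none, so a null
result is what tells a thread that it made the claim.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper: "locking on an arbitrary key" via putIfAbsent.
class KeyClaims {
    // Keys currently being worked on; the value is only a placeholder token.
    private final ConcurrentMap<String, Object> inProgress =
            new ConcurrentHashMap<>();

    // Returns true if the calling thread now owns the key,
    // false if another thread has already claimed an equal key.
    boolean tryClaim(String md5) {
        return inProgress.putIfAbsent(md5, new Object()) == null;
    }

    // Must be called once the transaction is done, so the map does not grow.
    void release(String md5) {
        inProgress.remove(md5);
    }

    // Runs the work only if the key could be claimed; always releases it.
    boolean runExclusively(String md5, Runnable work) {
        if (!tryClaim(md5)) {
            return false; // someone else is on it, move on to another hash
        }
        try {
            work.run();
        } finally {
            release(md5);
        }
        return true;
    }
}
```

Because the map compares keys with equals() and hashCode(), two separately
computed Strings with the same content count as the same claim, which is
exactly what synchronized(md5) cannot give you.
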
> 
> For getting a performance boost out of writes, doing multiple operations in
> one transaction will give a much bigger gain than multiple threads though.
> For your use case, I think two writer threads and a few hundred elements per
> transaction is an appropriate size.
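
A rough sketch of grouping a few hundred elements per transaction. Everything
here is hypothetical (BatchingWriter, BatchCommitter); the commit callback is
where the real code would open one transaction, write the whole batch, and
commit it. With the Neo4j API of this era that would presumably be the usual
beginTx() / success() / finish() pattern, but that part is left out so the
sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects items and hands them over in fixed-size
// batches. Not thread-safe; the idea is one instance per writer thread.
class BatchingWriter<T> {
    // Hypothetical callback: writes one batch inside a single transaction.
    interface BatchCommitter<T> {
        void commit(List<T> batch);
    }

    private final int batchSize;
    private final BatchCommitter<T> committer;
    private final List<T> pending = new ArrayList<>();

    BatchingWriter(int batchSize, BatchCommitter<T> committer) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.committer = committer;
    }

    // Queues an item and commits as soon as a full batch has accumulated.
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Commits whatever is still pending; call this once at the end.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        committer.commit(new ArrayList<>(pending)); // pass a copy
        pending.clear();
    }
}
```

With a batch size of 3 and 7 items added, the callback would be invoked with
batches of 3, 3 and (after the final flush) 1 item.
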
> 
> -tobias
> 
> On Tue, Feb 1, 2011 at 9:06 PM, Massimo Lusetti <mluse...@gmail.com> wrote:
> 
>> On Tue, Feb 1, 2011 at 8:02 PM, Mattias Persson
>> <matt...@neotechnology.com> wrote:
>> 
>>> Seems a little weird, the commit rate won't affect the end result,
>>> just performance (more operations per commit means faster
>>> performance). Your code seems correct for single threaded use btw.
>> 
>> Does it mean that I cannot access the graphdb from multiple threads?
>> That code is in a singleton service which exposes the
>> GraphDatabaseService through a method addNode() from where I run that
>> code.
>> 
>> The singleton service is called by a thread pool which can run at
>> most 20 concurrent threads.
>> 
>> Any hint is really appreciated.
>> 
>> Cheers
>> --
>> Massimo
>> http://meridio.blogspot.com
>> _______________________________________________
>> Neo4j mailing list
>> User@lists.neo4j.org
>> https://lists.neo4j.org/mailman/listinfo/user
>> 
> 
> 
> 
> -- 
> Tobias Ivarsson <tobias.ivars...@neotechnology.com>
> Hacker, Neo Technology
> www.neotechnology.com
> Cellphone: +46 706 534857

