James,
we are experimenting with exactly that feature: not forcing a flush()
at the end of each transaction and letting the OS take care of the
actual flushing. You can potentially lose data from the most recent
transactions, but the store will still recover and will not get
corrupted. Mattias has been testing this in the ordered-writes branch
at https://github.com/neo4j/community/tree/ordered-writes . This still
needs to be fleshed out so that these settings can be controlled per
transaction. I think it will not make it into 1.5 unless someone in
the community steps up and puts in the effort to expose it. But feel
free to try it out and give feedback on your findings!
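
To make the trade-off concrete, here is a minimal Python sketch of a
toy append-only log that either forces an fsync on every commit or
defers it and lets the OS schedule the write. It is only an
illustration, not the code in the ordered-writes branch and not a
Neo4j API: the AppendLog class, the sync_every parameter and the
write_records helper are all made up for this example, and the
interval is counted in commits rather than in time.

```python
import os
import tempfile


class AppendLog:
    """Toy append-only log (hypothetical; NOT Neo4j's logical log)."""

    def __init__(self, path, sync_every=1):
        # sync_every=1 forces an fsync on every commit (fully durable).
        # Larger values defer the fsync, so the OS decides when the
        # bytes actually reach the disk.
        self.sync_every = sync_every
        self.fsync_calls = 0
        self._pending = 0
        self._file = open(path, "ab")

    def commit(self, record):
        self._file.write(record + b"\n")
        self._file.flush()  # hand the bytes to the OS page cache
        self._pending += 1
        if self._pending >= self.sync_every:
            self.sync()

    def sync(self):
        # Force outstanding commits to stable storage.
        if self._pending:
            os.fsync(self._file.fileno())
            self.fsync_calls += 1
            self._pending = 0

    def close(self):
        self.sync()
        self._file.close()


def write_records(sync_every, count=1000):
    """Write `count` records; return (fsync calls, records on disk)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tx.log")
        log = AppendLog(path, sync_every)
        for i in range(count):
            log.commit(b"tx-%d" % i)
        log.close()
        with open(path, "rb") as f:
            records = f.read().splitlines()
    return log.fsync_calls, len(records)
```

Calling write_records(1) and write_records(100) writes the same
records but with very different numbers of fsync calls. In the
deferred case, a crash can lose whatever was committed since the last
sync, which is the "last-transaction data" risk mentioned above.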

/peter

On Fri, Sep 9, 2011 at 8:07 PM, espeed <ja...@jamesthornton.com> wrote:
> Hi Guys -
>
> I have been working on loading WordNet (http://wordnet.princeton.edu/) into
> Neo4j, and have been using it as an opportunity to tune write performance on
> Linux for a Web application I am developing.
>
> My initial idea was to load WordNet RDF
> (http://semanticweb.cs.vu.nl/lod/wn30/) through the Blueprints SailGraph
> interface, but then I decided to use NLTK (http://www.nltk.org) and load it
> directly from Bulbs into Rexster.
>
> Stephen recently added batch transactions to Rexster
> (https://github.com/tinkerpop/rexster-kibbles/tree/master/batch-kibble), but
> right now I am not using them because I want to see what type of write
> performance you can get in non-batch mode.
>
> The Neo4j performance guides were helpful:
>
> * http://wiki.neo4j.org/content/Performance_Guide
> * http://wiki.neo4j.org/content/Linux_Performance_Guide
> * http://wiki.neo4j.org/content/Configuration_Settings
>
> So were Peter and Tobias' recommendations to put Neo4j transactions in
> manual mode
> (https://groups.google.com/d/msg/gremlin-users/vl4IZO7O8H4/20Yc4rUObNcJ) so
> you don't have to flush to disk for each write.
>
> However, manual/batch modes are not practical for writes in a Web
> application. It would be cool if there was a tunable parameter where you
> could set Neo4j to flush to disk at some interval instead of after every
> create/update statement.
>
> Obviously you would have an issue if the server crashed before it was
> written to disk, but this could be mitigated through HA redundancy, and
> because it's a tunable parameter, you could dial it up or down depending on
> your requirements.
>
> MongoDB does something similar, and it is reported that a single server can
> do 20,000-30,000 writes per second
> (http://www.dbms2.com/2011/04/04/the-mongodb-story/).
>
> Here are some of the things Mongo does to make writes fast:
>
> * A memory-mapped data model.
> * Deferred writes — a write might take a couple of seconds to actually
> persist.
> * Optimism — you don’t have to wait for an acknowledgement if you write
> something to the database.
> * “Upsert in place” – update in place without checking whether you’re doing
> a write or insert.
>
> What would it take for Neo4j to approach these levels?
>
> Neo4j does memory-mapped IO:
>
>
> http://wiki.neo4j.org/content/Configuration_Settings#Memory_mapped_I.2FO_settings
>
> There have been talks about adding optimistic locking:
>
>  http://neo4j.org/forums/#nabble-td2891798
>
> And Peter has said that deferred writes are on the drawing board
> (http://lists.neo4j.org/pipermail/user/2011-May/008792.html):
>
>
> Peter Neubauer wrote:
>>
>> However, we are looking into Neo4j normal mode speedups by having a mode
>> that drops the JTA dependencies and thus can relax on the logfile flushing
>> requirements for each transaction, by that being able to use the underlying
>> OS for ordered (deferred) writing, adjustable on a case-by-case level (e.g.
>> batch inserting big data). This will give Neo4j insertions in this mode
>> comparable performance with the batchinserter, while keeping all other
>> semantics and layers in place. I hope this can make it into 1.4, and it will
>> speed up the RDF insertion considerably!
>>
>
> Is support for optimistic locking and deferred writes planned for an
> upcoming release?
>
> Thanks.
>
> - James
>
> --
> View this message in context: 
> http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-Write-Performance-tp3323638p3323638.html
> Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>