[ http://issues.apache.org/jira/browse/SOLR-65?page=all ]

Mike Klaas updated SOLR-65:
---------------------------

    Attachment: autocommit_patch.diff

New patch.

First, the locking semantics actually were wrong.  Since every addDoc call 
grabbed the commit lock and downgraded to access lock, subsequent calls would 
block on the commit.  I tried a few vastly different schemes, and it took a 
while to figure out something that allowed concurrency but also gave the same 
protections as before.  I finally settled on using the read/write commit lock 
as the principal lock, with a touch of synchronization to protect the addDoc 
calls.
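
For readers following along, here is a minimal sketch of that kind of scheme. 
It is not the code in the patch: the class name LockingSketch, the addMonitor 
field and the pending list are all hypothetical, and the real handler does far 
more work per document.  It only illustrates adds sharing the read side of a 
read/write lock while commit takes the write side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the update handler; not the code in the patch. */
class LockingSketch {
    // Principal lock: adds share the read side, commit takes the write side.
    private final ReentrantReadWriteLock commitLock = new ReentrantReadWriteLock();
    // Assumed extra monitor: the "touch of synchronization" around shared state.
    private final Object addMonitor = new Object();
    private final List<String> pending = new ArrayList<>();
    private int committed = 0;

    /** Many threads may hold the read lock at once, so adds do not block each other on it. */
    void addDoc(String id) {
        commitLock.readLock().lock();
        try {
            String prepared = id.trim(); // placeholder for per-document work done concurrently
            synchronized (addMonitor) {  // short critical section guarding the shared list
                pending.add(prepared);
            }
        } finally {
            commitLock.readLock().unlock();
        }
    }

    /** The write lock waits for in-flight adds and keeps new ones out while committing. */
    int commit() {
        commitLock.writeLock().lock();
        try {
            committed += pending.size();
            pending.clear();
            return committed; // total number of documents committed so far
        } finally {
            commitLock.writeLock().unlock();
        }
    }
}
```

The point of the design is that the expensive part of an add runs under the 
shared read lock, and only the brief update of shared state is serialized.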

That finally enabled concurrency, but other bottlenecks emerged.  checkCommit() 
was grabbing the commit lock, which created a barrier at the end of every 
addDoc call: each call was forced to wait for all pending addDoc calls.  I 
switched to synchronizing on the tracker (synchronizing on DUH2 would risk a 
deadlock).
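
A rough illustration of synchronizing on the tracker itself, assuming a simple 
document-count threshold.  CommitTrackerSketch, maxDocs and addedDocument() are 
hypothetical names, not taken from the patch.

```java
/** Hypothetical tracker; field and method names are illustrative only. */
class CommitTrackerSketch {
    private final int maxDocs;       // assumed threshold that triggers an auto-commit
    private int docsSinceCommit = 0;

    CommitTrackerSketch(int maxDocs) {
        this.maxDocs = maxDocs;
    }

    /**
     * Called at the end of each add.  Only the tracker's own monitor is held,
     * and only for a counter update, so callers never wait on the commit lock here.
     * Returns true when the caller should trigger a commit.
     */
    synchronized boolean addedDocument() {
        docsSinceCommit++;
        if (docsSinceCommit >= maxDocs) {
            docsSinceCommit = 0;
            return true;
        }
        return false;
    }
}
```

Because the monitor belongs to the tracker and nothing else is acquired while 
holding it, the check cannot take part in a lock-ordering cycle with the 
handler's own locks.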

Finally, there was significant contention on the lock for the logger output 
stream.  When merging wasn't occurring, the doc rate could reach 200-300 docs 
per second, and each docId was being logged.  I modified the bulk add code to 
log the docIds of all documents in a single log statement.  While I was at it, 
I converted the <result> output for multi-adds to a single XML element.  Was 
more information going to be added to this?
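
To show the idea of the single log statement, here is a small sketch using 
java.util.logging.  The class BulkAddLogSketch and the message format are 
assumptions for illustration; the patch's actual log output may differ.

```java
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical helper; the message format is an assumption, not the patch's output. */
class BulkAddLogSketch {
    private static final Logger LOG = Logger.getLogger("BulkAddLogSketch");

    /** Builds one message covering every document in the batch. */
    static String formatAdded(List<String> docIds) {
        return "added ids=[" + String.join(",", docIds) + "]";
    }

    /** One logger call per batch, so the output stream lock is taken once, not once per document. */
    static void logAdded(List<String> docIds) {
        LOG.info(formatAdded(docIds));
    }
}
```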

The gains of multi-threaded indexing for my application are modest.  The CPU 
usage is >100% consistently; it drops a bit during medium merges and drops a 
lot during large merges (merges effectively serialize adding documents).  
Still, the throughput gain is about 20-30%.  In retrospect, this isn't terribly 
surprising, as our analysis is relatively modest.  Applications with heavier 
analysis needs would see more gains. 

> autoCommit/autoOptimize implementation + multithreaded document adding
> ----------------------------------------------------------------------
>
>                 Key: SOLR-65
>                 URL: http://issues.apache.org/jira/browse/SOLR-65
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Mike Klaas
>         Assigned To: Mike Klaas
>         Attachments: autocommit_patch.diff, autocommit_patch.diff
>
>
> Basic implementation of autoCommit/autoOptimize functionality, plus overhaul 
> of DUH2 threading to reduce contention

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
