[ http://issues.apache.org/jira/browse/SOLR-65?page=all ]
Mike Klaas updated SOLR-65:
---------------------------

    Attachment: autocommit_patch.diff

New patch.

First, the locking semantics actually were wrong. Since every addDoc call grabbed the commit lock and then downgraded to the access lock, subsequent calls would block on the commit. I tried a few vastly different schemes, and it took a while to figure out something that allowed concurrency but also gave the same protections as before. I finally settled on using the read/write commit lock as the principal lock, with a touch of synchronization to protect the addDoc calls.

That finally enabled concurrency, but other bottlenecks emerged. checkCommit() was grabbing the commit lock, which created a barrier at the end of every addDoc call: each call was forced to wait for all pending addDoc calls to finish. I switched to synchronizing on the tracker (synchronizing on DUH2 would provoke a potential deadlock).

Finally, there was significant contention on the lock for the logger output stream. When merging wasn't occurring, the doc rate could reach 200-300 docs per second, and each docId was being logged. I modified the bulk add code to log the docIds of all documents in a single log statement. While I was at it, I converted the <result> output for multi-adds to a single XML element. Was more information going to be added to this?

The gains of multi-threaded indexing for my application are modest. CPU usage is consistently >100%; it drops a bit during medium merges and drops a lot during large merges (merges effectively serialize adding documents). Still, the throughput gain is about 20-30%. In retrospect, this isn't terribly surprising, as our analysis is relatively modest. Applications with heavier analysis needs would see more gains.

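For anyone who doesn't want to read the diff, here is a rough sketch of the locking shape described above. It is not the patch itself: UpdateHandlerSketch, CommitTracker, and the counters are hypothetical stand-ins, and the index writes are replaced by an atomic counter, so the extra synchronization around the real addDoc work is not shown. The idea is only that adds share the read side of the commit lock, commit takes the write side, and the auto-commit check synchronizes on the tracker rather than on the commit lock or the handler.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the update handler; NOT Solr's DirectUpdateHandler2.
class UpdateHandlerSketch {
    // Read side is shared by concurrent adds; write side is exclusive for commit.
    private final ReentrantReadWriteLock commitLock = new ReentrantReadWriteLock();

    // Hypothetical tracker: counts docs since the last auto-commit under its own monitor.
    static final class CommitTracker {
        private final int maxDocs;
        private int docsSinceCommit = 0;

        CommitTracker(int maxDocs) { this.maxDocs = maxDocs; }

        // Returns true when the caller should trigger a commit.
        synchronized boolean addedDocument() {
            docsSinceCommit++;
            if (docsSinceCommit >= maxDocs) {
                docsSinceCommit = 0;
                return true;
            }
            return false;
        }
    }

    private final CommitTracker tracker;
    private final AtomicInteger pending = new AtomicInteger();   // stand-in for uncommitted index writes
    private final AtomicInteger committed = new AtomicInteger(); // stand-in for committed docs
    private final AtomicInteger commits = new AtomicInteger();

    UpdateHandlerSketch(int maxDocs) { this.tracker = new CommitTracker(maxDocs); }

    void addDoc(String doc) {
        commitLock.readLock().lock();
        try {
            pending.incrementAndGet(); // many threads may be here at once
        } finally {
            commitLock.readLock().unlock();
        }
        // The check happens after the read lock is released and locks only the
        // tracker, so it does not form a barrier behind other in-flight adds.
        if (tracker.addedDocument()) {
            commit();
        }
    }

    void commit() {
        commitLock.writeLock().lock(); // waits for in-flight adds, blocks new ones
        try {
            committed.addAndGet(pending.getAndSet(0));
            commits.incrementAndGet();
        } finally {
            commitLock.writeLock().unlock();
        }
    }

    int pendingDocs() { return pending.get(); }
    int committedDocs() { return committed.get(); }
    int commitCount() { return commits.get(); }
}
```

With a threshold of 3, adding 7 documents from one thread ends with two commits, 6 committed documents, and 1 pending. Note that the read lock is released before commit() is called; ReentrantReadWriteLock does not support upgrading from read to write, so holding it across the call would deadlock.
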
> autoCommit/autoOptimize implementation + multithreaded document adding
> ----------------------------------------------------------------------
>
>                 Key: SOLR-65
>                 URL: http://issues.apache.org/jira/browse/SOLR-65
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Mike Klaas
>         Assigned To: Mike Klaas
>         Attachments: autocommit_patch.diff, autocommit_patch.diff
>
>
> Basic implementation of autoCommit/autoOptimize functionality, plus overhaul of DUH2 threading to reduce contention

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira