Guys, thank you for all the replies. I think I figured out a partial solution to the problem on Friday night. Adding a whole bunch of debug statements to the info stream showed that every document follows the "update document" path instead of the "add document" path. That means every document ID goes into the "pending deletes" queue, and Solr has to rescan its index on every commit for potential deletions. That rescan is single-threaded and gets progressively slower as the index grows.
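For intuition, here is a toy model of why the two paths behave so differently (this is NOT Lucene code; the class and field names are made up for illustration). The update path buffers a delete term per document, and the commit-time "apply deletes" step has to check every buffered term against the whole index, so commit cost grows with index size; the add path buffers nothing:

```java
// Toy model (not Lucene internals): cost of the "update" path vs the "add" path.
import java.util.ArrayList;
import java.util.List;

class PendingDeletesDemo {
    static List<String> index = new ArrayList<>();
    static List<String> pendingDeleteTerms = new ArrayList<>();

    static void addDocument(String id) {        // "add document" path
        index.add(id);                          // nothing buffered for deletion
    }

    static void updateDocument(String id) {     // "update document" path
        pendingDeleteTerms.add(id);             // buffers a delete term
        index.add(id);
    }

    // Analogue of the commit-time applyDeletes() rescan: every buffered
    // term is checked against every indexed doc, single-threaded.
    static long commit() {
        long comparisons = 0;
        for (String term : pendingDeleteTerms)
            for (int i = 0; i < index.size(); i++)
                comparisons++;
        pendingDeleteTerms.clear();
        return comparisons;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) updateDocument("doc" + i);
        long updateCost = commit();             // 1000 terms x 1000 docs
        for (int i = 1000; i < 2000; i++) addDocument("doc" + i);
        long addCost = commit();                // nothing buffered, no rescan
        System.out.println(updateCost + " " + addCost);
        // prints "1000000 0"
    }
}
```

The quadratic inner loop is of course a caricature, but it matches what I was seeing: commit time on the update path scales with the index, while the add path pays nothing at commit for deletions.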
Adding overwrite=false to the URL for the /update handler did NOT help: my debug statements showed that messages still go through updateDocument() with a non-null deleteTerm. So, as a temporary hack, I patched Lucene to set deleteTerm=null at the beginning of updateDocument(), and it no longer calls applyDeletes(). This gave a 6-8x performance boost, and now we can index about 9 million documents/hour (producing 20 GB of index every hour). Right now the index is at 1 TB and growing, with no noticeable degradation in indexing speed.

This is decent, but the 24-core machine is still barely utilized :) Now I think it's hitting a merge bottleneck, where all indexing threads get paused, and ConcurrentMergeScheduler with 4 threads is not helping much. I guess the changes on trunk would definitely help, but we will likely stay on 3.4.

I will dig more into the issue on Monday. I'm really curious to see why overwrite=false didn't help but the hack did.

Once again, thank you for the answers and recommendations.

Roman

--
View this message in context: http://lucene.472066.n3.nabble.com/large-scale-indexing-issues-single-threaded-bottleneck-tp3461815p3466523.html
Sent from the Solr - User mailing list archive at Nabble.com.