bq: What adds a bottleneck in the indexing flow? Is it the buffering and flushing out to disk?
It Depends (tm). What do the Solr logs show when one of these two things
happens? You pretty much have to put a profiler on the Solr instance to
see where it's spending the time, but timeouts are very often caused by:
1> having a very large heap
2> hitting a stop-the-world garbage collection that exceeds your timeouts.

Best,
Erick

On Sat, Dec 5, 2015 at 8:07 PM, KNitin <nitin.t...@gmail.com> wrote:
> I have an extremely large indexing load (per doc size of 4-5 MB with over
> 100M docs). I have auto commit settings to flush to disk (with open
> searcher as false) every 20 seconds. Even with that, the update sometimes
> fails or times out. The goal is to improve the indexing throughput, and
> hence I'm trying to experiment and see if tweaking any of these can speed
> it up.
>
> What adds a bottleneck in the indexing flow? Is it the buffering and
> flushing out to disk?
>
> On Sat, Dec 5, 2015 at 11:15 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> I'm pretty sure that max indexing threads is per core, but just looked
>> and it's not supported in Solr 5.3 and above so I wouldn't worry about
>> it at all.
>>
>> I've never seen much in the way of benefit for bumping the buffer size
>> past 128M or maybe 256M. This is just how much memory is filled up
>> before the buffer is flushed to disk. Unless you have very high indexing
>> loads or really long autocommit times, you'll rarely hit it anyway since
>> this memory is also flushed when you do any flavor of hard commit.
>>
>> Best,
>> Erick
>>
>> On Fri, Dec 4, 2015 at 4:55 PM, KNitin <nitin.t...@gmail.com> wrote:
>> > Hi,
>> >
>> > The max indexing threads in the solrconfig.xml is set to 8 by default.
>> > Does this mean only 8 concurrent indexing threads will be allowed at
>> > the collection level, or at the core level?
>> >
>> > Buffered size: This seems to be set at 64 MB. If we have a beefier
>> > machine that can take more load, can we set this to a higher limit,
>> > say 1 or 2 GB? What will be the downside of doing so (apart from
>> > commits taking longer)?
>> >
>> > Thanks in advance!
>> > Nitin
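
For reference, here is a minimal sketch of where the two settings discussed in this thread usually live in solrconfig.xml. The values are assumptions taken from the discussion (a hard commit every 20 seconds with openSearcher set to false, and a RAM buffer at the 128M mark mentioned above), not a tuned recommendation. The element placement follows the stock Solr 5.x solrconfig.xml, so check it against the file shipped with your own version.

```xml
<!-- Fragment of solrconfig.xml; only the elements relevant to this thread. -->

<indexConfig>
  <!-- How much memory fills up before the indexing buffer is flushed to disk.
       128 is an illustrative value from the thread, not a measured optimum. -->
  <ramBufferSizeMB>128</ramBufferSizeMB>
</indexConfig>

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- Hard commit every 20 seconds (value is in milliseconds). -->
    <maxTime>20000</maxTime>
    <!-- Flush to disk without opening a new searcher. -->
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>
```

Since any hard commit also flushes the buffer, a 20-second autoCommit like this one may empty it before ramBufferSizeMB is ever reached, which is why raising the buffer to 1 or 2 GB is unlikely to buy much.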