Thanks for sharing your insights, Shawn.

On Mon, Oct 17, 2011 at 1:27 AM, Shawn Heisey <s...@elyograg.org> wrote:
> On 10/16/2011 12:01 PM, samarth s wrote:
>
>> Hi,
>>
>> Is it safe to assume that with a mergeFactor of 10 the open file
>> descriptors required by Solr would be around (1 + 10) * 10 = 110?
>> ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed
>>
>> The Solr wiki:
>> http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
>> states that the FDs required per segment are around 7.
>>
>> Are these estimates appropriate? Do they in any way depend on the size
>> of the index and the number of docs (assuming the same number of
>> segments in any case) as well?
>
> My index has 10 files per normal segment (the usual 7 plus three more for
> termvectors). Some of the segments also have a ".del" file, and there is a
> segments_* file and a segments.gen file. Your servlet container and other
> parts of the OS will also have to open files.
>
> I have personally seen three levels of segment merging taking place at the
> same time on a slow filesystem during a full-import, along with new content
> coming in at the same time. With a mergeFactor of 10, each merge is 11
> segments - the ten that are being merged and the merged segment. If you
> have three going on at the same time, that's 33 segments, and you can have
> up to 10 more that are actively being built by ongoing index activity, so
> that's 43 potential segments. If your filesystem is REALLY slow, you might
> end up with even more segments as existing merges are paused for new ones
> to start, but if you run into that, you'll want to update your hardware,
> so I won't consider it.
>
> Multiplying 43 segments by 11 files per segment yields a working
> theoretical maximum of 473 files. Add in the segments files, you're up to
> 475.
>
> Most operating systems have a default FD limit that's at least 1024. If
> you only have one index (core) on your Solr server, Solr is the only thing
> running on that server, and it's using the default mergeFactor of 10, you
> should be fine with the default. If you are going to have more than one
> index on your Solr server (such as a build core and a live core), you plan
> to run other things on the server, or you want to increase your
> mergeFactor significantly, you might need to adjust the OS configuration
> to allow more file descriptors.
>
> Thanks,
> Shawn

--
Regards,
Samarth
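
For reference, the arithmetic in Shawn's estimate can be written out as a small Python sketch. The function and parameter names are hypothetical and not part of Solr or Lucene. The defaults are just the figures from his message: three concurrent merges, 11 files per segment (apparently his 10 regular files plus a possible ".del" file), and the two segments files. It counts index files only, not files opened by the servlet container or the rest of the OS.

```python
def estimate_index_fds(merge_factor=10, concurrent_merges=3,
                       files_per_segment=11, segments_files=2):
    """Rough worst-case count of index files open at once.

    Hypothetical helper; all defaults are assumptions taken from the
    figures quoted in the thread, not values read from Solr.
    """
    # Each merge involves merge_factor source segments plus the merged result.
    merging_segments = concurrent_merges * (merge_factor + 1)
    # Up to merge_factor more segments may be under construction by indexing.
    building_segments = merge_factor
    total_segments = merging_segments + building_segments
    # Add the segments_* and segments.gen files on top.
    return total_segments * files_per_segment + segments_files


if __name__ == "__main__":
    print(estimate_index_fds())  # (3 * 11 + 10) * 11 + 2 = 475
```

The result can be compared against the per-process limit reported by `ulimit -n`. Raising merge_factor or adding a second core scales the estimate accordingly.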