Re: Number of segments in collection is more than what is set in TieredMergePolicyFactory
Hi Shawn,

Thank you for the explanation.

Regards,
Edwin

On Wed, 30 Jan 2019 at 15:18, Shawn Heisey wrote:

> On 1/28/2019 10:14 AM, Zheng Lin Edwin Yeo wrote:
> > We have the following TieredMergePolicyFactory configuration in our
> > solrconfig.xml
> >
> > class="org.apache.solr.index.TieredMergePolicyFactory">
> > 10
> > 10
> > 10
>
> These three settings are the really important ones. Except for
> maxMergeAtOnceExplicit, you have these at the default settings. The
> default for maxMergeAtOnceExplicit is 30 ... and you shouldn't lower it
> without a really good reason. It mostly comes into play during an
> optimize ... when you lower it, optimizes may take longer than normal.
> It won't be able to merge as many segments at the same time, so the
> number of passes required to complete the optimize could increase.
>
> The most important setting here is segmentsPerTier ... this does not
> mean you will never have more than 10 total segments; it means that at
> each tier, Lucene will try to keep the number of segments below 10.
> With a large index, you are likely to have 3 or 4 tiers, possibly more.
>
> On an index where I spent a lot of time, my settings were, respective to
> yours, 35, 105, and 35. I often had more than 100 segments in those
> indexes. It was behaving correctly.
>
> > What could be the reason that it is not able to merge the segments to 3,
> > with each of the segment size to be 5 GB?
>
> It is working as designed, just not as you expected.
>
> Thanks,
> Shawn
Re: Number of segments in collection is more than what is set in TieredMergePolicyFactory
On 1/28/2019 10:14 AM, Zheng Lin Edwin Yeo wrote:
> We have the following TieredMergePolicyFactory configuration in our
> solrconfig.xml
>
> 10
> 10
> 10

These three settings are the really important ones. Except for maxMergeAtOnceExplicit, you have these at the default settings. The default for maxMergeAtOnceExplicit is 30 ... and you shouldn't lower it without a really good reason. It mostly comes into play during an optimize ... when you lower it, optimizes may take longer than normal. It won't be able to merge as many segments at the same time, so the number of passes required to complete the optimize could increase.

The most important setting here is segmentsPerTier ... this does not mean you will never have more than 10 total segments; it means that at each tier, Lucene will try to keep the number of segments below 10. With a large index, you are likely to have 3 or 4 tiers, possibly more.

On an index where I spent a lot of time, my settings were, respective to yours, 35, 105, and 35. I often had more than 100 segments in those indexes. It was behaving correctly.

> What could be the reason that it is not able to merge the segments to 3,
> with each of the segment size to be 5 GB?

It is working as designed, just not as you expected.

Thanks,
Shawn
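For readers following the thread, the per-tier behaviour described above can be sketched roughly as below. This is a simplified model for illustration only, not Lucene's actual code. The function name `allowed_segment_budget`, the 2 MB floor segment size, and the reading of the thread's 5120 value as the maximum merged segment size in MB are all assumptions.

```python
import math


def allowed_segment_budget(total_mb, floor_mb, segs_per_tier,
                           max_merge_at_once, max_merged_mb):
    """Rough estimate of how many segments a tiered policy tolerates.

    Simplified model (an assumption, not the real implementation):
    each tier may hold up to segs_per_tier segments, and each tier's
    segment size is max_merge_at_once times the previous tier's,
    capped at max_merged_mb.
    """
    level_mb = floor_mb
    bytes_left = total_mb
    allowed = 0
    while True:
        segs_at_level = bytes_left / level_mb
        if segs_at_level < segs_per_tier:
            # The remaining data fits in a partially filled last tier.
            allowed += math.ceil(segs_at_level)
            return allowed
        # This tier is full; account for it and move to the next one.
        allowed += segs_per_tier
        bytes_left -= segs_per_tier * level_mb
        level_mb = min(level_mb * max_merge_at_once, max_merged_mb)


# Values from the thread: 13.7 GB index, settings of 10 and 10.
# The 2 MB floor and the 5120 MB cap are assumptions (see above).
index_mb = 13.7 * 1024
naive = math.ceil(index_mb / 5120)
budget = allowed_segment_budget(index_mb, 2, 10, 10, 5120)
print("size / max segment size:", naive)
print("tiered budget estimate:", budget)
```

Under these assumptions the model tolerates a few dozen segments for a 13.7 GB index, so the 24 segments reported in the thread fall within that budget, while dividing the index size by the maximum segment size gives only 3.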
Re: Number of segments in collection is more than what is set in TieredMergePolicyFactory
Hi,

Does anyone have any insight into this?

Thank you in advance.

Regards,
Edwin

On Tue, 29 Jan 2019 at 01:14, Zheng Lin Edwin Yeo wrote:

> Hi,
>
> We have the following TieredMergePolicyFactory configuration in our
> solrconfig.xml
>
> 10
> 10
> 10
> 10
> 5120
> 0.1
> 2048
> 10.0
>
> However, when we index data to the collection, the number of segments that
> we are getting does not match what we configured.
> For example, our collection size is 13.7 GB. With the above
> TieredMergePolicyFactory configuration, we should expect to have 3 segments
> (since 13.7 / 5 = 2.74, which rounds up to 3). But we are getting 24
> segments in our collection, as shown in the screenshot linked below.
>
> https://drive.google.com/file/d/1hjIQVk_L2Bn9MYOmCdf2wKD_f_D2DNV6/view?usp=sharing
>
> What could be the reason that it is not able to merge the segments to 3,
> with each of the segment size to be 5 GB?
>
> Regards,
> Edwin
Number of segments in collection is more than what is set in TieredMergePolicyFactory
Hi,

We have the following TieredMergePolicyFactory configuration in our solrconfig.xml

10
10
10
10
5120
0.1
2048
10.0

However, when we index data to the collection, the number of segments that we are getting does not match what we configured.

For example, our collection size is 13.7 GB. With the above TieredMergePolicyFactory configuration, we should expect to have 3 segments (since 13.7 / 5 = 2.74, which rounds up to 3). But we are getting 24 segments in our collection, as shown in the screenshot linked below.

https://drive.google.com/file/d/1hjIQVk_L2Bn9MYOmCdf2wKD_f_D2DNV6/view?usp=sharing

What could be the reason that it is not able to merge the segments to 3, with each of the segment size to be 5 GB?

Regards,
Edwin