Re: Number of segments in collection is more than what is set in TieredMergePolicyFactory

2019-01-30 Thread Zheng Lin Edwin Yeo
Hi Shawn,

Thank you for the explanation.

Regards,
Edwin

On Wed, 30 Jan 2019 at 15:18, Shawn Heisey wrote:

> On 1/28/2019 10:14 AM, Zheng Lin Edwin Yeo wrote:
> > We have the following TieredMergePolicyFactory configuration in our
> > solrconfig.xml
> >
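> > <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
> >   <int name="maxMergeAtOnce">10</int>
> >   <int name="maxMergeAtOnceExplicit">10</int>
> >   <int name="segmentsPerTier">10</int>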
>
> These three settings are the really important ones.  Except for
> maxMergeAtOnceExplicit, you have these at the default settings.  The
> default for maxMergeAtOnceExplicit is 30 ... and you shouldn't lower it
> without a really good reason.  It mostly comes into play during an
> optimize ... when you lower it, optimizes may take longer than normal.
> It won't be able to merge as many segments at the same time, so the
> number of passes required to complete the optimize could increase.
>
> The most important setting here is segmentsPerTier ... this does not
> mean you will never have more than 10 total segments, it means that at
> each tier, Lucene will try to keep the number of segments below 10.
> With a large index, you are likely to have 3 or 4 tiers, possibly more.
>
> On an index where I spent a lot of time, my settings were, respective to
> yours, 35, 105, and 35.  I often had more than 100 segments in those
> indexes.  It was behaving correctly.
>
> > What could be the reason that it is not able to merge the segments to 3,
> > with each segment being 5 GB in size?
>
> It is working as designed, just not as you expected.
>
> Thanks,
> Shawn
>


Re: Number of segments in collection is more than what is set in TieredMergePolicyFactory

2019-01-29 Thread Shawn Heisey

On 1/28/2019 10:14 AM, Zheng Lin Edwin Yeo wrote:

We have the following TieredMergePolicyFactory configuration in our
solrconfig.xml


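<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
   <int name="maxMergeAtOnce">10</int>
   <int name="maxMergeAtOnceExplicit">10</int>
   <int name="segmentsPerTier">10</int>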


These three settings are the really important ones.  Except for 
maxMergeAtOnceExplicit, you have these at the default settings.  The 
default for maxMergeAtOnceExplicit is 30 ... and you shouldn't lower it 
without a really good reason.  It mostly comes into play during an 
optimize ... when you lower it, optimizes may take longer than normal. 
It won't be able to merge as many segments at the same time, so the 
number of passes required to complete the optimize could increase.
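
To illustrate the point about passes, here is a minimal Python sketch. It assumes a simplified model in which every pass merges groups of up to maxMergeAtOnceExplicit segments into one segment each. This is not Lucene's actual forced-merge selection logic, and the function name optimize_passes is hypothetical. The 24-segment figure is the count reported in this thread.

```python
import math


def optimize_passes(segment_count, max_merge_at_once_explicit):
    """Count merge passes needed to reach one segment in a simplified model.

    Assumption: each pass merges groups of up to
    max_merge_at_once_explicit segments into a single segment.
    Lucene's real selection logic is more involved.
    """
    if max_merge_at_once_explicit < 2:
        raise ValueError("max_merge_at_once_explicit must be at least 2")
    passes = 0
    remaining = segment_count
    while remaining > 1:
        # Every group of up to N segments collapses into one segment.
        remaining = math.ceil(remaining / max_merge_at_once_explicit)
        passes += 1
    return passes


# 24 segments, merging at most 10 at a time: 24 -> 3 -> 1
print(optimize_passes(24, 10))  # prints 2
# 24 segments, merging at most 30 at a time (the default): 24 -> 1
print(optimize_passes(24, 30))  # prints 1
```

In this model, lowering the setting from 30 to 10 doubles the number of passes for a 24-segment index, which is the kind of slowdown described above.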


The most important setting here is segmentsPerTier ... this does not 
mean you will never have more than 10 total segments, it means that at 
each tier, Lucene will try to keep the number of segments below 10. 
With a large index, you are likely to have 3 or 4 tiers, possibly more.
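
As a rough illustration of why a per-tier limit of 10 still allows many more than 10 segments in total, the sketch below estimates a segment budget for an index of a given size. It is a simplified model, not Lucene's code, and it rests on several assumptions: the function name estimate_segment_budget is hypothetical, tier sizes are assumed to start at a floor size and grow by a factor of maxMergeAtOnce per tier, the 2 MB floor is assumed to be the default rather than anything stated in this thread, and the 5120 value from the original configuration is assumed to be the maximum merged segment size in MB.

```python
import math


def estimate_segment_budget(index_mb, segments_per_tier=10,
                            max_merge_at_once=10, floor_mb=2.0,
                            max_merged_mb=5120.0):
    """Return (tiers, allowed_segments) under a simplified tier model.

    Assumptions (not taken from the thread): floor_mb is the smallest
    tier size, and each higher tier is max_merge_at_once times larger,
    capped at max_merged_mb.
    """
    if floor_mb <= 0 or segments_per_tier < 1:
        raise ValueError("floor_mb and segments_per_tier must be positive")
    if index_mb <= 0:
        return 0, 0
    tiers = 0
    allowed = 0
    remaining_mb = index_mb
    level_mb = floor_mb
    while True:
        tiers += 1
        segments_at_level = remaining_mb / level_mb
        if segments_at_level < segments_per_tier:
            # The top tier holds whatever is left over.
            allowed += math.ceil(segments_at_level)
            break
        # A full tier contributes segments_per_tier segments.
        allowed += segments_per_tier
        remaining_mb -= segments_per_tier * level_mb
        level_mb = min(max_merged_mb, level_mb * max_merge_at_once)
    return tiers, allowed


# The 13.7 GB collection from this thread, with the assumed defaults.
tiers, allowed = estimate_segment_budget(13.7 * 1024)
print(tiers, allowed)
```

Under these assumptions, the 13.7 GB index ends up with 4 tiers and a budget of 36 segments, so a count of 24 would be within what the policy tolerates.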


On an index where I spent a lot of time, my settings were, respective to 
yours, 35, 105, and 35.  I often had more than 100 segments in those 
indexes.  It was behaving correctly.



What could be the reason that it is not able to merge the segments to 3,
with each segment being 5 GB in size?


It is working as designed, just not as you expected.

Thanks,
Shawn


Re: Number of segments in collection is more than what is set in TieredMergePolicyFactory

2019-01-29 Thread Zheng Lin Edwin Yeo
Hi,

Does anyone have any insights on this?

Thank you in advance.

Regards,
Edwin

On Tue, 29 Jan 2019 at 01:14, Zheng Lin Edwin Yeo wrote:

> Hi,
>
> We have the following TieredMergePolicyFactory configuration in our
> solrconfig.xml
>
> 
>   10
>   10
>   10
>   10
>   5120
>   0.1
>   2048
>   10.0
> 
>
> However, when we index data to the collection, the number of segments that
> we are getting does not match what we configured.
> For example, our collection size is 13.7 GB. With the above
> TieredMergePolicyFactory configuration, we should expect to have 3 segments
> (since 13.7 / 5 = 2.74, which rounds up to 3). But we are getting 24
> segments in our collection, as shown in the screenshot at the
> link below.
>
> https://drive.google.com/file/d/1hjIQVk_L2Bn9MYOmCdf2wKD_f_D2DNV6/view?usp=sharing
>
> What could be the reason that it is not able to merge the segments to 3,
> with each segment being 5 GB in size?
>
> Regards,
> Edwin
>
>
>
>


Number of segments in collection is more than what is set in TieredMergePolicyFactory

2019-01-28 Thread Zheng Lin Edwin Yeo
Hi,

We have the following TieredMergePolicyFactory configuration in our
solrconfig.xml


  10
  10
  10
  10
  5120
  0.1
  2048
  10.0


However, when we index data to the collection, the number of segments that
we are getting does not match what we configured.
For example, our collection size is 13.7 GB. With the above
TieredMergePolicyFactory configuration, we should expect to have 3 segments
(since 13.7 / 5 = 2.74, which rounds up to 3). But we are getting 24
segments in our collection, as shown in the screenshot at the
link below.
https://drive.google.com/file/d/1hjIQVk_L2Bn9MYOmCdf2wKD_f_D2DNV6/view?usp=sharing

What could be the reason that it is not able to merge the segments to 3,
with each segment being 5 GB in size?

Regards,
Edwin