Merge indexes with CoreAdminHandler - possible issue

Alexey Timofeev Tue, 01 Nov 2016 16:40:02 -0700

Hello!

I am stuck with CoreAdminHandler's merge indexes functionality. It looks
like merge indexes behaves strangely when used with the "srcCore"
parameter. (With the "indexDir" parameter it works fine.) Please tell me
whether this is a real bug in Solr or whether I am using it incorrectly.
If the community agrees that it is indeed a bug, let's create a ticket for
it, and I will be happy to suggest a patch.
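
For context, the two request forms being compared can be sketched as
below. This is only an illustration: the base URL, core names and index
path are made-up placeholders, and MergeIndexesUrls is a hypothetical
helper, not part of Solr. Only the CoreAdmin parameter names (action, core,
srcCore, indexDir) come from the CoreAdmin API itself.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper that only builds CoreAdmin MERGEINDEXES request URLs.
// It does not send any request.
final class MergeIndexesUrls {

    // paramName is expected to be either "srcCore" or "indexDir";
    // the parameter is repeated once per source.
    static String build(String baseUrl, String targetCore,
                        String paramName, List<String> sources) {
        StringBuilder sb = new StringBuilder(baseUrl)
                .append("/admin/cores?action=mergeindexes&core=")
                .append(encode(targetCore));
        for (String source : sources) {
            sb.append('&').append(paramName).append('=').append(encode(source));
        }
        return sb.toString();
    }

    // URL-encodes a single parameter value (requires Java 10 or later).
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

Calling build with "srcCore" and two core names yields a URL with two
srcCore parameters; calling it with "indexDir" yields the variant that
reads the source index directly from a directory.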


Now let me explain what I think is wrong with merging indexes using the
"srcCore" parameter. The trouble is that when the merge code starts to
merge doc values fields, it mistakenly treats every field that can be
uninverted as a doc values field. As a result, all uninvertible fields are
uninverted and the output is written into the resulting index. Memory
consumption is therefore huge, the chance of an OOM is high, and the
resulting index is bloated.

Now, why does the merge code consider all uninvertible fields to be doc
values fields? Because it treats every field where
(FieldInfo.docValuesType != DocValuesType.NONE) as a doc values field.
FieldInfo objects are provided by the IndexReader, and since there is
always an UninvertingReader in the chain of readers,
FieldInfo.docValuesType always reports the doc values type to which the
field can be converted. Thus (FieldInfo.docValuesType !=
DocValuesType.NONE) holds almost always, and almost all fields get
uninverted.
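
The reasoning above can be sketched as a small self-contained model. It
deliberately does not use Lucene or Solr: DocValuesType and FieldInfo below
are simplified stand-ins, and wrapAll is a hypothetical model of what a
wrapping reader does to the reported field metadata, not the real
UninvertingReader code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified model for illustration only; these are NOT the Lucene classes.
final class UninvertingModel {

    enum DocValuesType { NONE, NUMERIC, SORTED }

    // Stand-in for field metadata: a name plus the reported doc values type.
    static final class FieldInfo {
        final String name;
        final DocValuesType docValuesType;

        FieldInfo(String name, DocValuesType docValuesType) {
            this.name = name;
            this.docValuesType = docValuesType;
        }
    }

    // Models the wrapping reader: a field without doc values on disk is
    // reported with the type it could be uninverted to, if there is one.
    static List<FieldInfo> wrapAll(List<FieldInfo> onDisk,
                                   Map<String, DocValuesType> uninvertible) {
        List<FieldInfo> reported = new ArrayList<>();
        for (FieldInfo fi : onDisk) {
            DocValuesType target = uninvertible.get(fi.name);
            if (fi.docValuesType == DocValuesType.NONE && target != null) {
                reported.add(new FieldInfo(fi.name, target));
            } else {
                reported.add(fi);
            }
        }
        return reported;
    }

    // Models the merge-side check: any type other than NONE counts as a
    // doc values field.
    static List<String> fieldsTreatedAsDocValues(List<FieldInfo> infos) {
        List<String> names = new ArrayList<>();
        for (FieldInfo fi : infos) {
            if (fi.docValuesType != DocValuesType.NONE) {
                names.add(fi.name);
            }
        }
        return names;
    }
}
```

In this model, a field that has no doc values on disk but is listed as
uninvertible passes the check once the metadata has gone through wrapAll,
which is the effect described above.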

Is there a way to create a core without an UninvertingReader in the chain
of readers? If so, is that an expected way of using it or just a workaround?

If loading a core without an UninvertingReader is not the intended
approach, then I would suggest consulting the schema to find out which
fields are doc values fields instead of relying on FieldInfo.docValuesType.
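
A minimal sketch of that suggestion, again as a self-contained model rather
than actual Solr code: the schema is represented by a plain map from field
name to a "has doc values" flag (an assumption made for illustration), and
ReportedField and docValuesFields are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified model for illustration only; not the Solr or Lucene API.
final class SchemaBasedSelection {

    enum DocValuesType { NONE, NUMERIC, SORTED }

    // Field metadata as reported by the reader chain.
    static final class ReportedField {
        final String name;
        final DocValuesType reportedType;

        ReportedField(String name, DocValuesType reportedType) {
            this.name = name;
            this.reportedType = reportedType;
        }
    }

    // Keeps a field only if the schema declares doc values for it, so a
    // field that is merely uninvertible is skipped even when its reported
    // type is not NONE.
    static List<String> docValuesFields(List<ReportedField> reported,
                                        Map<String, Boolean> schemaHasDocValues) {
        List<String> names = new ArrayList<>();
        for (ReportedField f : reported) {
            boolean declared = Boolean.TRUE.equals(schemaHasDocValues.get(f.name));
            if (declared && f.reportedType != DocValuesType.NONE) {
                names.add(f.name);
            }
        }
        return names;
    }
}
```

With this check, a field that the reader reports as SORTED but that the
schema does not declare as a doc values field is left out of the doc values
merge.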

Thank you in advance. Looking forward to your replies!

-- 
Regards.
