Re: DocValue on Strings slow and OOM

Per Steffensen Tue, 05 Nov 2013 04:48:17 -0800

Looking at thread dumps:

It seems that one of the major differences between what is done for c_dstr_doc_sto and for a_dlng_doc_sto is in SimpleFacets.getFacetFieldCounts, where c_dstr_doc_sto takes the "getTermCounts" path and a_dlng_doc_sto takes the "getListedTermCounts" path.

    String termList = localParams == null ? null : localParams.get(CommonParams.TERMS);

    if (termList != null) {
      res.add(key, getListedTermCounts(facetValue, termList));
    } else {
      res.add(key, getTermCounts(facetValue));
    }

getTermCounts seems to do a lot more, and to be a lot more complex, than getListedTermCounts.
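To make the difference between the two paths concrete, here is a minimal, self-contained sketch. It is a toy model and not Solr's code: the class FacetPathSketch and its methods are hypothetical names, and the in-memory list stands in for the per-document field values. It only illustrates that a "listed" path looks at the terms named in the request, while the general path has to discover and count every distinct term it encounters.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the branch discussed above; hypothetical names, not Solr code. */
public class FacetPathSketch {

    /** "Listed" path: count only the terms explicitly named in termList (comma-separated). */
    static Map<String, Integer> countListedTerms(List<String> docValues, String termList) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String term : termList.split(",")) {
            int n = 0;
            for (String value : docValues) {
                if (value.equals(term)) {
                    n++;
                }
            }
            counts.put(term, n);
        }
        return counts;
    }

    /** General path: discover every distinct term among the values and count it. */
    static Map<String, Integer> countAllTerms(List<String> docValues) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String value : docValues) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    /** Same shape as the quoted branch: a non-null term list selects the listed path. */
    static Map<String, Integer> facetCounts(List<String> docValues, String termList) {
        return termList != null
                ? countListedTerms(docValues, termList)
                : countAllTerms(docValues);
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("abc", "xyz", "abc", "def");
        System.out.println(facetCounts(values, "abc,def")); // listed path
        System.out.println(facetCounts(values, null));      // general path
    }
}
```

In this sketch the general path's result grows with the number of distinct values, while the listed path's result is bounded by the requested terms. Whether that is what explains the timings and memory usage seen here is not established by this example.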


On 11/5/13 11:47 AM, Per Steffensen wrote:

Hi
We have a 6-node Solr setup (release 4.4.0) with 12 billion "small" documents loaded. The documents have the following fields:
* a_dlng_doc_sto
* b_dlng_doc_sto
* c_dstr_doc_sto
* timestamp_lng_ind_sto
* d_lng_ind_sto
From schema.xml:
<dynamicField name="*_dstr_doc_sto" type="dstring" indexed="false" stored="true" required="true" docValues="true"/>
<dynamicField name="*_lng_ind_sto" type="long" indexed="true" stored="true"/>
<dynamicField name="*_dlng_doc_sto" type="dlng" indexed="false" stored="true" required="true" docValues="true"/>
...
<fieldType name="dstring" class="solr.StrField" sortMissingLast="true" docValuesFormat="Disk"/>
<fieldType name="dlng" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0" docValuesFormat="Disk"/>
We execute queries in the following format:
* q=timestamp_lng_ind_sto:[x TO y] AND d_lng_ind_sto:(a OR b OR ... OR n)
* facet=true&facet.field=<field>&facet.zeros=false&facet.mincount=1
For example, when executing a query with values for x, y, a, b ... and n that hits only 6 documents in total (out of the 12 billion):
* With <field>=a_dlng_doc_sto (long docvalue) the query responds fairly quickly (< 2 sec)
* With <field>=c_dstr_doc_sto (string docvalue) the query responds very slowly (> 100 sec), and only if we give the Solr nodes a lot of Xmx. If Xmx is too low we experience OOM on the involved Solr nodes and never see a response
The c_dstr_doc_sto strings are all about 10-15 chars, so they are not very long strings.
Is it a known issue that there is such a big difference between facet searches on longs and on strings? And that memory usage seems to be very different, too?
If yes, has it been optimized after 4.4.0?

Regards, Per Steffensen
