Using 4.3.1-SNAPSHOT I have identified where the issue is occurring.  For a
query in the format (it returns one document, sorted by field4)

+(field0:UUID0) -field1:string0 +field2:string1 +field3:text0
+field4:"text1"


with the field types

<fieldType name="uuid" class="solr.UUIDField" indexed="true"/>

<fieldType name="string" class="solr.StrField" sortMissingFirst="true"
omitNorms="true"/>

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>


the method FieldCacheImpl$SortedDocValuesCache#createValue, the reader
reports 2640449 terms.  As a result, the loop on line 1198 is
executed 2640449 and the inner loop is executed a total of 658310778.  My
index contains 56180128 documents.

My configuration file sets the <queries> for the newSearcher and
firstSearcher listeners to the value

<lst>
   <str name="q">static firstSearcher warming in solrconfig.xml</str>
   <str name="sort">field4</str>
</lst>


which does not appear to affect the speed.  I'm not sure how replication
plays into the equation outside the fact that we are relatively aggressive
on the replication (every 60 seconds).  I fear I may be at the end of my
knowledge without really getting into the code so any help at this point
would be greatly appreciated.

Shane

On Thu, Jun 13, 2013 at 4:11 PM, Shane Perry <thry...@gmail.com> wrote:

> I've dug through the code and have narrowed the delay down
> to TopFieldCollector$OneComparatorNonScoringCollector.setNextReader() at
> the point where the comparator's setNextReader() method is called (line 98
> in the lucene_solr_4_3 branch).  That line is actually two method calls so
> I'm not yet certain which path is the cause.  I'll continue to dig through
> the code but am on thin ice so input would be great.
>
> Shane
>
>
> On Thu, Jun 13, 2013 at 7:56 AM, Shane Perry <thry...@gmail.com> wrote:
>
>> Erick,
>>
>> We do have soft commits turned.  Initially, autoCommit was set at 15000
>> and autoSoftCommit at 1000.  We did up those to 1200000 and 600000
>> respectively.  However, since the core in question is a slave, we don't
>> actually do writes to the core but rely on replication only to populate the
>> index.  In this case wouldn't autoCommit and autoSoftCommit essentially be
>> no-ops?  I thought I had pulled out all hard commits but a double check
>> shows one instance where it still occurs.
>>
>> Thanks for your time.
>>
>> Shane
>>
>> On Thu, Jun 13, 2013 at 5:19 AM, Erick Erickson 
>> <erickerick...@gmail.com>wrote:
>>
>>> Shane:
>>>
>>> You've covered all the config stuff that I can think of. There's one
>>> other possibility. Do you have the soft commits turned on and are
>>> they very short? Although soft commits shouldn't invalidate any
>>> segment-level caches (but I'm not sure whether the sorting buffers
>>> are low-level or not).
>>>
>>> About the only other thing I can think of is that you're somehow
>>> doing hard commits from, say, the client but that's really
>>> stretching.
>>>
>>> All I can really say at this point is that this isn't a problem I've seen
>>> before, so it's _likely_ some innocent-seeming config has changed.
>>> I'm sure it'll be obvious once you find it <G>...
>>>
>>> Erick
>>>
>>> On Wed, Jun 12, 2013 at 11:51 PM, Shane Perry <thry...@gmail.com> wrote:
>>> > Erick,
>>> >
>>> > I agree, it doesn't make sense.  I manually merged the solrconfig.xml
>>> from
>>> > the distribution example with my 3.6 solrconfig.xml, pulling out what I
>>> > didn't need.  There is the possibility I removed something I shouldn't
>>> have
>>> > though I don't know what it would be.  Minus removing the dynamic
>>> fields, a
>>> > custom tokenizer class, and changing all my fields to be stored, the
>>> > schema.xml file should be the same as well.  I'm not currently in the
>>> > position to do so, but I'll double check those two files.  Finally, the
>>> > data was re-indexed when I moved to 4.3.
>>> >
>>> > My statement about field values wasn't stated very well.  What I meant
>>> is
>>> > that the 'text' field has more unique terms than some of my other
>>> fields.
>>> >
>>> > As for this being an edge case, I'm not sure why it would manifest
>>> itself
>>> > in 4.3 but not in 3.6 (short of me having a screwy configuration
>>> setting).
>>> >  If I get a chance, I'll see if I can duplicate the behavior with a
>>> small
>>> > document count in a sandboxed environment.
>>> >
>>> > Shane
>>> >
>>> > On Wed, Jun 12, 2013 at 5:14 PM, Erick Erickson <
>>> erickerick...@gmail.com>wrote:
>>> >
>>> >> This doesn't make much sense, particularly the fact
>>> >> that you added first/new searchers. I'm assuming that
>>> >> these are sorting on the same field as your slow query.
>>> >>
>>> >> But sorting on a text field for which
>>> >> "Overall, the values of the field are unique"
>>> >> is a red-flag. Solr doesn't sort on fields that have
>>> >> more than one term, so you might as well use a
>>> >> string field and be done with it, it's possible you're
>>> >> hitting some edge case.
>>> >>
>>> >> Did you just copy your 3.6 schema and configs to
>>> >> 4.3? Did you re-index?
>>> >>
>>> >> Best
>>> >> Erick
>>> >>
>>> >> On Wed, Jun 12, 2013 at 5:11 PM, Shane Perry <thry...@gmail.com>
>>> wrote:
>>> >> > Thanks for the responses.
>>> >> >
>>> >> > Setting first/newSearcher had no noticeable effect.  I'm sorting on
>>> a
>>> >> > stored/indexed field named 'text' who's fieldType is solr.TextField.
>>> >> >  Overall, the values of the field are unique. The JVM is only using
>>> about
>>> >> > 2G of the available 12G, so no OOM/GC issue (at least on the
>>> surface).
>>> >>  The
>>> >> > server is question is a slave with approximately 56 million
>>> documents.
>>> >> >  Additionally, sorting on a field of the same type but with
>>> significantly
>>> >> > less uniqueness results quick response times.
>>> >> >
>>> >> > The following is a sample of *debugQuery=true* for a query which
>>> returns
>>> >> 1
>>> >> > document:
>>> >> >
>>> >> > <lst name="process">
>>> >> >   <double name="time">61458.0</double>
>>> >> >   <lst name="query">
>>> >> >     <double name="time">61452.0</double>
>>> >> >   </lst>
>>> >> >   <lst name="facet">
>>> >> >     <double name="time">0.0</double>
>>> >> >   </lst>
>>> >> >   <lst name="mlt">
>>> >> >     <double name="time">0.0</double>
>>> >> >   </lst>
>>> >> >   <lst name="highlight">
>>> >> >     <double name="time">0.0</double>
>>> >> >   </lst>
>>> >> >   <lst name="stats">
>>> >> >     <double name="time">0.0</double>
>>> >> >   </lst>
>>> >> >   <lst name="debug">
>>> >> >     <double name="time">6.0</double>
>>> >> >   </lst>
>>> >> > </lst>
>>> >> >
>>> >> >
>>> >> > -- Update --
>>> >> >
>>> >> > Out of desperation, I turned off replication by commenting out the
>>> *<list
>>> >> > name="slave">* element in the replication requestHandler block.
>>>  After
>>> >> > restarting tomcat I was surprised to find that the replication
>>> admin UI
>>> >> > still reported the core as replicating.  Search queries were still
>>> slow.
>>> >>  I
>>> >> > then disabled replication via the UI and the display updated to
>>> report
>>> >> the
>>> >> > core was no longer replicating.  Queries are now fast so it appears
>>> that
>>> >> > the sorting may be a red-herring.
>>> >> >
>>> >> > It's may be of note to also mention that the slow queries don't
>>> appear to
>>> >> > be getting cached.
>>> >> >
>>> >> > Thanks again for the feed back.
>>> >> >
>>> >> > On Wed, Jun 12, 2013 at 2:33 PM, Jack Krupansky <
>>> j...@basetechnology.com
>>> >> >wrote:
>>> >> >
>>> >> >> Rerun the sorted query with &debugQuery=true and look at the module
>>> >> >> timings. See what stands out
>>> >> >>
>>> >> >> Are you actually sorting on a "text" field, as opposed to a
>>> "string"
>>> >> field?
>>> >> >>
>>> >> >> Of course, it's always possible that maybe you're hitting some odd
>>> >> OOM/GC
>>> >> >> condition as a result of Solr growing  between releases.
>>> >> >>
>>> >> >> -- Jack Krupansky
>>> >> >>
>>> >> >> -----Original Message----- From: Shane Perry
>>> >> >> Sent: Wednesday, June 12, 2013 3:00 PM
>>> >> >> To: solr-user@lucene.apache.org
>>> >> >> Subject: Sorting by field is slow
>>> >> >>
>>> >> >>
>>> >> >> In upgrading from Solr 3.6.1 to 4.3.0, our query response time has
>>> >> >> increased exponentially.  After testing in 4.3.0 it appears the
>>> same
>>> >> query
>>> >> >> (with 1 matching document) returns after 100 ms without sorting but
>>> >> takes 1
>>> >> >> minute when sorting by a text field.  I've looked around but
>>> haven't yet
>>> >> >> found a reason for the degradation.  Can someone give me some
>>> insight or
>>> >> >> point me in the right direction for resolving this?  In most
>>> cases, I
>>> >> can
>>> >> >> change my code to do client-side sorting but I do have a couple of
>>> >> >> situations where pagination prevents client-side sorting.  Any help
>>> >> would
>>> >> >> be greatly appreciated.
>>> >> >>
>>> >> >> Thanks,
>>> >> >>
>>> >> >> Shane
>>> >> >>
>>> >>
>>>
>>
>>
>

Reply via email to