Re: Sorting by field is slow
Turns out it was a case of an oversite. My warming queries weren't setting the sort order and as a result don't successfully complete. After setting the sort order things appear to be responding quickly. Thanks for the help. On Mon, Jun 17, 2013 at 9:45 AM, Shane Perry wrote: > Using 4.3.1-SNAPSHOT I have identified where the issue is occurring. For > a query in the format (it returns one document, sorted by field4) > > +(field0:UUID0) -field1:string0 +field2:string1 +field3:text0 > +field4:"text1" > > > with the field types > > > > omitNorms="true"/> > > > > > > > > > > > > > the method FieldCacheImpl$SortedDocValuesCache#createValue, the reader > reports 2640449 terms. As a result, the loop on line 1198 is > executed 2640449 and the inner loop is executed a total of 658310778. My > index contains 56180128 documents. > > My configuration file sets the for the newSearcher and > firstSearcher listeners to the value > > >static firstSearcher warming in solrconfig.xml >field4 > > > > which does not appear to affect the speed. I'm not sure how replication > plays into the equation outside the fact that we are relatively aggressive > on the replication (every 60 seconds). I fear I may be at the end of my > knowledge without really getting into the code so any help at this point > would be greatly appreciated. > > Shane > > On Thu, Jun 13, 2013 at 4:11 PM, Shane Perry wrote: > >> I've dug through the code and have narrowed the delay down >> to TopFieldCollector$OneComparatorNonScoringCollector.setNextReader() at >> the point where the comparator's setNextReader() method is called (line 98 >> in the lucene_solr_4_3 branch). That line is actually two method calls so >> I'm not yet certain which path is the cause. I'll continue to dig through >> the code but am on thin ice so input would be great. >> >> Shane >> >> >> On Thu, Jun 13, 2013 at 7:56 AM, Shane Perry wrote: >> >>> Erick, >>> >>> We do have soft commits turned. Initially, autoCommit was set at 15000 >>> and autoSoftCommit at 1000. We did up those to 120 and 60 >>> respectively. However, since the core in question is a slave, we don't >>> actually do writes to the core but rely on replication only to populate the >>> index. In this case wouldn't autoCommit and autoSoftCommit essentially be >>> no-ops? I thought I had pulled out all hard commits but a double check >>> shows one instance where it still occurs. >>> >>> Thanks for your time. >>> >>> Shane >>> >>> On Thu, Jun 13, 2013 at 5:19 AM, Erick Erickson >> > wrote: >>> Shane: You've covered all the config stuff that I can think of. There's one other possibility. Do you have the soft commits turned on and are they very short? Although soft commits shouldn't invalidate any segment-level caches (but I'm not sure whether the sorting buffers are low-level or not). About the only other thing I can think of is that you're somehow doing hard commits from, say, the client but that's really stretching. All I can really say at this point is that this isn't a problem I've seen before, so it's _likely_ some innocent-seeming config has changed. I'm sure it'll be obvious once you find it ... Erick On Wed, Jun 12, 2013 at 11:51 PM, Shane Perry wrote: > Erick, > > I agree, it doesn't make sense. I manually merged the solrconfig.xml from > the distribution example with my 3.6 solrconfig.xml, pulling out what I > didn't need. There is the possibility I removed something I shouldn't have > though I don't know what it would be. Minus removing the dynamic fields, a > custom tokenizer class, and changing all my fields to be stored, the > schema.xml file should be the same as well. I'm not currently in the > position to do so, but I'll double check those two files. Finally, the > data was re-indexed when I moved to 4.3. > > My statement about field values wasn't stated very well. What I meant is > that the 'text' field has more unique terms than some of my other fields. > > As for this being an edge case, I'm not sure why it would manifest itself > in 4.3 but not in 3.6 (short of me having a screwy configuration setting). > If I get a chance, I'll see if I can duplicate the behavior with a small > document count in a sandboxed environment. > > Shane > > On Wed, Jun 12, 2013 at 5:14 PM, Erick Erickson < erickerick...@gmail.com>wrote: > >> This doesn't make much sense, particularly the fact >> that you added first/new searchers. I'm assuming that >> these are sorting on the same field as your slow query. >> >> But sorting on a text field for which >> "Overall, the values of the field are unique" >> is a red-flag. Solr doesn't sort on fields th
Re: Sorting by field is slow
Using 4.3.1-SNAPSHOT I have identified where the issue is occurring. For a query in the format (it returns one document, sorted by field4) +(field0:UUID0) -field1:string0 +field2:string1 +field3:text0 +field4:"text1" with the field types the method FieldCacheImpl$SortedDocValuesCache#createValue, the reader reports 2640449 terms. As a result, the loop on line 1198 is executed 2640449 and the inner loop is executed a total of 658310778. My index contains 56180128 documents. My configuration file sets the for the newSearcher and firstSearcher listeners to the value static firstSearcher warming in solrconfig.xml field4 which does not appear to affect the speed. I'm not sure how replication plays into the equation outside the fact that we are relatively aggressive on the replication (every 60 seconds). I fear I may be at the end of my knowledge without really getting into the code so any help at this point would be greatly appreciated. Shane On Thu, Jun 13, 2013 at 4:11 PM, Shane Perry wrote: > I've dug through the code and have narrowed the delay down > to TopFieldCollector$OneComparatorNonScoringCollector.setNextReader() at > the point where the comparator's setNextReader() method is called (line 98 > in the lucene_solr_4_3 branch). That line is actually two method calls so > I'm not yet certain which path is the cause. I'll continue to dig through > the code but am on thin ice so input would be great. > > Shane > > > On Thu, Jun 13, 2013 at 7:56 AM, Shane Perry wrote: > >> Erick, >> >> We do have soft commits turned. Initially, autoCommit was set at 15000 >> and autoSoftCommit at 1000. We did up those to 120 and 60 >> respectively. However, since the core in question is a slave, we don't >> actually do writes to the core but rely on replication only to populate the >> index. In this case wouldn't autoCommit and autoSoftCommit essentially be >> no-ops? I thought I had pulled out all hard commits but a double check >> shows one instance where it still occurs. >> >> Thanks for your time. >> >> Shane >> >> On Thu, Jun 13, 2013 at 5:19 AM, Erick Erickson >> wrote: >> >>> Shane: >>> >>> You've covered all the config stuff that I can think of. There's one >>> other possibility. Do you have the soft commits turned on and are >>> they very short? Although soft commits shouldn't invalidate any >>> segment-level caches (but I'm not sure whether the sorting buffers >>> are low-level or not). >>> >>> About the only other thing I can think of is that you're somehow >>> doing hard commits from, say, the client but that's really >>> stretching. >>> >>> All I can really say at this point is that this isn't a problem I've seen >>> before, so it's _likely_ some innocent-seeming config has changed. >>> I'm sure it'll be obvious once you find it ... >>> >>> Erick >>> >>> On Wed, Jun 12, 2013 at 11:51 PM, Shane Perry wrote: >>> > Erick, >>> > >>> > I agree, it doesn't make sense. I manually merged the solrconfig.xml >>> from >>> > the distribution example with my 3.6 solrconfig.xml, pulling out what I >>> > didn't need. There is the possibility I removed something I shouldn't >>> have >>> > though I don't know what it would be. Minus removing the dynamic >>> fields, a >>> > custom tokenizer class, and changing all my fields to be stored, the >>> > schema.xml file should be the same as well. I'm not currently in the >>> > position to do so, but I'll double check those two files. Finally, the >>> > data was re-indexed when I moved to 4.3. >>> > >>> > My statement about field values wasn't stated very well. What I meant >>> is >>> > that the 'text' field has more unique terms than some of my other >>> fields. >>> > >>> > As for this being an edge case, I'm not sure why it would manifest >>> itself >>> > in 4.3 but not in 3.6 (short of me having a screwy configuration >>> setting). >>> > If I get a chance, I'll see if I can duplicate the behavior with a >>> small >>> > document count in a sandboxed environment. >>> > >>> > Shane >>> > >>> > On Wed, Jun 12, 2013 at 5:14 PM, Erick Erickson < >>> erickerick...@gmail.com>wrote: >>> > >>> >> This doesn't make much sense, particularly the fact >>> >> that you added first/new searchers. I'm assuming that >>> >> these are sorting on the same field as your slow query. >>> >> >>> >> But sorting on a text field for which >>> >> "Overall, the values of the field are unique" >>> >> is a red-flag. Solr doesn't sort on fields that have >>> >> more than one term, so you might as well use a >>> >> string field and be done with it, it's possible you're >>> >> hitting some edge case. >>> >> >>> >> Did you just copy your 3.6 schema and configs to >>> >> 4.3? Did you re-index? >>> >> >>> >> Best >>> >> Erick >>> >> >>> >> On Wed, Jun 12, 2013 at 5:11 PM, Shane Perry >>> wrote: >>> >> > Thanks for the responses. >>> >> > >>> >> > Setting first/newSearcher had no noticeable effect. I'm sorting on >>> a >>> >> > sto
Re: Sorting by field is slow
I've dug through the code and have narrowed the delay down to TopFieldCollector$OneComparatorNonScoringCollector.setNextReader() at the point where the comparator's setNextReader() method is called (line 98 in the lucene_solr_4_3 branch). That line is actually two method calls so I'm not yet certain which path is the cause. I'll continue to dig through the code but am on thin ice so input would be great. Shane On Thu, Jun 13, 2013 at 7:56 AM, Shane Perry wrote: > Erick, > > We do have soft commits turned. Initially, autoCommit was set at 15000 > and autoSoftCommit at 1000. We did up those to 120 and 60 > respectively. However, since the core in question is a slave, we don't > actually do writes to the core but rely on replication only to populate the > index. In this case wouldn't autoCommit and autoSoftCommit essentially be > no-ops? I thought I had pulled out all hard commits but a double check > shows one instance where it still occurs. > > Thanks for your time. > > Shane > > On Thu, Jun 13, 2013 at 5:19 AM, Erick Erickson > wrote: > >> Shane: >> >> You've covered all the config stuff that I can think of. There's one >> other possibility. Do you have the soft commits turned on and are >> they very short? Although soft commits shouldn't invalidate any >> segment-level caches (but I'm not sure whether the sorting buffers >> are low-level or not). >> >> About the only other thing I can think of is that you're somehow >> doing hard commits from, say, the client but that's really >> stretching. >> >> All I can really say at this point is that this isn't a problem I've seen >> before, so it's _likely_ some innocent-seeming config has changed. >> I'm sure it'll be obvious once you find it ... >> >> Erick >> >> On Wed, Jun 12, 2013 at 11:51 PM, Shane Perry wrote: >> > Erick, >> > >> > I agree, it doesn't make sense. I manually merged the solrconfig.xml >> from >> > the distribution example with my 3.6 solrconfig.xml, pulling out what I >> > didn't need. There is the possibility I removed something I shouldn't >> have >> > though I don't know what it would be. Minus removing the dynamic >> fields, a >> > custom tokenizer class, and changing all my fields to be stored, the >> > schema.xml file should be the same as well. I'm not currently in the >> > position to do so, but I'll double check those two files. Finally, the >> > data was re-indexed when I moved to 4.3. >> > >> > My statement about field values wasn't stated very well. What I meant >> is >> > that the 'text' field has more unique terms than some of my other >> fields. >> > >> > As for this being an edge case, I'm not sure why it would manifest >> itself >> > in 4.3 but not in 3.6 (short of me having a screwy configuration >> setting). >> > If I get a chance, I'll see if I can duplicate the behavior with a >> small >> > document count in a sandboxed environment. >> > >> > Shane >> > >> > On Wed, Jun 12, 2013 at 5:14 PM, Erick Erickson < >> erickerick...@gmail.com>wrote: >> > >> >> This doesn't make much sense, particularly the fact >> >> that you added first/new searchers. I'm assuming that >> >> these are sorting on the same field as your slow query. >> >> >> >> But sorting on a text field for which >> >> "Overall, the values of the field are unique" >> >> is a red-flag. Solr doesn't sort on fields that have >> >> more than one term, so you might as well use a >> >> string field and be done with it, it's possible you're >> >> hitting some edge case. >> >> >> >> Did you just copy your 3.6 schema and configs to >> >> 4.3? Did you re-index? >> >> >> >> Best >> >> Erick >> >> >> >> On Wed, Jun 12, 2013 at 5:11 PM, Shane Perry >> wrote: >> >> > Thanks for the responses. >> >> > >> >> > Setting first/newSearcher had no noticeable effect. I'm sorting on a >> >> > stored/indexed field named 'text' who's fieldType is solr.TextField. >> >> > Overall, the values of the field are unique. The JVM is only using >> about >> >> > 2G of the available 12G, so no OOM/GC issue (at least on the >> surface). >> >> The >> >> > server is question is a slave with approximately 56 million >> documents. >> >> > Additionally, sorting on a field of the same type but with >> significantly >> >> > less uniqueness results quick response times. >> >> > >> >> > The following is a sample of *debugQuery=true* for a query which >> returns >> >> 1 >> >> > document: >> >> > >> >> > >> >> > 61458.0 >> >> > >> >> > 61452.0 >> >> > >> >> > >> >> > 0.0 >> >> > >> >> > >> >> > 0.0 >> >> > >> >> > >> >> > 0.0 >> >> > >> >> > >> >> > 0.0 >> >> > >> >> > >> >> > 6.0 >> >> > >> >> > >> >> > >> >> > >> >> > -- Update -- >> >> > >> >> > Out of desperation, I turned off replication by commenting out the >> *> >> > name="slave">* element in the replication requestHandler block. >> After >> >> > restarting tomcat I was surprised to find that the replication admin >> UI >> >> > still reported the core as rep
Re: Sorting by field is slow
Erick, We do have soft commits turned. Initially, autoCommit was set at 15000 and autoSoftCommit at 1000. We did up those to 120 and 60 respectively. However, since the core in question is a slave, we don't actually do writes to the core but rely on replication only to populate the index. In this case wouldn't autoCommit and autoSoftCommit essentially be no-ops? I thought I had pulled out all hard commits but a double check shows one instance where it still occurs. Thanks for your time. Shane On Thu, Jun 13, 2013 at 5:19 AM, Erick Erickson wrote: > Shane: > > You've covered all the config stuff that I can think of. There's one > other possibility. Do you have the soft commits turned on and are > they very short? Although soft commits shouldn't invalidate any > segment-level caches (but I'm not sure whether the sorting buffers > are low-level or not). > > About the only other thing I can think of is that you're somehow > doing hard commits from, say, the client but that's really > stretching. > > All I can really say at this point is that this isn't a problem I've seen > before, so it's _likely_ some innocent-seeming config has changed. > I'm sure it'll be obvious once you find it ... > > Erick > > On Wed, Jun 12, 2013 at 11:51 PM, Shane Perry wrote: > > Erick, > > > > I agree, it doesn't make sense. I manually merged the solrconfig.xml > from > > the distribution example with my 3.6 solrconfig.xml, pulling out what I > > didn't need. There is the possibility I removed something I shouldn't > have > > though I don't know what it would be. Minus removing the dynamic > fields, a > > custom tokenizer class, and changing all my fields to be stored, the > > schema.xml file should be the same as well. I'm not currently in the > > position to do so, but I'll double check those two files. Finally, the > > data was re-indexed when I moved to 4.3. > > > > My statement about field values wasn't stated very well. What I meant is > > that the 'text' field has more unique terms than some of my other fields. > > > > As for this being an edge case, I'm not sure why it would manifest itself > > in 4.3 but not in 3.6 (short of me having a screwy configuration > setting). > > If I get a chance, I'll see if I can duplicate the behavior with a small > > document count in a sandboxed environment. > > > > Shane > > > > On Wed, Jun 12, 2013 at 5:14 PM, Erick Erickson >wrote: > > > >> This doesn't make much sense, particularly the fact > >> that you added first/new searchers. I'm assuming that > >> these are sorting on the same field as your slow query. > >> > >> But sorting on a text field for which > >> "Overall, the values of the field are unique" > >> is a red-flag. Solr doesn't sort on fields that have > >> more than one term, so you might as well use a > >> string field and be done with it, it's possible you're > >> hitting some edge case. > >> > >> Did you just copy your 3.6 schema and configs to > >> 4.3? Did you re-index? > >> > >> Best > >> Erick > >> > >> On Wed, Jun 12, 2013 at 5:11 PM, Shane Perry wrote: > >> > Thanks for the responses. > >> > > >> > Setting first/newSearcher had no noticeable effect. I'm sorting on a > >> > stored/indexed field named 'text' who's fieldType is solr.TextField. > >> > Overall, the values of the field are unique. The JVM is only using > about > >> > 2G of the available 12G, so no OOM/GC issue (at least on the surface). > >> The > >> > server is question is a slave with approximately 56 million documents. > >> > Additionally, sorting on a field of the same type but with > significantly > >> > less uniqueness results quick response times. > >> > > >> > The following is a sample of *debugQuery=true* for a query which > returns > >> 1 > >> > document: > >> > > >> > > >> > 61458.0 > >> > > >> > 61452.0 > >> > > >> > > >> > 0.0 > >> > > >> > > >> > 0.0 > >> > > >> > > >> > 0.0 > >> > > >> > > >> > 0.0 > >> > > >> > > >> > 6.0 > >> > > >> > > >> > > >> > > >> > -- Update -- > >> > > >> > Out of desperation, I turned off replication by commenting out the > * >> > name="slave">* element in the replication requestHandler block. After > >> > restarting tomcat I was surprised to find that the replication admin > UI > >> > still reported the core as replicating. Search queries were still > slow. > >> I > >> > then disabled replication via the UI and the display updated to report > >> the > >> > core was no longer replicating. Queries are now fast so it appears > that > >> > the sorting may be a red-herring. > >> > > >> > It's may be of note to also mention that the slow queries don't > appear to > >> > be getting cached. > >> > > >> > Thanks again for the feed back. > >> > > >> > On Wed, Jun 12, 2013 at 2:33 PM, Jack Krupansky < > j...@basetechnology.com > >> >wrote: > >> > > >> >> Rerun the sorted query with &debugQuery=true and look at the module > >> >> timings. See what stands out > >> >> > >> >> Ar
Re: Sorting by field is slow
Shane: You've covered all the config stuff that I can think of. There's one other possibility. Do you have the soft commits turned on and are they very short? Although soft commits shouldn't invalidate any segment-level caches (but I'm not sure whether the sorting buffers are low-level or not). About the only other thing I can think of is that you're somehow doing hard commits from, say, the client but that's really stretching. All I can really say at this point is that this isn't a problem I've seen before, so it's _likely_ some innocent-seeming config has changed. I'm sure it'll be obvious once you find it ... Erick On Wed, Jun 12, 2013 at 11:51 PM, Shane Perry wrote: > Erick, > > I agree, it doesn't make sense. I manually merged the solrconfig.xml from > the distribution example with my 3.6 solrconfig.xml, pulling out what I > didn't need. There is the possibility I removed something I shouldn't have > though I don't know what it would be. Minus removing the dynamic fields, a > custom tokenizer class, and changing all my fields to be stored, the > schema.xml file should be the same as well. I'm not currently in the > position to do so, but I'll double check those two files. Finally, the > data was re-indexed when I moved to 4.3. > > My statement about field values wasn't stated very well. What I meant is > that the 'text' field has more unique terms than some of my other fields. > > As for this being an edge case, I'm not sure why it would manifest itself > in 4.3 but not in 3.6 (short of me having a screwy configuration setting). > If I get a chance, I'll see if I can duplicate the behavior with a small > document count in a sandboxed environment. > > Shane > > On Wed, Jun 12, 2013 at 5:14 PM, Erick Erickson > wrote: > >> This doesn't make much sense, particularly the fact >> that you added first/new searchers. I'm assuming that >> these are sorting on the same field as your slow query. >> >> But sorting on a text field for which >> "Overall, the values of the field are unique" >> is a red-flag. Solr doesn't sort on fields that have >> more than one term, so you might as well use a >> string field and be done with it, it's possible you're >> hitting some edge case. >> >> Did you just copy your 3.6 schema and configs to >> 4.3? Did you re-index? >> >> Best >> Erick >> >> On Wed, Jun 12, 2013 at 5:11 PM, Shane Perry wrote: >> > Thanks for the responses. >> > >> > Setting first/newSearcher had no noticeable effect. I'm sorting on a >> > stored/indexed field named 'text' who's fieldType is solr.TextField. >> > Overall, the values of the field are unique. The JVM is only using about >> > 2G of the available 12G, so no OOM/GC issue (at least on the surface). >> The >> > server is question is a slave with approximately 56 million documents. >> > Additionally, sorting on a field of the same type but with significantly >> > less uniqueness results quick response times. >> > >> > The following is a sample of *debugQuery=true* for a query which returns >> 1 >> > document: >> > >> > >> > 61458.0 >> > >> > 61452.0 >> > >> > >> > 0.0 >> > >> > >> > 0.0 >> > >> > >> > 0.0 >> > >> > >> > 0.0 >> > >> > >> > 6.0 >> > >> > >> > >> > >> > -- Update -- >> > >> > Out of desperation, I turned off replication by commenting out the *> > name="slave">* element in the replication requestHandler block. After >> > restarting tomcat I was surprised to find that the replication admin UI >> > still reported the core as replicating. Search queries were still slow. >> I >> > then disabled replication via the UI and the display updated to report >> the >> > core was no longer replicating. Queries are now fast so it appears that >> > the sorting may be a red-herring. >> > >> > It's may be of note to also mention that the slow queries don't appear to >> > be getting cached. >> > >> > Thanks again for the feed back. >> > >> > On Wed, Jun 12, 2013 at 2:33 PM, Jack Krupansky > >wrote: >> > >> >> Rerun the sorted query with &debugQuery=true and look at the module >> >> timings. See what stands out >> >> >> >> Are you actually sorting on a "text" field, as opposed to a "string" >> field? >> >> >> >> Of course, it's always possible that maybe you're hitting some odd >> OOM/GC >> >> condition as a result of Solr growing between releases. >> >> >> >> -- Jack Krupansky >> >> >> >> -Original Message- From: Shane Perry >> >> Sent: Wednesday, June 12, 2013 3:00 PM >> >> To: solr-user@lucene.apache.org >> >> Subject: Sorting by field is slow >> >> >> >> >> >> In upgrading from Solr 3.6.1 to 4.3.0, our query response time has >> >> increased exponentially. After testing in 4.3.0 it appears the same >> query >> >> (with 1 matching document) returns after 100 ms without sorting but >> takes 1 >> >> minute when sorting by a text field. I've looked around but haven't yet >> >> found a reason for the degradation. Can someone give me some insight or >> >> point
Re: Sorting by field is slow
Erick, I agree, it doesn't make sense. I manually merged the solrconfig.xml from the distribution example with my 3.6 solrconfig.xml, pulling out what I didn't need. There is the possibility I removed something I shouldn't have though I don't know what it would be. Minus removing the dynamic fields, a custom tokenizer class, and changing all my fields to be stored, the schema.xml file should be the same as well. I'm not currently in the position to do so, but I'll double check those two files. Finally, the data was re-indexed when I moved to 4.3. My statement about field values wasn't stated very well. What I meant is that the 'text' field has more unique terms than some of my other fields. As for this being an edge case, I'm not sure why it would manifest itself in 4.3 but not in 3.6 (short of me having a screwy configuration setting). If I get a chance, I'll see if I can duplicate the behavior with a small document count in a sandboxed environment. Shane On Wed, Jun 12, 2013 at 5:14 PM, Erick Erickson wrote: > This doesn't make much sense, particularly the fact > that you added first/new searchers. I'm assuming that > these are sorting on the same field as your slow query. > > But sorting on a text field for which > "Overall, the values of the field are unique" > is a red-flag. Solr doesn't sort on fields that have > more than one term, so you might as well use a > string field and be done with it, it's possible you're > hitting some edge case. > > Did you just copy your 3.6 schema and configs to > 4.3? Did you re-index? > > Best > Erick > > On Wed, Jun 12, 2013 at 5:11 PM, Shane Perry wrote: > > Thanks for the responses. > > > > Setting first/newSearcher had no noticeable effect. I'm sorting on a > > stored/indexed field named 'text' who's fieldType is solr.TextField. > > Overall, the values of the field are unique. The JVM is only using about > > 2G of the available 12G, so no OOM/GC issue (at least on the surface). > The > > server is question is a slave with approximately 56 million documents. > > Additionally, sorting on a field of the same type but with significantly > > less uniqueness results quick response times. > > > > The following is a sample of *debugQuery=true* for a query which returns > 1 > > document: > > > > > > 61458.0 > > > > 61452.0 > > > > > > 0.0 > > > > > > 0.0 > > > > > > 0.0 > > > > > > 0.0 > > > > > > 6.0 > > > > > > > > > > -- Update -- > > > > Out of desperation, I turned off replication by commenting out the * > name="slave">* element in the replication requestHandler block. After > > restarting tomcat I was surprised to find that the replication admin UI > > still reported the core as replicating. Search queries were still slow. > I > > then disabled replication via the UI and the display updated to report > the > > core was no longer replicating. Queries are now fast so it appears that > > the sorting may be a red-herring. > > > > It's may be of note to also mention that the slow queries don't appear to > > be getting cached. > > > > Thanks again for the feed back. > > > > On Wed, Jun 12, 2013 at 2:33 PM, Jack Krupansky >wrote: > > > >> Rerun the sorted query with &debugQuery=true and look at the module > >> timings. See what stands out > >> > >> Are you actually sorting on a "text" field, as opposed to a "string" > field? > >> > >> Of course, it's always possible that maybe you're hitting some odd > OOM/GC > >> condition as a result of Solr growing between releases. > >> > >> -- Jack Krupansky > >> > >> -Original Message- From: Shane Perry > >> Sent: Wednesday, June 12, 2013 3:00 PM > >> To: solr-user@lucene.apache.org > >> Subject: Sorting by field is slow > >> > >> > >> In upgrading from Solr 3.6.1 to 4.3.0, our query response time has > >> increased exponentially. After testing in 4.3.0 it appears the same > query > >> (with 1 matching document) returns after 100 ms without sorting but > takes 1 > >> minute when sorting by a text field. I've looked around but haven't yet > >> found a reason for the degradation. Can someone give me some insight or > >> point me in the right direction for resolving this? In most cases, I > can > >> change my code to do client-side sorting but I do have a couple of > >> situations where pagination prevents client-side sorting. Any help > would > >> be greatly appreciated. > >> > >> Thanks, > >> > >> Shane > >> >
Re: Sorting by field is slow
This doesn't make much sense, particularly the fact that you added first/new searchers. I'm assuming that these are sorting on the same field as your slow query. But sorting on a text field for which "Overall, the values of the field are unique" is a red-flag. Solr doesn't sort on fields that have more than one term, so you might as well use a string field and be done with it, it's possible you're hitting some edge case. Did you just copy your 3.6 schema and configs to 4.3? Did you re-index? Best Erick On Wed, Jun 12, 2013 at 5:11 PM, Shane Perry wrote: > Thanks for the responses. > > Setting first/newSearcher had no noticeable effect. I'm sorting on a > stored/indexed field named 'text' who's fieldType is solr.TextField. > Overall, the values of the field are unique. The JVM is only using about > 2G of the available 12G, so no OOM/GC issue (at least on the surface). The > server is question is a slave with approximately 56 million documents. > Additionally, sorting on a field of the same type but with significantly > less uniqueness results quick response times. > > The following is a sample of *debugQuery=true* for a query which returns 1 > document: > > > 61458.0 > > 61452.0 > > > 0.0 > > > 0.0 > > > 0.0 > > > 0.0 > > > 6.0 > > > > > -- Update -- > > Out of desperation, I turned off replication by commenting out the * name="slave">* element in the replication requestHandler block. After > restarting tomcat I was surprised to find that the replication admin UI > still reported the core as replicating. Search queries were still slow. I > then disabled replication via the UI and the display updated to report the > core was no longer replicating. Queries are now fast so it appears that > the sorting may be a red-herring. > > It's may be of note to also mention that the slow queries don't appear to > be getting cached. > > Thanks again for the feed back. > > On Wed, Jun 12, 2013 at 2:33 PM, Jack Krupansky > wrote: > >> Rerun the sorted query with &debugQuery=true and look at the module >> timings. See what stands out >> >> Are you actually sorting on a "text" field, as opposed to a "string" field? >> >> Of course, it's always possible that maybe you're hitting some odd OOM/GC >> condition as a result of Solr growing between releases. >> >> -- Jack Krupansky >> >> -Original Message- From: Shane Perry >> Sent: Wednesday, June 12, 2013 3:00 PM >> To: solr-user@lucene.apache.org >> Subject: Sorting by field is slow >> >> >> In upgrading from Solr 3.6.1 to 4.3.0, our query response time has >> increased exponentially. After testing in 4.3.0 it appears the same query >> (with 1 matching document) returns after 100 ms without sorting but takes 1 >> minute when sorting by a text field. I've looked around but haven't yet >> found a reason for the degradation. Can someone give me some insight or >> point me in the right direction for resolving this? In most cases, I can >> change my code to do client-side sorting but I do have a couple of >> situations where pagination prevents client-side sorting. Any help would >> be greatly appreciated. >> >> Thanks, >> >> Shane >>
Re: Sorting by field is slow
Thanks for the responses. Setting first/newSearcher had no noticeable effect. I'm sorting on a stored/indexed field named 'text' who's fieldType is solr.TextField. Overall, the values of the field are unique. The JVM is only using about 2G of the available 12G, so no OOM/GC issue (at least on the surface). The server is question is a slave with approximately 56 million documents. Additionally, sorting on a field of the same type but with significantly less uniqueness results quick response times. The following is a sample of *debugQuery=true* for a query which returns 1 document: 61458.0 61452.0 0.0 0.0 0.0 0.0 6.0 -- Update -- Out of desperation, I turned off replication by commenting out the ** element in the replication requestHandler block. After restarting tomcat I was surprised to find that the replication admin UI still reported the core as replicating. Search queries were still slow. I then disabled replication via the UI and the display updated to report the core was no longer replicating. Queries are now fast so it appears that the sorting may be a red-herring. It's may be of note to also mention that the slow queries don't appear to be getting cached. Thanks again for the feed back. On Wed, Jun 12, 2013 at 2:33 PM, Jack Krupansky wrote: > Rerun the sorted query with &debugQuery=true and look at the module > timings. See what stands out > > Are you actually sorting on a "text" field, as opposed to a "string" field? > > Of course, it's always possible that maybe you're hitting some odd OOM/GC > condition as a result of Solr growing between releases. > > -- Jack Krupansky > > -Original Message- From: Shane Perry > Sent: Wednesday, June 12, 2013 3:00 PM > To: solr-user@lucene.apache.org > Subject: Sorting by field is slow > > > In upgrading from Solr 3.6.1 to 4.3.0, our query response time has > increased exponentially. After testing in 4.3.0 it appears the same query > (with 1 matching document) returns after 100 ms without sorting but takes 1 > minute when sorting by a text field. I've looked around but haven't yet > found a reason for the degradation. Can someone give me some insight or > point me in the right direction for resolving this? In most cases, I can > change my code to do client-side sorting but I do have a couple of > situations where pagination prevents client-side sorting. Any help would > be greatly appreciated. > > Thanks, > > Shane >
Re: Sorting by field is slow
Rerun the sorted query with &debugQuery=true and look at the module timings. See what stands out Are you actually sorting on a "text" field, as opposed to a "string" field? Of course, it's always possible that maybe you're hitting some odd OOM/GC condition as a result of Solr growing between releases. -- Jack Krupansky -Original Message- From: Shane Perry Sent: Wednesday, June 12, 2013 3:00 PM To: solr-user@lucene.apache.org Subject: Sorting by field is slow In upgrading from Solr 3.6.1 to 4.3.0, our query response time has increased exponentially. After testing in 4.3.0 it appears the same query (with 1 matching document) returns after 100 ms without sorting but takes 1 minute when sorting by a text field. I've looked around but haven't yet found a reason for the degradation. Can someone give me some insight or point me in the right direction for resolving this? In most cases, I can change my code to do client-side sorting but I do have a couple of situations where pagination prevents client-side sorting. Any help would be greatly appreciated. Thanks, Shane
Re: Sorting by field is slow
http://wiki.apache.org/solr/SolrPerformanceFactors If you do a lot of field based sorting, it is advantageous to add explicitly warming queries to the "newSearcher" and "firstSearcher" event listeners in your solrconfig which sort on those fields, so the FieldCache is populated prior to any queries being executed by your users. -- View this message in context: http://lucene.472066.n3.nabble.com/Sorting-by-field-is-slow-tp4070026p4070028.html Sent from the Solr - User mailing list archive at Nabble.com.