Hello.
Unfortunately, it didn't help. There is still a huge difference between
rows=50 and rows=51, and disabling enableLazyFieldLoading in
solrconfig.xml still helps.

solr-impl 10.0.0-SNAPSHOT 011d713a884559e3efeaa69e4f3c8dd8e630ff22
[snapshot build, details omitted]
cat solr/core/src/java/org/apache/solr/search/SolrDocumentFetcher.java | head -370 | tail -13
    SolrDocumentStoredFieldVisitor(Set<String> toLoad, IndexReader reader, int docId) {
      super(toLoad);
      this.docId = docId;
      this.doc = getDocument();
      if (documentCache == null) {
        // lazy loading makes no sense if we don't have a `documentCache`
        this.lazyFieldProducer = null;
        this.addLargeFieldsLazily = false;
      } else {
        this.lazyFieldProducer =
            toLoad != null && enableLazyFieldLoading ? new LazyDocument(reader, docId) : null;
        this.addLargeFieldsLazily = !largeFields.isEmpty();
      }
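
For readers without the source at hand, the guard in this excerpt can be restated as a small truth-table model. This is a Python sketch of the branching only, not Solr code; `lazy_loading_plan` and its parameter names are hypothetical stand-ins for the fields shown in the excerpt.

```python
def lazy_loading_plan(has_document_cache, to_load, enable_lazy_field_loading, large_fields):
    """Model the branching in the excerpt.

    Returns (use_lazy_producer, add_large_fields_lazily).
    All names here are illustrative, not Solr identifiers.
    """
    if not has_document_cache:
        # Mirrors the `documentCache == null` branch: no lazy loading at all.
        return (False, False)
    # Mirrors `toLoad != null && enableLazyFieldLoading`.
    use_lazy_producer = to_load is not None and enable_lazy_field_loading
    # Mirrors `!largeFields.isEmpty()`.
    add_large_fields_lazily = len(large_fields) > 0
    return (use_lazy_producer, add_large_fields_lazily)


# No documentCache: both flags are off even with enableLazyFieldLoading=true.
print(lazy_loading_plan(False, {"product_id"}, True, set()))
# documentCache present, fl restricts the fields, lazy loading enabled.
print(lazy_loading_plan(True, {"product_id"}, True, set()))
```

In this model, a missing documentCache forces both flags off regardless of enableLazyFieldLoading.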


On Wed, Jun 26, 2024 at 5:10 AM Michael Gibney
<mich...@michaelgibney.net> wrote:
>
> FYI:
> https://issues.apache.org/jira/browse/SOLR-17349
> https://github.com/apache/solr/pull/2535
>
> I'm curious whether this helps!
>
> On Fri, Jun 21, 2024 at 3:08 PM Oleksandr Tkachuk <sasha547...@gmail.com> 
> wrote:
> >
> > >If you're set up to try running a patched version on your data, I'm 
> > >curious to know if this will help.
> > I'll be happy to do this.
> >
> > >But maybe it's not so much a magic threshold as arbitrary, and specific to 
> > >the data you're evaluating over.
> > Well, I tested this case on the collection that you remembered, the one
> > with a large number of fields (564133 at the moment) and more documents
> > (~68 million). The number of documents and their content differ
> > significantly from the collection I tested previously. I was quickly
> > able to reproduce the problem with the magic number 50 (51), although
> > it is not as noticeable as in the previous collection. I confirmed this
> > with any cardinality and any variance using hey (I’m more than sure
> > that it will reproduce with any other benchmark). Although QTime did
> > not differ visibly, or did not differ as much as we would like, the
> > difference grows significantly as the intensity of queries increases.
> > It is still easier to reproduce when the fl= data has high unevenness
> > and low cardinality. For example:
> > Huge cardinality, values almost completely unique:
> > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> > 'http://localhost:8983/solr/col/select?fl=fld1&wt=json&q=fld1:"fld1value"&start=0&rows=50'
> >   Slowest:      0.0024 secs
> >   Fastest:      0.0009 secs
> >   Average:      0.0013 secs
> >   Requests/sec: 3768.1874
> >
> > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> > 'http://localhost:8983/solr/col/select?fl=fld1&wt=json&q=fld1:"fld1value"&start=0&rows=51'
> >   Slowest:      0.0018 secs
> >   Fastest:      0.0007 secs
> >   Average:      0.0009 secs
> >   Requests/sec: 5620.4994
> >
> > Just 1.5x diff
> >
> >
> > "fld2":["v1",30501964,"v2",4202177,"v3",210886] :
> > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> > 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v3"&start=0&rows=50'
> >   Slowest:      0.1198 secs
> >   Fastest:      0.0013 secs
> >   Average:      0.0019 secs
> >   Requests/sec: 2641.0227
> >
> > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> > 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v3"&start=0&rows=51'
> >   Slowest:      0.0051 secs
> >   Fastest:      0.0003 secs
> >   Average:      0.0003 secs
> >   Requests/sec: 14610.4688
> >
> > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> > 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v1"&start=0&rows=50'
> >   Slowest:      0.0059 secs
> >   Fastest:      0.0008 secs
> >   Average:      0.0010 secs
> >   Requests/sec: 4795.5539
> >
> > ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> > 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v1"&start=0&rows=51'
> >   Slowest:      0.0010 secs
> >   Fastest:      0.0003 secs
> >   Average:      0.0003 secs
> >   Requests/sec: 14726.7978
> >
> > Roughly 3-5.5x diff by Requests/sec
> >
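
A small sketch for turning the Requests/sec figures above into rows=51 vs. rows=50 ratios. The numbers are copied from the hey output in this message; the dictionary layout and the `speedup` helper are illustrative names, not part of hey or Solr.

```python
# Requests/sec copied from the hey runs quoted above, keyed by rows= value.
runs = {
    'fld1:"fld1value"': {50: 3768.1874, 51: 5620.4994},
    'fld2:"v3"': {50: 2641.0227, 51: 14610.4688},
    'fld2:"v1"': {50: 4795.5539, 51: 14726.7978},
}


def speedup(rps_by_rows):
    """Throughput at rows=51 divided by throughput at rows=50."""
    return rps_by_rows[51] / rps_by_rows[50]


# Ratio per query, rounded to two decimals for readability.
ratios = {query: round(speedup(rps), 2) for query, rps in runs.items()}

for query, ratio in ratios.items():
    print(f"{query}: rows=51 throughput is {ratio}x that of rows=50")
```

Comparing throughput rather than the Average column avoids the coarse rounding of the sub-millisecond latencies that hey reports.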
> > On Fri, Jun 21, 2024 at 4:59 PM Michael Gibney
> > <mich...@michaelgibney.net> wrote:
> > >
> > > Interesting! If turning off lazy field loading helps, I think I have a
> > > trivial patch that may fix this (i.e. without requiring the workaround
> > > of disabling lazy field loading -- which, as you say, makes no sense
> > > to have in effect without the documentCache). The only thing that had
> > > been stopping me from suggesting this patch right off the bat was the
> > > "magic" threshold of 50, which I couldn't explain at all. But maybe
> > > it's not so much a magic threshold as arbitrary, and specific to the
> > > data you're evaluating over. I'll open an issue/PR more narrowly
> > > scoped to the change. I'd say you could open the issue, except I still
> > > don't fully understand the connection between the change I'm
> > > considering and the behavior you're seeing -- just that they seem very
> > > likely to be connected. If you're set up to try running a patched
> > > version on your data, I'm curious to know if this will help.
> > >
> > > On Thu, Jun 20, 2024 at 6:16 PM Oleksandr Tkachuk <sasha547...@gmail.com> 
> > > wrote:
> > > >
> > > > FYI: There is a solution in the last paragraph, but I still ran your
> > > > tests, since the solution was found by "Cut and Try"  and there is no
> > > > deep understanding.
> > > >
> > > > >I wonder what would happen if you fully bypassed the query cache 
> > > > >(i.e., `q={!cache=false}product_type:"1"`)?
> > > > It does not help, there is not even one millisecond of difference in 
> > > > both cases.
> > > >
> > > > >I recall that previously you had a very large number of dynamic 
> > > > >fields. Is that the case here as well? And if so, are the dynamic 
> > > > >fields mostly stored? docValues?
> > > > This is another collection, I’ll get to the one with many many fields 
> > > > later :))
> > > > If this is the ~correct way to count the number of fields, then this
> > > > collection has the following number of fields:
> > > > curl -s "http://localhost:8983/solr/XXX/admin/luke?numTerms=0" | grep '"type"' | wc -l
> > > > 121
> > > > Of these, 88 have docvalues enabled and 33 stored.
> > > >
> > > > As for the two fields used in query, here's how they are defined in the 
> > > > schema.
> > > >   <field name="product_id" type="plong" indexed="true" stored="true"/>
> > > >   <field name="product_type" type="pint" indexed="true" stored="false"/>
> > > >   <fieldType name="pint" class="solr.IntPointField" docValues="true"/>
> > > >   <fieldType name="plong" class="solr.LongPointField" docValues="true"/>
> > > >
> > > > Changing fl= to something like a string field with stored=true without
> > > > docvalues results in zero changes.
> > > > I also tried this simple query on string type fields (copying the
> > > > field) and got the same result. I also tried it on fields where the
> > > > cardinality was different - the spread was not 150 times, but also
> > > > often noticeable. In addition, I still do not fully understand the
> > > > logic of this behavior
> > > > ("product_type":["3",1069282,"2",710042,"1",13702]) if I do:
> > > > 1) q=product_type:"1" rows=50 - qtime 150ms
> > > > 2) q=product_type:"1" rows=51 - qtime 0ms
> > > > 3) q=product_type:"2" rows=50 - qtime 3ms
> > > > 4) q=product_type:"2" rows=51 - qtime 0ms
> > > > 5) q=product_type:"3" rows=50 - qtime 1ms
> > > > 6) q=product_type:"3" rows=51 - qtime 0ms
> > > > I checked on other fields and get the same behavior - the fewer
> > > > documents contain a given value, the slower the query becomes.
> > > > If I can provide any more information, I will be glad.
> > > >
> > > > The problem was solved by turning off enableLazyFieldLoading. I am
> > > > very surprised that this functionality continues to work when document
> > > > cache is disabled and I thought that this parameter was intended only
> > > > for it. In addition, we received an improvement in avg and 95% on many
> > > > other types of queries, as well as some reduction in CPU load. Are
> > > > there any consequences or disadvantages of such a decision? If not,
> > > > then perhaps it is worth paying attention to this problem.
> > > >
> > > > On Thu, Jun 20, 2024 at 10:13 PM Michael Gibney
> > > > <mich...@michaelgibney.net> wrote:
> > > > >
> > > > > I've been unable to reproduce anything like this behavior. If you're
> > > > > really getting queryResultCache hits for these, then the field
> > > > > type/etc. of the field you're querying on shouldn't make a difference.
> > > > > The type/etc. of the return field (product_id) would be more likely to
> > > > > matter. I wonder what would happen if you fully bypassed the query
> > > > > cache (i.e., `q={!cache=false}product_type:"1"`)?
> > > > >
> > > > > I recall that previously you had a very large number of dynamic
> > > > > fields. Is that the case here as well? And if so, are the dynamic
> > > > > fields mostly stored? docValues?
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Jun 14, 2024 at 7:29 AM Oleksandr Tkachuk 
> > > > > <sasha547...@gmail.com> wrote:
> > > > > >
> > > > > > Initial data:
> > > > > > Doc count: 1793026
> > > > > > Field: "product_type", point int, indexed true, stored false,
> > > > > > docvalues true. Values:
> > > > > >  "facet_fields":{
> > > > > >       "product_type":["3",1069282,"2",710042,"1",13702]
> > > > > >     },
> > > > > > Single shard, single instance.
> > > > > >
> > > > > > # ./hey_linux_amd64 -n 10000 -c 10 -T "application/json"
> > > > > > 'http://localhost:8983/solr/XXX/select?fl=product_id&wt=json&q=product_type:"1"&start=0&rows=51'
> > > > > > Summary:
> > > > > >   Total:        0.6374 secs
> > > > > >   Slowest:      0.0043 secs
> > > > > >   Fastest:      0.0003 secs
> > > > > >   Average:      0.0006 secs
> > > > > >   Requests/sec: 15688.5755
> > > > > >
> > > > > > # ./hey_linux_amd64 -n 10000 -c 10 -T "application/json"
> > > > > > 'http://localhost:8983/solr/XXX/select?fl=product_id&wt=json&q=product_type:"1"&start=0&rows=50'
> > > > > > Summary:
> > > > > >   Total:        101.3246 secs
> > > > > >   Slowest:      0.2048 secs
> > > > > >   Fastest:      0.0564 secs
> > > > > >   Average:      0.1007 secs
> > > > > >   Requests/sec: 98.6927
> > > > > >
> > > > > >
> > > > > > 1) I've already played with queryResultWindowSize and
> > > > > > queryResultMaxDocsCached, setting various high and low values, and
> > > > > > this is probably not what I'm looking for, since it made a
> > > > > > difference of less than a few milliseconds in query performance
> > > > > > 2) Checked on different versions of Solr (9.6.1 and 8.7.0) - no
> > > > > > significant changes
> > > > > > 3) Tried changing the field type to string - zero performance changes
> > > > > > 4) In both cases I see successful lookups in queryResultCache
> > > > > > 5) Enabling documentCache solves the problem in this case (rows<=50),
> > > > > > but introduces many other performance issues, so it doesn't seem like
> > > > > > a viable option.
