FYI:
https://issues.apache.org/jira/browse/SOLR-17349
https://github.com/apache/solr/pull/2535

I'm curious whether this helps!
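
For anyone re-running the rows=50 vs rows=51 comparison against a patched build, here is a minimal sketch of a sweep around the threshold. It assumes a local Solr at the same base URL used in this thread. The helper names (`select_url`, `mean_latency`, `sweep`) are made up for illustration, and requests are issued sequentially, so absolute numbers will not match the concurrent `hey` runs quoted below.

```python
import time
import urllib.parse
import urllib.request


def select_url(base, collection, fl, field, value, rows):
    """Build a /select URL of the same shape as the requests in this thread."""
    params = {
        "fl": fl,
        "wt": "json",
        "q": f'{field}:"{value}"',
        "start": 0,
        "rows": rows,
    }
    return f"{base}/{collection}/select?{urllib.parse.urlencode(params)}"


def mean_latency(url, n=200):
    """Average wall-clock seconds per request over n sequential requests."""
    started = time.perf_counter()
    for _ in range(n):
        with urllib.request.urlopen(url) as resp:
            resp.read()
    return (time.perf_counter() - started) / n


def sweep(base, collection, fl, field, value, rows_values=(49, 50, 51, 52), n=200):
    """Map each rows value to its mean latency (requires a running Solr)."""
    return {
        rows: mean_latency(select_url(base, collection, fl, field, value, rows), n)
        for rows in rows_values
    }
```

Calling `sweep("http://localhost:8983/solr", "col", "fld2", "fld2", "v3")` before and after applying the patch should show whether the jump between rows=50 and rows=51 goes away.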

On Fri, Jun 21, 2024 at 3:08 PM Oleksandr Tkachuk <sasha547...@gmail.com> wrote:
>
> >If you're set up to try running a patched version on your data, I'm curious 
> >to know if this will help.
> I'll be happy to do this.
>
> >But maybe it's not so much a magic threshold as arbitrary, and specific to 
> >the data you're evaluating over.
> Well, I tested this case on the collection that you remembered, with a
> large number of fields (564133 at this moment) and more documents
> (~68 million). The number of documents and their content differ
> significantly from the collection I tested previously. I was quickly
> able to reproduce the problem with the magic number 50 (51), although
> it is not as noticeable as in the previous one. I confirmed this with
> hey at any cardinality and any variance (I’m sure that it will
> reproduce with any other benchmark tool). Although QTime showed
> little or no visible difference, the gap grows significantly under
> query load (it is still easier to reproduce when the fl= field has a
> highly skewed value distribution and low cardinality). For example:
> Huge cardinality, values almost completely unique:
> ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> 'http://localhost:8983/solr/col/select?fl=fld1&wt=json&q=fld1:"fld1value"&start=0&rows=50'
>   Slowest:      0.0024 secs
>   Fastest:      0.0009 secs
>   Average:      0.0013 secs
>   Requests/sec: 3768.1874
>
> ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> 'http://localhost:8983/solr/col/select?fl=fld1&wt=json&q=fld1:"fld1value"&start=0&rows=51'
>   Slowest:      0.0018 secs
>   Fastest:      0.0007 secs
>   Average:      0.0009 secs
>   Requests/sec: 5620.4994
>
> Just 1.5x diff
>
>
> "fld2":["v1",30501964,"v2",4202177,"v3",210886] :
> ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v3"&start=0&rows=50'
>   Slowest:      0.1198 secs
>   Fastest:      0.0013 secs
>   Average:      0.0019 secs
>   Requests/sec: 2641.0227
>
> ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v3"&start=0&rows=51'
>   Slowest:      0.0051 secs
>   Fastest:      0.0003 secs
>   Average:      0.0003 secs
>   Requests/sec: 14610.4688
>
> ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v1"&start=0&rows=50'
>   Slowest:      0.0059 secs
>   Fastest:      0.0008 secs
>   Average:      0.0010 secs
>   Requests/sec: 4795.5539
>
> ./hey_linux_amd64 -n 10000 -c 5 -T "application/json"
> 'http://localhost:8983/solr/col/select?fl=fld2&wt=json&q=fld2:"v1"&start=0&rows=51'
>   Slowest:      0.0010 secs
>   Fastest:      0.0003 secs
>   Average:      0.0003 secs
>   Requests/sec: 14726.7978
>
> 4-6x diff
>
> On Fri, Jun 21, 2024 at 4:59 PM Michael Gibney
> <mich...@michaelgibney.net> wrote:
> >
> > Interesting! If turning off lazy field loading helps, I think I have a
> > trivial patch that may fix this (i.e. without requiring the workaround
> > of disabling lazy field loading -- which, as you say, makes no sense
> > to have in effect without the documentCache). The only thing that had
> > been stopping me from suggesting this patch right off the bat was the
> > "magic" threshold of 50, which I couldn't explain at all. But maybe
> > it's not so much a magic threshold as arbitrary, and specific to the
> > data you're evaluating over. I'll open an issue/PR more narrowly
> > scoped to the change. I'd say you could open the issue, except I still
> > don't fully understand the connection between the change I'm
> > considering and the behavior you're seeing -- just that they seem very
> > likely to be connected. If you're set up to try running a patched
> > version on your data, I'm curious to know if this will help.
> >
> > On Thu, Jun 20, 2024 at 6:16 PM Oleksandr Tkachuk <sasha547...@gmail.com> 
> > wrote:
> > >
> > > > FYI: There is a solution in the last paragraph, but I still ran your
> > > > tests, since the solution was found by trial and error and I have
> > > > no deep understanding of why it works.
> > >
> > > > >I wonder what would happen if you fully bypassed the query cache (i.e.,
> > > > >`q={!cache=false}product_type:"1"`)?
> > > > It does not help; there is not even one millisecond of difference in
> > > > either case.
> > >
> > > >I recall that previously you had a very large number of dynamic fields. 
> > > >Is that the case here as well? And if so, are the dynamic fields mostly 
> > > >stored? docValues?
> > > This is another collection, I’ll get to the one with many many fields 
> > > later :))
> > > If this is the ~correct way to count the number of fields, then this
> > > collection has the following number of fields:
> > > > curl -s "http://localhost:8983/solr/XXX/admin/luke?numTerms=0" | grep
> > > '"type"' | wc -l
> > > 121
> > > > Of these, 88 have docvalues enabled and 33 are stored.
> > >
> > > As for the two fields used in query, here's how they are defined in the 
> > > schema.
> > >   <field name="product_id" type="plong" indexed="true" stored="true"/>
> > >   <field name="product_type" type="pint" indexed="true" stored="false"/>
> > >   <fieldType name="pint" class="solr.IntPointField" docValues="true"/>
> > >   <fieldType name="plong" class="solr.LongPointField" docValues="true"/>
> > >
> > > > Changing fl= to something like a string field with stored=true and no
> > > > docvalues changes nothing.
> > > > I also tried this simple query on string fields (copying the
> > > > field) and got the same result. I also tried it on fields with
> > > > different cardinality: the gap was not 150x, but it was often
> > > > still noticeable. In addition, I still do not fully understand the
> > > > logic of this behavior. Given
> > > > "product_type":["3",1069282,"2",710042,"1",13702], I get:
> > > 1) q=product_type:"1" rows=50 - qtime 150ms
> > > 2) q=product_type:"1" rows=51 - qtime 0ms
> > > 3) q=product_type:"2" rows=50 - qtime 3ms
> > > 4) q=product_type:"2" rows=51 - qtime 0ms
> > > 5) q=product_type:"3" rows=50 - qtime 1ms
> > > 6) q=product_type:"3" rows=51 - qtime 0ms
> > > I checked on other fields and get the same behavior - the fewer
> > > documents contain a given value, the slower the query becomes.
> > > If I can provide any more information, I will be glad.
> > >
> > > > The problem was solved by turning off enableLazyFieldLoading. I am
> > > > very surprised that this functionality continues to work when the
> > > > document cache is disabled; I thought this parameter was intended
> > > > only for it. In addition, we saw an improvement in average and 95th
> > > > percentile latency on many other types of queries, as well as some
> > > > reduction in CPU load. Are there any consequences or disadvantages
> > > > to this change? If not, then perhaps this problem deserves attention.
> > >
> > > On Thu, Jun 20, 2024 at 10:13 PM Michael Gibney
> > > <mich...@michaelgibney.net> wrote:
> > > >
> > > > > I've been unable to reproduce anything like this behavior. If you're
> > > > > really getting queryResultCache hits for these, then the field
> > > > > type/etc. of the field you're querying on shouldn't make a difference.
> > > > > The type/etc. of the return field (product_id) would be more likely
> > > > > to matter. I wonder what would happen if you fully bypassed the query
> > > > > cache (i.e., `q={!cache=false}product_type:"1"`)?
> > > >
> > > > I recall that previously you had a very large number of dynamic
> > > > fields. Is that the case here as well? And if so, are the dynamic
> > > > fields mostly stored? docValues?
> > > >
> > > >
> > > >
> > > > On Fri, Jun 14, 2024 at 7:29 AM Oleksandr Tkachuk 
> > > > <sasha547...@gmail.com> wrote:
> > > > >
> > > > > Initial data:
> > > > > Doc count: 1793026
> > > > > Field: "product_type", point int, indexed true, stored false,
> > > > > docvalues true. Values:
> > > > >  "facet_fields":{
> > > > >       "product_type":["3",1069282,"2",710042,"1",13702]
> > > > >     },
> > > > > Single shard, single instance.
> > > > >
> > > > > # ./hey_linux_amd64 -n 10000 -c 10 -T "application/json"
> > > > > 'http://localhost:8983/solr/XXX/select?fl=product_id&wt=json&q=product_type:"1"&start=0&rows=51'
> > > > > Summary:
> > > > >   Total:        0.6374 secs
> > > > >   Slowest:      0.0043 secs
> > > > >   Fastest:      0.0003 secs
> > > > >   Average:      0.0006 secs
> > > > >   Requests/sec: 15688.5755
> > > > >
> > > > > # ./hey_linux_amd64 -n 10000 -c 10 -T "application/json"
> > > > > 'http://localhost:8983/solr/XXX/select?fl=product_id&wt=json&q=product_type:"1"&start=0&rows=50'
> > > > > Summary:
> > > > >   Total:        101.3246 secs
> > > > >   Slowest:      0.2048 secs
> > > > >   Fastest:      0.0564 secs
> > > > >   Average:      0.1007 secs
> > > > >   Requests/sec: 98.6927
> > > > >
> > > > >
> > > > > > 1) I've already played with queryResultWindowSize and
> > > > > > queryResultMaxDocsCached, setting various high and low values. This
> > > > > > is probably not what I'm looking for, since it made a difference of
> > > > > > less than a few milliseconds in query performance
> > > > > > 2) Checked on different versions of Solr (9.6.1 and 8.7.0): no
> > > > > > significant changes
> > > > > > 3) Tried changing the field type to string: zero performance change
> > > > > 4) In both cases I see successful lookups in queryResultCache
> > > > > 5) Enabling documentCache solves the problem in this case (rows<=50),
> > > > > but introduces many other performance issues so it doesn't seem like a
> > > > > viable option.
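
To put the runs in this thread side by side, here is a small sketch that computes the throughput ratio between rows=51 and rows=50 from the Requests/sec figures reported above. The numbers are copied from the messages. The `speedup` helper is only illustrative, and the ratios are not strictly comparable across runs, since the first benchmark used `-c 10` on a different collection than the later `-c 5` runs.

```python
# Requests/sec as reported by hey in this thread: (rows=50, rows=51).
reported = {
    'fld1:"fld1value"': (3768.1874, 5620.4994),
    'fld2:"v3"': (2641.0227, 14610.4688),
    'fld2:"v1"': (4795.5539, 14726.7978),
    'product_type:"1"': (98.6927, 15688.5755),
}


def speedup(rows50_rps, rows51_rps):
    """Throughput of the rows=51 run relative to the rows=50 run."""
    return rows51_rps / rows50_rps


for query, (r50, r51) in reported.items():
    print(f"{query}: rows=51 has {speedup(r50, r51):.1f}x the throughput of rows=50")
```

Comparing Requests/sec rather than the rounded Average column avoids most of the rounding error in sub-millisecond averages.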
