On Thu, Feb 15, 2024 at 12:38 AM Arun Sudhir <[email protected]> wrote:
> Thanks everyone for the tips!
> @Mikhail
> I tried your suggestion and it seems to work. We do not run into the cold
> boot long query time for the particular facet query anymore. Had a
> follow-up question: Will adding three or four facet queries have
> performance implications for the searcher?

I don't think they will hurt, but who knows.

> I checked the memory usage and
> there was no spike from this particular autowarm query. Are
> docValues generally memory-hungry?

Not at all. They are memory mapped (or should be), so they don't take much
heap, apart from the GlobalOrdinals map, which is built lazily and takes
some heap.

> I would think they would only have a
> smaller memory footprint compared to the segments themselves.
>
> Thanks
> Arun
>
> On Wed, Feb 14, 2024 at 12:52 PM Arun Sudhir <[email protected]> wrote:
>
> > @Rahul Surendran <[email protected]> facet indeed is where the time
> > is spent:
> >
> > "process": {
> >   "time": 1560,
> >   "query": {
> >     "time": 10
> >   },
> >   "facet": {
> >     "time": 1545
> >   },
> >
> > On Tue, Feb 13, 2024, 6:03 AM Rahul Goswami <[email protected]> wrote:
> >
> >> Can you pass debug=true with your query to find out which phase (query
> >> or faceting) takes more time? This is to eliminate chasing the wrong
> >> symptom to optimize for.
> >>
> >> -Rahul
> >>
> >> On Tue, Feb 13, 2024 at 3:48 AM Mikhail Khludnev <[email protected]> wrote:
> >>
> >> > Hello, Arun.
> >> > Why don't you warm a new searcher with a query listener?
> >> >
> >> > On Tue, Feb 13, 2024 at 3:18 AM Arun Sudhir <[email protected]> wrote:
> >> >
> >> > > Hello,
> >> > > We use Solr for our search needs and we have documents indexed on a
> >> > > core in multiple machines. Over time, the index on some machines has
> >> > > grown from 30 GB to 60 GB and now to a giant 133 GB, while others are
> >> > > still hovering around 80 GB and some are still under 30 GB.
> >> > > We manually control which documents go into which machine and do not
> >> > > use SolrCloud.
> >> > >
> >> > > We have a field in our index which is a docValue. What we have
> >> > > noticed is that facet queries on this field take around 10 seconds
> >> > > for roughly the first call every minute or so on the huge server
> >> > > machines which have ~130 GB index size. We commit every minute on
> >> > > our servers as well. We have ensured that the machines do not starve
> >> > > on RAM, and for the ones which have 130 GB of index, we have 256 GB
> >> > > of RAM. So the segments are all in memory all the time.
> >> > >
> >> > > Still, we see every call made after a minute or so takes 10 seconds
> >> > > on the big shards with index size close to 130 GB, 6 seconds on the
> >> > > shards that are 80 GB, and less than 4 seconds on the normal shards
> >> > > whose size is less than 30 GB.
> >> > >
> >> > > How can we optimize and get rid of this latency? We have tried using
> >> > > DocValuesFormat=Direct, increasing the number of facet.threads,
> >> > > increasing the heap size etc. Is there anything else we can do to
> >> > > get the performance of facet queries on the large shards to under
> >> > > 2 seconds?
> >> > >
> >> > > Thanks
> >> > > Arun
> >> > >
> >> >
> >> > --
> >> > Sincerely yours
> >> > Mikhail Khludnev
> >> >
> >>
> >
>
> --
> [email protected]
>

--
Sincerely yours
Mikhail Khludnev
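
For anyone who wants to repeat the debug=true check Rahul suggested, here is a minimal Python sketch of how the per-phase times could be read out of the response. It is an illustration under assumptions, not something posted by anyone in this thread: the core URL and the field name `category` are hypothetical placeholders, and the code assumes the "process" block quoted above sits under `debug.timing` in the JSON response. The sample numbers are the ones Arun reported.

```python
import json
from urllib.parse import urlencode


def build_debug_url(base_url, field):
    """Build a facet query URL with debug=true (base_url and field are placeholders)."""
    params = {
        "q": "*:*",
        "rows": 0,             # only the facet counts matter here, not the documents
        "facet": "true",
        "facet.field": field,  # hypothetical field name
        "debug": "true",       # asks Solr to include timing information
        "wt": "json",
    }
    return base_url + "/select?" + urlencode(params)


def process_timings(response):
    """Return {component: milliseconds} from the 'process' timing block.

    Assumes the block is located at response["debug"]["timing"]["process"].
    """
    process = response["debug"]["timing"]["process"]
    # The top-level "time" entry is a plain number; components are nested dicts.
    return {name: value["time"] for name, value in process.items()
            if isinstance(value, dict)}


# Sample shaped like the excerpt quoted in the thread (values from Arun's message).
sample = json.loads("""
{"debug": {"timing": {"process": {
    "time": 1560,
    "query": {"time": 10},
    "facet": {"time": 1545}
}}}}
""")

timings = process_timings(sample)
slowest = max(timings, key=timings.get)
print(slowest, timings[slowest])  # prints: facet 1545
```

Against a live server, the URL from `build_debug_url("http://localhost:8983/solr/mycore", "category")` would be fetched and the parsed JSON passed to `process_timings`. Running it just after a commit and again a few seconds later should show whether the first call on a new searcher is the slow one.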
