Hello Erick,

I'd like to start working on Solr bugs and would appreciate it if you or
someone else could assign me some minor ones.

Warm Regards,

On Wed, Jan 30, 2019 at 8:53 AM Erick Erickson <erickerick...@gmail.com> wrote:

> My suggestion is "don't do that" ;).
> Ok, seriously. Conceptually what you have is an N-dimensional matrix.
> Each "dimension" is one of your pivot fields, with one cell for each
> unique value in the field. So the size is
> (cardinality of field 1) x (cardinality of field 2) x (cardinality of
> field 3) ...
> To make matters worse, the results from each shard need to be
> aggregated, so you're correct that you're shoving potentially a _very_
> large set of data across your network that then has to be sorted into
> the final packet.
> You don't indicate that you have OOM errors, so what I suspect is
> happening is you're in "GC hell". Each GC cycle recovers just enough
> memory to continue for a very short time, then stops for another GC
> cycle. Rinse, repeat. Timeout.
> Some more concrete suggestions:
> 1> You can use the "Luke request handler" to find the cardinality of
>      the fields and then have a blacklist of fields, so you wind up
>      rejecting these queries up front.
> 2> Consider the streaming capabilities. "Rollup" can be used for
>      high cardinality fields.
>      See: https://lucene.apache.org/solr/guide/6_6/stream-decorators.html
>      NOTE:
>      "Facet" streams push the faceting down to the replicas, which you
>      don't want to use in this case as it'll be the same problem. The
>      facet streams are faster when they can be used, but I doubt you
>      can in your case.
>      BTW, as chance would have it, Joel B. just explained this to me ;).
> Best,
> Erick
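
A minimal sketch of the up-front check from suggestion 1>, combined with the matrix-size arithmetic above. It is not Solr code: the function names, the `max_cells` threshold and the sample cardinalities are all hypothetical. It also assumes the Luke handler's JSON response has a "fields" map whose entries carry a "distinct" term count, and that the numbers are per core, so in SolrCloud they would be per shard rather than per collection.

```python
from math import prod


def distinct_counts_from_luke(luke_json):
    """Extract {field: distinct term count} from a parsed Luke response.

    Assumes a "fields" map whose entries have a "distinct" key;
    fields without that key are skipped.
    """
    fields = luke_json.get("fields", {})
    return {
        name: info["distinct"]
        for name, info in fields.items()
        if "distinct" in info
    }


def estimated_pivot_cells(cardinalities):
    """Worst-case pivot size: the product of the field cardinalities."""
    return prod(cardinalities)


def should_reject_pivot(pivot_fields, distinct_counts, max_cells=1_000_000):
    """Return True when the worst-case pivot matrix exceeds max_cells.

    Fields with unknown cardinality are rejected conservatively.
    max_cells is an arbitrary, hypothetical limit.
    """
    counts = []
    for field in pivot_fields:
        if field not in distinct_counts:
            return True
        counts.append(distinct_counts[field])
    return estimated_pivot_cells(counts) > max_cells


# Hypothetical cardinalities, for illustration only.
sample_luke = {
    "fields": {
        "FIELD_ObjectId": {"distinct": 50_000},
        "FIELD_MailId": {"distinct": 20_000},
        "ActionType": {"distinct": 5},
    }
}
counts = distinct_counts_from_luke(sample_luke)
print(should_reject_pivot(["FIELD_ObjectId", "FIELD_MailId"], counts))
```

With these made-up numbers the two-field pivot has a worst case of 50,000 x 20,000 cells, so the check would reject it before the query reaches Solr.
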
> On Wed, Jan 30, 2019 at 3:41 AM Matteo Diarena <m.diar...@volocom.it>
> wrote:
> >
> > Dear all,
> > we have a solrcloud cluster with the following features:
> >                 - 3 zookeeper nodes
> >                 - 4 solr nodes with:
> >                                - 4 CPU
> >                                - 16GB RAM
> >
> > Each solr instance is configured as follows:
> > SOLR_JAVA_MEM="-Xms2g -Xmx8g"
> > SOLR_OPTS="$SOLR_OPTS -Dlucene.cms.override_spins=false -Dlucene.cms.override_core_count=4"
> >
> > On the cluster we created a collection with 5 shards, each with 2
> > replicas, for a total of 10 replicas.
> >
> > The full index size is less than 2 GB and under normal usage the used
> > heap space is between 200MB and 500MB.
> >
> > Unfortunately, if we try to perform a query like this:
> >
> > .../select?q=*:*&fq=ActionType:MAILOPENING&facet=true&rows=0&facet.pivot=FIELD_ObjectId,FIELD_MailId&f.FIELD_ObjectId.facet.pivot.mincount=0&f.FIELD_ObjectId.facet.limit=-1&f.FIELD_ObjectId.facet.pivot.mincount=0&f.FIELD_ObjectId.facet.limit=-1
> >
> > where FIELD_ObjectId and FIELD_MailId are high cardinality fields, all
> > the heap space is used and the entire solr cluster becomes really slow
> > and unresponsive.
> > The solr instance is not killed and the heap space is never released,
> > so the only way to get the cluster up again is to restart all the solr
> > instances.
> >
> > I know that the problem is the wrong query, but I'd like to know how I
> > can avoid this kind of problem.
> > Is there a way to limit the memory usage during query execution, to
> > prevent a single query from hanging the cluster?
> >
> > I tried to disable all caches and to investigate the heap dump but I
> > didn't manage to find any good solution.
> > I also thought that an issue could be the really big search response
> > exchange between shards. Is it possible?
> >
> > Currently the cluster is not in production, so I can easily perform
> > tests or get all needed data.
> >
> > Any suggestion is welcome.
> >
> > Thanks a lot,
> > Matteo Diarena
> > Direttore Innovazione
> >
> > Volo.com S.r.l. (www.volocom.it - volo...@pec.it)
> > Via Luigi Rizzo, 8/1 - 20151 MILANO
> > Via Leone XIII, 95 - 00165 ROMA
> >
> > Tel +39 02 89453024 / +39 02 89453023
> > Fax +39 02 89453500
> > Mobile +39 345 2129244
> > m.diar...@volocom.it
> >
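
As a rough illustration of Erick's suggestion 2>, the pivot query quoted above could be expressed as a "rollup" over an exported search stream. This is only a sketch under assumptions: the collection name `mycollection` and the base URL are hypothetical, the helper functions are not part of any Solr client, and the `/export` handler used here generally needs the fields to have docValues. Rollup expects the inner stream to be sorted by the `over` fields, which is why the sort is built from the same field list.

```python
from urllib.parse import urlencode


def build_rollup_expr(collection, query, fields):
    """Build a streaming expression counting documents per field tuple.

    The inner search is sorted on the same fields that rollup groups
    over, since rollup relies on that ordering.
    """
    field_list = ",".join(fields)
    sort = ", ".join(f"{field} asc" for field in fields)
    return (
        f'rollup(search({collection}, q="{query}", fl="{field_list}", '
        f'sort="{sort}", qt="/export"), '
        f'over="{field_list}", count(*))'
    )


def build_stream_url(base_url, collection, expr):
    """URL for the collection's /stream handler with the encoded expression."""
    return f"{base_url}/{collection}/stream?" + urlencode({"expr": expr})


# Hypothetical collection name and host, for illustration only.
expr = build_rollup_expr(
    "mycollection",
    "ActionType:MAILOPENING",
    ["FIELD_ObjectId", "FIELD_MailId"],
)
url = build_stream_url("http://localhost:8983/solr", "mycollection", expr)
print(expr)
```

The result comes back as a stream of tuples, one per (FIELD_ObjectId, FIELD_MailId) combination that actually occurs, rather than as a pivot matrix assembled in the heap of the coordinating node.
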
