Re: unified highlighter performance in solr 8.5.1

2020-07-03 Thread David Smiley
I doubt that WORD mode is impacted much by hl.fragsizeIsMinimum in terms of quality of the highlight since there are vastly more breaks to pick from. I think that setting is more useful in SENTENCE mode if you can stand the perf hit. If you agree, then why not just let this one default to "true"?

Re: Out of memory errors with Spatial indexing

2020-07-03 Thread David Smiley
Hi Sunil, Your shape is at a pole, and I'm aware of a bug causing an exponential explosion of needed grid squares when you have polygons super-close to the pole. Might you try S2PrefixTree instead? I forget if this would fix it or not by itself. For indexing non-point data, I recommend

Re: Time-out errors while indexing (Solr 7.7.1)

2020-07-03 Thread Erick Erickson
Oops, I transposed that. If your index is a terabyte and your RAM is 128M, _that’s_ a red flag. > On Jul 3, 2020, at 5:53 PM, Erick Erickson wrote: > > You haven’t said how many _shards_ are present. Nor how many replicas of the > collection you’re hosting per physical machine. Nor how large

Re: Time-out errors while indexing (Solr 7.7.1)

2020-07-03 Thread Erick Erickson
You haven’t said how many _shards_ are present. Nor how many replicas of the collection you’re hosting per physical machine. Nor how large the indexes are on disk. Those are the numbers that count. The latter is somewhat fuzzy, but if your aggregate index size on a machine with, say, 128G of

Re: Time-out errors while indexing (Solr 7.7.1)

2020-07-03 Thread Mad have
Hi Eric, The collection has almost 13 billion documents, each around 5 KB in size, and all of the roughly 150 columns are indexed. Do you think the number of documents in the collection is causing this issue? Appreciate your response. Regards, Madhava Sent from my iPhone > On 3 Jul 2020,
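
For a sense of scale, the figures quoted in this message can be multiplied out. This is only a back-of-the-envelope sketch: it assumes decimal units (1 KB = 1000 bytes) and treats 5 KB as an average, and it estimates the raw document volume, not the on-disk index size, which depends on the schema and is not given in the thread.

```python
# Figures taken from the message above; units are an assumption (decimal KB/TB).
DOC_COUNT = 13_000_000_000      # "almost 13 billion documents"
AVG_DOC_BYTES = 5 * 1000        # "around 5 KB" per document

raw_bytes = DOC_COUNT * AVG_DOC_BYTES
raw_terabytes = raw_bytes / 1000**4

# Raw source data only; the Lucene index may be smaller or larger than this.
print(f"{raw_terabytes:.0f} TB of raw document data")
```

Under these assumptions the raw data alone comes to about 65 TB, which is the kind of number to weigh against shard count and per-machine RAM.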

Re: unified highlighter performance in solr 8.5.1

2020-07-03 Thread Nándor Mátravölgyi
Since the issue seems to be affecting the highlighter differently based on which mode it is using, having different defaults for the modes could be explored. WORD may have the new defaults as it has little effect on performance and it creates nicer highlights. SENTENCE should have the defaults

Re: unified highlighter performance in solr 8.5.1

2020-07-03 Thread David Smiley
I think we should flip the default of hl.fragsizeIsMinimum to be 'true', thus have the behavior close to what preceded 8.5. (a) it was very recently (<= 8.4) the previous behavior and so may require less tuning for users in 8.6 henceforth (b) it's significantly faster for long text -- seems to be
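
Because the default discussed here differs between releases, a client can set the parameter explicitly instead of relying on it. The following is a minimal sketch that only builds the request parameters; the field name `body_txt`, the query string, and the helper `highlight_params` are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def highlight_params(query, field, break_type="SENTENCE", fragsize_is_minimum=True):
    """Build unified-highlighter parameters with hl.fragsizeIsMinimum set explicitly."""
    return {
        "q": query,
        "hl": "true",
        "hl.method": "unified",
        "hl.fl": field,                 # hypothetical field name supplied by the caller
        "hl.bs.type": break_type,       # e.g. SENTENCE or WORD
        "hl.fragsizeIsMinimum": str(fragsize_is_minimum).lower(),
    }


params = highlight_params("solr", "body_txt")
query_string = urlencode(params)
print("/select?" + query_string)
```

Pinning the value this way keeps highlighting behaviour the same across an upgrade, whichever default a given release ships with.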

Re: Solr Float/Double multivalues fields

2020-07-03 Thread Thomas Corthals
On Fri, 3 Jul 2020 at 14:11, Bram Van Dam wrote: > On 03/07/2020 09:50, Thomas Corthals wrote: > > I think this should go in the ref guide. If your product depends on this > > behaviour, you want reassurance that it isn't going to change in the next > > release. Not everyone will go looking

Re: Solr Float/Double multivalues fields

2020-07-03 Thread Bram Van Dam
On 03/07/2020 09:50, Thomas Corthals wrote: > I think this should go in the ref guide. If your product depends on this > behaviour, you want reassurance that it isn't going to change in the next > release. Not everyone will go looking through the javadoc to see if this is > implied. This is in

Re: Adding solr-core via maven fails

2020-07-03 Thread Erick Erickson
If you feel strongly that Solr needs to keep the Maven bits up to date, you can volunteer to help maintain them; Solr is open source, after all. > On Jul 3, 2020, at 12:08 AM, Ali Akhtar wrote: > > I had to add an additional repository to get the failing dependency to > resolve: > > resolvers

Out of memory errors with Spatial indexing

2020-07-03 Thread Sunil Varma
We are seeing OOM errors when trying to index some spatial data. I believe the data itself might not be valid but it shouldn't cause the Server to crash. We see this on both Solr 7.6 and Solr 8. Below is the input that is causing the error. { "id": "bad_data_1", "spatialwkt_srpt": "LINESTRING

Re: Time-out errors while indexing (Solr 7.7.1)

2020-07-03 Thread Erick Erickson
If you’re seeing low CPU utilization at the same time, you probably just have too much data on too little hardware. Check your swapping: how much of your I/O is just because Lucene can’t hold all the parts of the index it needs in memory at once? Lucene uses MMapDirectory to hold the index and you

Re: Solr Float/Double multivalues fields

2020-07-03 Thread Toke Eskildsen
On Fri, 2020-07-03 at 10:00 +0200, Vincenzo D'Amore wrote: > Hi Erick, not sure I got. > Does this mean that the order of values within a multivalued field: > - docValues=true the result will be both re-ordered and deduplicated. > - docValues=false the result order is guaranteed to be maintained

Re: Time-out errors while indexing (Solr 7.7.1)

2020-07-03 Thread Toke Eskildsen
On Thu, 2020-07-02 at 11:16 +, Kommu, Vinodh K. wrote: > We are performing QA performance testing on couple of collections > which holds 2 billion and 3.5 billion docs respectively. How many shards? > 1. Our performance team noticed that read operations are pretty > more than write

Re: ***URGENT***Re: Questions about Solr Search

2020-07-03 Thread Dave
Seriously. Doug answered all of your questions. > On Jul 3, 2020, at 6:12 AM, Atri Sharma wrote: > > Please do not cross post. I believe your questions were already answered? > >> On Fri, Jul 3, 2020 at 3:08 PM Gautam K wrote: >> >> Since it's a bit of an urgent request so if could please

Re: ***URGENT***Re: Questions about Solr Search

2020-07-03 Thread Atri Sharma
Please do not cross post. I believe your questions were already answered? On Fri, Jul 3, 2020 at 3:08 PM Gautam K wrote: > > Since it's a bit of an urgent request so if could please help me on this by > today it will be highly appreciated. > > Thanks & Regards, > Gautam Kanaujia > > On Thu, Jul

Re: How to use two search string in a single solr query

2020-07-03 Thread Tushar Arora
Hi, Thanks Erick and Walter for your response. Solr version used: 6.5.0. To elaborate on the issue: Case 1: Search string: Industrial Electric Oven, Results=945. Case 2: Search string: Dell laptop bags, Results=992. In both of the above cases, mm plays its role. (match

Re: Solr Float/Double multivalues fields

2020-07-03 Thread Vincenzo D'Amore
Hi Erick, I'm not sure I understood. Does this mean that, for the order of values within a multivalued field: - with docValues=true, the result will be both re-ordered and deduplicated; - with docValues=false, the order of values is guaranteed to be maintained in insertion order. Is this correct? On Thu, Jul 2,
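
The two behaviours named in this question can be modelled in a few lines. This is plain Python and does not call Solr; it only illustrates what "re-ordered and deduplicated" versus "insertion order" would mean for a list of values. Whether a particular field type actually behaves this way is exactly what the question asks, and the excerpt does not settle it. The function names are made up for the illustration.

```python
def as_inserted(values):
    """Model of the 'insertion order maintained' case: values come back unchanged."""
    return list(values)


def as_sorted_set(values):
    """Model of the 're-ordered and deduplicated' case: sorted, duplicates dropped."""
    return sorted(set(values))


values = [3.5, 1.0, 3.5, 2.25]
print(as_inserted(values))    # [3.5, 1.0, 3.5, 2.25]
print(as_sorted_set(values))  # [1.0, 2.25, 3.5]
```

If an application relies on duplicates or on original ordering, the difference between these two outputs is what would be lost.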

Changing Response for Group Query - Custom Request Handler

2020-07-03 Thread dnz
Dear Community, I am currently working on a Solr Custom Plugin, which - for a group query - adds both total matches and number of groups to the response and also keeps the response format as if it is not a group query. One additional requirement is that numFound should contain the number of

Re: Solr Float/Double multivalues fields

2020-07-03 Thread Thomas Corthals
I think this should go in the ref guide. If your product depends on this behaviour, you want reassurance that it isn't going to change in the next release. Not everyone will go looking through the javadoc to see if this is implied. Typically it'll either be something like "are always returned in

RE: Time-out errors while indexing (Solr 7.7.1)

2020-07-03 Thread Kommu, Vinodh K.
Does anyone have any thoughts or suggestions on this issue? Thanks & Regards, Vinodh From: Kommu, Vinodh K. Sent: Thursday, July 2, 2020 4:46 PM To: solr-user@lucene.apache.org Subject: Time-out errors while indexing (Solr 7.7.1) Hi, We are performing QA performance testing on couple of collections