Re: BUMP: Atomic updates and POST command?

2018-09-03 Thread Scott Prentice
Thanks, Shawn. That helps with the meaning of the "solr" format. Our needs are pretty basic. We have some upstream processes that crawl the data and generate a JSON feed that works with the default post command. So far this works well and keeps things simple. Thanks! ...scott On 9/1/18

RE: Heap Memory Problem after Upgrading to 7.4.0

2018-09-03 Thread Markus Jelsma
Hello Björn, Take great care, 7.2.1 cannot read an index written by 7.4.0, so you cannot roll back but need to reindex! Andrey Kudryavtsev made a good suggestion in the thread on how to find the culprit, but it will be a tedious task. I have not yet had the time or courage to venture there.

Re: Heap Memory Problem after Upgrading to 7.4.0

2018-09-03 Thread Björn Häuser
Hi, > On 3. Sep 2018, at 22:18, Erick Erickson wrote: > > Reducing to 10 won't be definitive, but if the problem gets better > it'll be a clue. > > How are you committing? Is it just based on the solrconfig settings or > do you have any clients submitting commit commands? Only through the

Re: Heap Memory Problem after Upgrading to 7.4.0

2018-09-03 Thread Björn Häuser
Hi Markus, this reads exactly like what we have. Were you able to figure out anything? Currently thinking about rolling back to 7.2.1. > On 3. Sep 2018, at 21:54, Markus Jelsma wrote: > > Hello, > > Getting an OOM plus the fact you are having a lot of IndexSearcher instances > rings a

Re: Heap Memory Problem after Upgrading to 7.4.0

2018-09-03 Thread Erick Erickson
Reducing to 10 won't be definitive, but if the problem gets better it'll be a clue. How are you committing? Is it just based on the solrconfig settings or do you have any clients submitting commit commands? One fat clue would be if, in your solr logs, you were getting any warnings about "too

RE: Heap Memory Problem after Upgrading to 7.4.0

2018-09-03 Thread Markus Jelsma
Hello, Getting an OOM plus the fact you are having a lot of IndexSearcher instances rings a familiar bell. One of our collections has the same issue [1] when we attempted an upgrade 7.2.1 > 7.3.0. I managed to rule out all our custom Solr code but had to keep our Lucene filters in the schema,

Re: Heap Memory Problem after Upgrading to 7.4.0

2018-09-03 Thread Björn Häuser
Hi Erick, thank you for your answer. Unfortunately I do not have a heap dump from 6.6. > On 3. Sep 2018, at 20:48, Erick Erickson wrote: > > I would expect at least 1 IndexSearcher per replica, how many total > replicas hosted in your JVM? 27 replicas per JVM. > > Plus, if you're actively

Re: Heap Memory Problem after Upgrading to 7.4.0

2018-09-03 Thread Erick Erickson
I would expect at least 1 IndexSearcher per replica, how many total replicas hosted in your JVM? Plus, if you're actively indexing, there may temporarily be 2 IndexSearchers open while the new searcher warms. And there may be quite a few caches, at least queryResultCache and filterCache and

Heap Memory Problem after Upgrading to 7.4.0

2018-09-03 Thread Björn Häuser
Hello, we recently upgraded our SolrCloud (5 nodes, 25 collections, 1 shard each, 4 replicas each) from 6.6.0 to 7.3.0 and shortly after to 7.4.0. We are running Zookeeper 4.1.13. Since the upgrade to 7.3.0 and also 7.4.0 we are encountering heap space exhaustion. After obtaining a heap dump it

Re: How long does a query?q=field1:2312 should cost? exactly hit one document.

2018-09-03 Thread Erick Erickson
My guess is that you're searching un-warmed instances of Solr and are seeing the time it takes to read the index structures into memory the first time. What happens if you turn off indexing and query a number of values (not the same one or you'll hit the queryResultCache). So your first query

Feature Selection and Model Training for Solr LTR

2018-09-03 Thread Zheng Lin Edwin Yeo
Hi, I am in the process of setting up Solr LTR in Solr 7.4.0. I understand that there are different types of models, such as the Linear Model, Multiple Additive Trees Model and Neural Network Model. Does anyone have information on which model is the most suitable to use for the best performance for dealing

Re: Contextual Synonym Filter

2018-09-03 Thread Andrea Gazzarini
Hi Luca, I believe this is not an easy task to do passing through Solr/Lucene internals; did you try to use what Solr offers out of the box? For example, you could define several fields associated where each corresponding field type uses a different synonym set. So you would have * F1 -> FT1

Contextual Synonym Filter

2018-09-03 Thread Vergantini Luca
I need to create a contextual Synonym Filter: I need the Synonym Filter to load a different synonym configuration based on the fq query parameter. I've already modified the SynonymGraphFilterFactory to load from a DB (this is another requirement) but I can't understand how to make the fq

Re: MLT in Cloud Mode - Not Returning Fields?

2018-09-03 Thread Doug Turnbull
Thanks Charlie, those are helpful. I think at this point we will attach a debugger and see what shakes out. Perhaps it's one of these cases you list. Perhaps we're missing something. We'll report back. -Doug On Mon, Sep 3, 2018 at 5:23 AM Charlie Hull wrote: > On 31/08/2018 19:36, Doug

Re: Replacing Double Quotes from a field

2018-09-03 Thread Shawn Heisey
On 9/3/2018 1:51 AM, Gopesh Sharma wrote: I am trying to remove the double quotes from a field and that's why written PatternReplaceCharFilterFactory, but it doesn't seem to be working. When you say it's not working, how precisely are you checking?  If you're looking at the field value in
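For readers following this thread: PatternReplaceCharFilterFactory is configured with a Java regular expression (its pattern attribute) and a replacement string. The sketch below shows only the regex step in plain Java, as a way to check the pattern outside Solr. The class and method names are hypothetical and are not part of Solr. Note also that a char filter acts inside the analysis chain, so it changes what is indexed, not the stored value returned in search results.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Solr class.
class QuoteStripSketch {
    // Matches a single double-quote character.
    private static final Pattern DOUBLE_QUOTE = Pattern.compile("\"");

    // Removes every double quote from the input, leaving all other text unchanged.
    static String stripDoubleQuotes(String input) {
        return DOUBLE_QUOTE.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripDoubleQuotes("say \"hello\""));
    }
}
```

If the pattern behaves as expected here but the field still appears to contain quotes, the difference is likely in how the result is being inspected rather than in the regex itself.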

Re: How long does a query?q=field1:2312 should cost? exactly hit one document.

2018-09-03 Thread zhenyuan wei
Only a termQuery q=field1:2312, no other conditions. I tried debugging, but cannot find out where the main cost is. The debug=timing output looks like: { "responseHeader":{ "zkConnected":true, "status":0, "QTime":157, "params":{ "q":"v00_s:15de21c670ae7c3f6f3f1f37029303c9",

Streaming timeseries() and buckets with no docs

2018-09-03 Thread Jan Høydahl
Hi We have a timeseries expression with gap="+1DAY" and a sum(imps_l) to aggregate sums of an integer for each bucket. Now, some day buckets do not contain any documents at all, and instead of returning a tuple with value 0, it returns a tuple with no entry at all for the sum, see the bucket

Re: java.lang.NullPointerException at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:421)

2018-09-03 Thread asis.ind...@gmail.com
Hi, thanks for posting this. I was getting the same error and had the same stored=false ID.

Re: How long does a query?q=field1:2312 should cost? exactly hit one document.

2018-09-03 Thread Erik Hatcher
Add debug=true and see where the time goes, in which components? Highlighting is my culprit guess. Or faceting? > On Sep 3, 2018, at 07:45, zhenyuan wei wrote: > > Hi , > I am curious “How long does a query q=field1:2312 cost , which > exactly match only one document? ”, Of course we

AW: Solr suggestions: why are exact matches omitted

2018-09-03 Thread Clemens Wyss DEV
> I'm afraid only thorough debugging might answer I'd say debugging is only required if everybody (not just me) expects to get "the exact match" in the spellcheck-response ... If it's nonsense to expect "the exact match" in the spellcheck-response, then it's a feature of spellchecking

How long does a query?q=field1:2312 should cost? exactly hit one document.

2018-09-03 Thread zhenyuan wei
Hi, I am curious how long a query q=field1:2312 that matches exactly one document should take. Of course, this assumes the query is not served from the queryResultCache. In fact my QTime is 150ms+, which is too long.

Re: Solr suggestions: why are exact matches omitted

2018-09-03 Thread Mikhail Khludnev
I'm afraid only thorough debugging might answer. On Mon, Sep 3, 2018 at 1:58 PM Clemens Wyss DEV wrote: > Sorry for not giving up on this issue: > is this "behavior" a feature or a bug? > > -Original Message- > From: Clemens Wyss DEV > Sent: Thursday, 30 August 2018 18:01

Re: Boost only first 10 records

2018-09-03 Thread Mikhail Khludnev
Hello, I hardly follow, but subj sounds like reranking. http://people.apache.org/~mkhl/searchable-solr-guide-7-3/query-re-ranking.html#query-re-ranking On Mon, Sep 3, 2018 at 1:02 PM mama wrote: > Hi > We have requirement to boost only first few records & rest of result should > be as per

AW: Solr suggestions: why are exact matches omitted

2018-09-03 Thread Clemens Wyss DEV
Sorry for not giving up on this issue: is this "behavior" a feature or a bug? -Original Message- From: Clemens Wyss DEV Sent: Thursday, 30 August 2018 18:01 To: 'solr-user@lucene.apache.org' Subject: Solr suggestions: why are exact matches omitted Given the following

Re: Boost only first 10 records

2018-09-03 Thread Rahul Singh
I agree, the two-query solution is the simplest to implement and you have much more control on the UI as well. It seems you want to have a “featured” set of results above and separate from the organic results from the index. You could choose to request only specific fields in the “featured”

Re: Boost only first 10 records

2018-09-03 Thread Emir Arnautović
Hi, The requirement is not 100% clear or logical. If the user selects the filter type:comedy, it does not make sense to show anything else. You might have “Other categories relevant results” and that can be done as a separate query. It seems that you want to prefer comedy, but you have an issue with

Re: Multiple solr instances per host vs Multiple cores in same solr instance

2018-09-03 Thread Bernd Fehling
Yes, that's right, there is no "best" setup at all, only one that gives the most advantage for your requirements. And any setup has some disadvantages. Currently I'm short on time and have to bring our Cloud to production, but a write-up is in the queue, as already done with other developments.

Boost only first 10 records

2018-09-03 Thread mama
Hi, we have a requirement to boost only the first few records; the rest of the results should be as per the search. E.g. if I have books of different genres and a user searches for some book (interested in genre: comedy), then we want to show, say, the first 3 records of genre:comedy and the rest of the results should be of diff

Storing PID below /run

2018-09-03 Thread Andreas Hubold
Hi, we'd like to store the PID file for the Solr service in a directory below the /run directory (CentOS 7.5). I've set "SOLR_PID_DIR=/run/solr" in solr.in.sh. But if /run is mounted as tmpfs, the directory /run/solr will not exist after boot and the pid file cannot be stored in that

Re: Multiple solr instances per host vs Multiple cores in same solr instance

2018-09-03 Thread Toke Eskildsen
On Tue, 2018-08-28 at 09:37 +0200, Bernd Fehling wrote: > Yes, I tested many cases. Erick is absolutely right about the challenge of finding "best" setups. What we can do is gather observations, as you have done, and hope that people with similar use cases find them. With that in mind, have you

Re: MLT in Cloud Mode - Not Returning Fields?

2018-09-03 Thread Charlie Hull
On 31/08/2018 19:36, Doug Turnbull wrote: Hello, We're working on a Solr More Like This project (Solr 6.6.2), using the More Like This searchComponent. What we note is in standalone Solr, when we request MLT using the search component, we get every more like this document fully formed with

Re: Is that a mistake or bug?

2018-09-03 Thread zhenyuan wei
Oh, I feel embarrassed to explain it again; maybe my English is not so good. What I actually mean is: if QueryResult.segmentTerminatedEarly were declared as boolean, not Boolean, in QueryResult. public class QueryResult{ private boolean partialResults * private Boolean

Re: Is that a mistake or bug?

2018-09-03 Thread p.bodnar
Hi, really nope :) Because as MK writes below, result.segmentTerminatedEarly is used as a 3-state variable. The only line that could be improved, is probably replacing "Boolean.FALSE" by simply "false", but that is really a minor thing... Regards PB
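The 3-state point made in this thread can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not Solr's actual QueryResult code: the class TriStateSketch and its toResponse method are hypothetical, and only the field name segmentTerminatedEarly is taken from the thread. A boxed Boolean can be null (never requested), FALSE, or TRUE, whereas a primitive boolean would default to false and could not distinguish "not requested" from "requested and false".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not Solr's actual QueryResult class.
class TriStateSketch {
    // Boxed Boolean: null means "never requested"; FALSE/TRUE mean "requested, with this outcome".
    private Boolean segmentTerminatedEarly;

    void setSegmentTerminatedEarly(Boolean value) {
        this.segmentTerminatedEarly = value;
    }

    // Builds a response map, emitting the flag only when it was explicitly set.
    Map<String, Object> toResponse() {
        Map<String, Object> response = new LinkedHashMap<>();
        if (segmentTerminatedEarly != null) {
            response.put("segmentTerminatedEarly", segmentTerminatedEarly);
        }
        return response;
    }

    public static void main(String[] args) {
        TriStateSketch untouched = new TriStateSketch();
        System.out.println(untouched.toResponse()); // prints {}

        TriStateSketch requested = new TriStateSketch();
        requested.setSegmentTerminatedEarly(Boolean.FALSE);
        System.out.println(requested.toResponse()); // prints {segmentTerminatedEarly=false}
    }
}
```

With a primitive boolean field the null check would not be possible, so the flag would appear as false in every response, which matches the objection raised in the thread.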

Re: Is that a mistake or bug?

2018-09-03 Thread zhenyuan wei
I mean, use terminatedEarly as a primitive boolean type; then there is no need to explicitly assign it Boolean.FALSE, because a primitive boolean's default value is false. Mikhail Khludnev wrote on Mon, Sep 3, 2018 at 4:13 PM: > Nope. In this case, it will respond terminatedEarly=false even if noone > request it. > > On Mon,

Re: change DocExpirationUpdateProcessorFactory deleteByQuery NOW parameter time zone

2018-09-03 Thread Derek Poh
SG refers to Singapore and the time is UTC +8. That means I need to set the P_TradeShowOnlineEndDate date to UTC instead of UTC +8 as a workaround. On 31/8/2018 10:16 PM, Shawn Heisey wrote: On 8/30/2018 7:26 PM, Derek Poh wrote: Can the timezone of the NOW parameter in the
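The workaround described here, storing the end date in UTC rather than UTC +8, amounts to a time zone conversion before indexing. The following is a minimal sketch using java.time, assuming the date is available as a local Singapore wall-clock time. The class name, method name, and sample timestamps are hypothetical; only the UTC +8 offset comes from the thread.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

// Hypothetical helper for illustration only; not part of Solr.
class ExpirationTimeSketch {
    // Interprets a wall-clock time in the given zone and returns it as a UTC ISO-8601 instant.
    static String toUtcInstant(String localDateTime, String zoneId) {
        LocalDateTime local = LocalDateTime.parse(localDateTime);
        Instant instant = local.atZone(ZoneId.of(zoneId)).toInstant();
        return instant.toString();
    }

    public static void main(String[] args) {
        // 18:00 in Singapore (UTC +8) is 10:00 UTC on the same day.
        System.out.println(toUtcInstant("2018-09-03T18:00:00", "Asia/Singapore"));
    }
}
```

Note that times early in the Singapore day fall on the previous calendar day in UTC, so the date portion can change as well as the hour.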

Re: Is that a mistake or bug?

2018-09-03 Thread Mikhail Khludnev
Nope. In this case, it will respond terminatedEarly=false even if no one requests it. On Mon, Sep 3, 2018 at 9:09 AM zhenyuan wei wrote: > Yeah,got it~. So the QueryResult.segmentTerminatedEarly maybe a boolean, > instead of Boolean, is better, right? > > Mikhail Khludnev wrote on Mon, Sep 3, 2018 at 1:36 PM:

Replacing Double Quotes from a field

2018-09-03 Thread Gopesh Sharma
Hello All, I am trying to remove the double quotes from a field, which is why I have written a PatternReplaceCharFilterFactory, but it doesn't seem to be working. I also tried to replace it at query time, but Solr throws an error that the entity must be closed with > select replace(t.name, '\"',

Re: Is that a mistake or bug?

2018-09-03 Thread zhenyuan wei
Yeah, got it. So it would be better for QueryResult.segmentTerminatedEarly to be a boolean instead of a Boolean, right? Mikhail Khludnev wrote on Mon, Sep 3, 2018 at 1:36 PM: > It's neither, it's on purpose. By default result.segmentTerminatedEarly is > null, hence it doesn't appear in result output. see >