Using SOLR J 5.5.4 with SOLR 6.5

2017-09-18 Thread Felix Stanley
Hi there, We are planning to use SOLR J 5.5.4 to query SOLR 6.5. The reason is that we have to rely on JDK 1.7 at the client and, as far as I know, SOLR J 6.x.x only supports JDK 1.8. I understood that SOLR J generally maintains backwards/forward compatibility from this article:

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread shamik
I agree, I should have made it clear in my initial post. The reason I thought it was fairly trivial is that the newly introduced collection has only a few hundred documents and is not being used in search yet. Nor is it being indexed at a regular interval. The cache parameters are kept to a minimum as

Dates and DataImportHandler

2017-09-18 Thread Jamie Jackson
Hi folks, My DB server is on America/Chicago time. Solr (on Docker) is running on UTC. Dates coming from my (MariaDB) data source seem to get translated properly into the Solr index without me doing anything special. However when doing delta imports using last_index_time (

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread Erick Erickson
Shamik: bq: The part I'm trying to understand is whether the memory footprint is higher for 6.6... bq: it has two collections, one being introduced with 6.6 upgrade If I'm reading this right, you added another collection to the system as part of the upgrade. Of course it will take more memory.

Re: SOLR and string comparison functions

2017-09-18 Thread Shawn Heisey
On 9/18/2017 4:01 PM, Dariusz Wojtas wrote: > There is one very important requirement. > No matter how many parameters are out there, the total result score cannot > exceed 1 (100%). Right there, you've got an unrealistic requirement. Scores are not absolute, they only have meaning relative to

Re: SOLR and string comparison functions

2017-09-18 Thread Dariusz Wojtas
Hi Emir, I am calculating a "normalized" score, as it will later be used by automatic decisioning processes to find out whether the result found "matches enough". For example, I might create a rule to decide whether the found result's score is higher than 97% (matches); otherwise it is just noise. I've been thinking

Re: SOLR and string comparison functions

2017-09-18 Thread Emir Arnautović
Hi Dariusz, This seems to me like misuse/misunderstanding of Solr. As you probably noticed, the Solr score is not normalised - you cannot compare the scores of two queries and tell whether one result matches its query better than the other. There are some techniques to achieve something close, but that is not that

SOLR and string comparison functions

2017-09-18 Thread Dariusz Wojtas
Hi, I am working on an application that searches for entries that may be queried by multiple parameters. These parameters may be sent to SOLR in different sets, each parameter with its own weight. Values for the example below might be as follows: firstName=John& firstName.weight=0.2&
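The weighting scheme described in this question can be illustrated outside Solr. The following is a minimal sketch, not Solr's scoring and not the poster's implementation: it assumes each field is compared on the client with a string similarity in [0, 1] (here Python's `difflib.SequenceMatcher`), and the function names, the `lastName` field and its weight are hypothetical. Dividing by the sum of the weights that were actually sent is what keeps the total at or below 1.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (1.0 means identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def weighted_score(query: dict, weights: dict, candidate: dict) -> float:
    """Weighted average of per-field similarities.

    Dividing by the sum of the weights that were actually supplied keeps
    the result in [0, 1] no matter which subset of parameters is sent.
    """
    total_weight = sum(weights[field] for field in query)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(
        weights[field] * similarity(value, candidate.get(field, ""))
        for field, value in query.items()
    )
    # min() guards against floating-point rounding pushing the result above 1.
    return min(1.0, weighted_sum / total_weight)


# Hypothetical usage: firstName comes from the question, lastName is made up.
query = {"firstName": "John", "lastName": "Smith"}
weights = {"firstName": 0.2, "lastName": 0.8}
exact = weighted_score(query, weights, {"firstName": "John", "lastName": "Smith"})
partial = weighted_score(query, weights, {"firstName": "Jon", "lastName": "Smyth"})
```

A threshold such as the 97% mentioned later in the thread could then be applied to this client-side value rather than to the raw Solr score.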

RE: How to remove control characters in stored value at Solr side

2017-09-18 Thread Chris Hostetter
: But, can you then explain why Apache Nutch with SolrJ had this problem? : It seems that by default SolrJ does use XML as transport format. We have : always used SolrJ which I assumed would default to javabin, but we had : this exact problem anyway, and solved it by stripping non-character

RE: How to remove control characters in stored value at Solr side

2017-09-18 Thread Markus Jelsma
I agree. But, can you then explain why Apache Nutch with SolrJ had this problem? It seems that by default SolrJ does use XML as transport format. We have always used SolrJ which I assumed would default to javabin, but we had this exact problem anyway, and solved it by stripping non-character
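The client-side stripping mentioned here can be sketched as follows. This is an illustration only and not the code Nutch uses: it assumes the documents are sent over an XML transport and removes every character that falls outside the XML 1.0 `Char` production (tab, LF, CR, U+0020 to U+D7FF, U+E000 to U+FFFD, U+10000 to U+10FFFF). The function name is hypothetical.

```python
import re

# Anything outside the XML 1.0 "Char" production: C0 control characters
# other than tab, LF and CR, lone surrogates, and U+FFFE / U+FFFF.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with characters that XML 1.0 cannot carry removed."""
    return _INVALID_XML_CHARS.sub("", text)


cleaned = strip_invalid_xml_chars("abc\x00def\x0b")
```

Ordinary whitespace such as tabs and newlines passes through unchanged; only characters an XML 1.0 parser would reject are dropped.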

RE: How to remove control characters in stored value at Solr side

2017-09-18 Thread Chris Hostetter
: You can not do this in Solr, you cannot even send non-character code : points in the first place. For Apache Nutch we solved the problem by Strictly speaking: this is false. You *can* send control characters to Solr as field values -- assuming your transport format allows it. Example: using

CVE-2017-9803: Security vulnerability in kerberos delegation token functionality

2017-09-18 Thread Shalin Shekhar Mangar
CVE-2017-9803: Security vulnerability in kerberos delegation token functionality Severity: Important Vendor: The Apache Software Foundation Versions Affected: Apache Solr 6.2.0 to 6.6.0 Description: Solr's Kerberos plugin can be configured to use delegation tokens, which allows an application

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread Joe Obernberger
Very nice article - thank you!  Is there a similar article available when the index is on HDFS?  Sorry to hijack!  I'm very interested in how we can improve cache/general performance when running with HDFS. -Joe On 9/18/2017 11:35 AM, Erick Erickson wrote: This is suspicious too. Each

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread shamik
Walter, thanks again. Here's some information on the index and search feature. The index size is close to 25 GB, with 20 million documents. It has two collections, one being introduced with the 6.6 upgrade. The primary collection carries the bulk of the index, the newly formed one being aimed at getting

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread shamik
Thanks for your suggestion, I'm going to tune it and bring it down. It just happened to carry over from 5.5 settings. Based on Walter's suggestion, I'm going to reduce the heap size and see if it addresses the problem. -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread Erick Erickson
This is suspicious too. Each entry is up to about maxDoc/8 bytes + (string size of fq clause) long and you can have up to 20,000 of them. An autowarm count of 512 is almost never a good thing. Walter's comments about your memory are spot on of course, see:
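The size estimate in this message can be worked through numerically. The sketch below only applies the stated upper bound (about maxDoc/8 bytes plus the length of the fq string per entry, up to 20,000 entries); it assumes, purely for illustration, that a single core holds all 20 million documents reported elsewhere in the thread. The function and parameter names are hypothetical.

```python
def filter_cache_worst_case_bytes(max_doc: int, cache_size: int,
                                  avg_fq_chars: int = 0) -> int:
    """Upper bound on filterCache memory: one bit per document per entry,
    plus the length of the fq string, times the number of cached entries."""
    bytes_per_entry = max_doc // 8 + avg_fq_chars
    return bytes_per_entry * cache_size


# 20 million documents -> 2,500,000 bytes (about 2.5 MB) per cached filter.
per_entry = filter_cache_worst_case_bytes(20_000_000, 1)

# A full cache of 20,000 such entries -> 50,000,000,000 bytes (about 50 GB).
full_cache = filter_cache_worst_case_bytes(20_000_000, 20_000)
```

Under these assumptions a completely full cache could need more memory than the 29 GB heap discussed in the thread, which is consistent with the cache size being called suspicious.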

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread Walter Underwood
29G on a 30G machine is still a bad config. That leaves no space for the OS, file buffers, or any other processes. Try with 8G. Also, give us some information about the number of docs, size of the indexes, and the kinds of search features you are using. wunder Walter Underwood

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread shamik
Apologies, 290gb was a typo on my end, it should read 29gb instead. I started with my 5.5 configuration of limiting the RAM to 15gb. But it started going down once it reached the 15gb ceiling. I tried bumping it up to 29gb since memory seemed to stabilize at 22gb after running for a few hours, of

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread Walter Underwood
You are running with a 290 Gb heap on a 30 Gb machine. That is the worst Java config I have ever seen. Use this: SOLR_JAVA_MEM="-Xms8g -Xmx8g" That starts with an 8 Gb heap and stays there. Also, you might think about simplifying the GC configuration. Or if you are on a recent release

Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread Shamik Bandopadhyay
Hi, I recently upgraded to Solr 6.6 from 5.5. After running for a couple of days, the entire Solr cluster suddenly came down with an OOM exception. Once the servers are restarted, the memory footprint stays stable for a while before the sudden spike in memory occurs. The heap surges up

Re: solr Facet.contains

2017-09-18 Thread vobium
Please help me solve this problem.

Re: Solr- Data search across multiple cores

2017-09-18 Thread Susheel Kumar
Which fields do you want to search across the two separate collections/cores? Please provide some details on your use case. Thanks On Mon, Sep 18, 2017 at 1:42 AM, Agrawal, Harshal (GE Digital) < harshal.agra...@ge.com> wrote: > Hello Folks, > > I want to search data in two separate cores. Both cores are

Re: Knn classifier doesn't work

2017-09-18 Thread alessandro.benedetti
Hi Tommaso, you are definitely right! I see that the method : MultiFields.getTerms returns : if (termsPerLeaf.size() == 0) { return null; } As you correctly mentioned this is not handled in : org/apache/lucene/classification/document/SimpleNaiveBayesDocumentClassifier.java:115

Learning-to-Rank with Bees: question answer follow-up

2017-09-18 Thread Christine Poerschke (BLOOMBERG/ LONDON)
Hi everyone, At my "Learning-to-Rank with Apache Solr and Bees" talk on Friday [1] there was one question that wasn't properly understood (by me) and so not fully answered in the room but later in individual conversation the question/answer became clearer. So here I just wanted to follow-up

Re: Apache Solr 4.10.x - Collection Reload times out

2017-09-18 Thread alessandro.benedetti
I finally have an explanation, which I post here for future reference: The cause was a combination of: 1) the /select request handler has defaults with spellcheck ON and a few spellcheck options (such as collationQuery ON and max collation tries set to 5) 2) the firstSearcher has a warm-up query

Re: Solr - google like suggestion

2017-09-18 Thread alessandro.benedetti
If you are referring to the number of words per suggestion, you may need to play with the free text lookup type [1] [1] http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html - --- Alessandro Benedetti Search Consultant, R&D Software Engineer, Director Sease Ltd. -