Re: Accent insensitive search for greek characters

2017-10-16 Thread Chitra
Hi Shawn, Thank you so much for the kind response. > Those filters operate on single characters from the input, which means > they cannot take character context into account like ICU does. If I am > reading what the ASCII filter does correctly, it may not work for Greek >

Re: [EXTERNAL] Re: OOM during indexing with 24G heap - Solr 6.5.1

2017-10-16 Thread David M Giannone
Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone Original message From: Randy Fradin Date: 10/16/17 7:38 PM (GMT-05:00) To: solr-user@lucene.apache.org Subject: [EXTERNAL] Re: OOM during indexing with 24G heap - Solr 6.5.1 Each shard has

Re: OOM during indexing with 24G heap - Solr 6.5.1

2017-10-16 Thread Randy Fradin
Each shard has around 4.2 million documents which are around 40GB on disk. Two nodes have 3 shard replicas each and the third has 2 shard replicas. The text of the exception is: java.lang.OutOfMemoryError: Java heap space And the heap dump is a full 24GB indicating the full heap space was being

Re: OOM during indexing with 24G heap - Solr 6.5.1

2017-10-16 Thread Shawn Heisey
On 10/16/2017 3:19 PM, Randy Fradin wrote: > We are seeing a lot of full GC events and eventual OOM errors in Solr > during indexing. This is Solr 6.5.1 running in cloud mode with a 24G heap. > At these times indexing is the only activity taking place. The collection > has 4 shards and 2 replicas

Re: JAr errors with SoLr 6.6.1 and http client and core

2017-10-16 Thread Shawn Heisey
On 10/16/2017 1:45 PM, Johnson, Jaya wrote: > Hi I have the following code. > System.out.println("Initializing server"); > SystemDefaultHttpClient cl = new SystemDefaultHttpClient(); > client = new > HttpSolrClient("http://localhost:8983/solr/#/prosp_poc_collection",cl); >

Re: Several critical vulnerabilities discovered in Apache Solr (XXE & RCE)

2017-10-16 Thread Shawn Heisey
On 10/13/2017 7:13 AM, Rick Leir wrote: > What is the earliest version which was vulnerable? The XML query parser was added to Solr in version 5.5.  Since that's a critical part of the remote exploit, that's the minimum version to be worried about in situations where end users cannot reach Solr

Re: Concern on solr commit

2017-10-16 Thread Shawn Heisey
I'm supplementing the other replies you've already gotten.  See inline: On 10/13/2017 2:30 AM, Leo Prince wrote: > I am getting the following errors/warnings from Solr > > 1, ERROR: > org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: > > Error opening new searcher. exceeded

Re: Solr related questions

2017-10-16 Thread Shawn Heisey
On 10/13/2017 5:50 AM, startrekfan wrote: > Thank you for your answer. > > To 3.) > The file is on server A, my program is on server B and solr is on server > C. If I use a normal http(rest) post, my program has to fetch the file > content from server A to Server B and then post it from server B

Re: Accent insensitive search for greek characters

2017-10-16 Thread Shawn Heisey
On 10/13/2017 1:28 AM, Chitra wrote: >I want to search greek characters(with accent insensitive) by removing > or replacing accent marks with similar characters. > > Eg: when searching a greek accent word say *πῬοἲὅν*, we expect accent > insensitive search ie need equivalent greek accent like
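
As a rough illustration of what "accent insensitive" means for the sample word in this thread, here is a minimal sketch that strips Greek diacritics with the JDK's `java.text.Normalizer`. The class name `GreekAccentFold` and the method `fold` are hypothetical, and this is plain client-side Java, not a Solr analysis filter; inside Solr the folding would normally be done in the field's analysis chain (for example the ICU-based folding mentioned in the thread). It assumes that dropping every combining mark after NFD decomposition is the behavior wanted.

```java
import java.text.Normalizer;

class GreekAccentFold {
    /** Decompose to NFD, then remove every combining mark (Unicode category M). */
    static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // The sample word from the question, written with Unicode escapes
        // so the source file encoding does not matter.
        String word = "\u03C0\u1FEC\u03BF\u1F32\u1F45\u03BD";
        System.out.println(fold(word));
    }
}
```

Applying the same folding at index time and at query time is what makes accented and unaccented forms match; case folding and final-sigma handling are separate concerns that this sketch does not cover.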

SolrCloud - scalability issues with many collections

2017-10-16 Thread Shawn Heisey
Some time ago, I did some testing with SolrCloud (mostly 5.0 and branch_5x) handling thousands of collections. How that testing went is documented in SOLR-7191.  That issue has been marked as resolved in version 6.3, but no commits were made in the issue, and I haven't seen any evidence to

OOM during indexing with 24G heap - Solr 6.5.1

2017-10-16 Thread Randy Fradin
We are seeing a lot of full GC events and eventual OOM errors in Solr during indexing. This is Solr 6.5.1 running in cloud mode with a 24G heap. At these times indexing is the only activity taking place. The collection has 4 shards and 2 replicas across 3 nodes. Each document is ~10KB (a few

JAr errors with SoLr 6.6.1 and http client and core

2017-10-16 Thread Johnson, Jaya
Hi I have the following code. System.out.println("Initializing server"); SystemDefaultHttpClient cl = new SystemDefaultHttpClient(); client = new HttpSolrClient("http://localhost:8983/solr/#/prosp_poc_collection",cl); System.out.println("Completed initializing the server"); client.deleteByQuery(
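
One detail in the snippet above is independent of any jar problem: the URL passed to `HttpSolrClient` contains `/#/`, which is the form used by the admin UI in a browser. Everything after `#` is a URI fragment and is never sent to the server, so a client needs the plain collection path instead. The sketch below uses only the JDK to show this; `SolrBaseUrl` and `fromAdminUiUrl` are hypothetical names, and whether this URL is related to the reported errors cannot be told from the excerpt.

```java
import java.net.URI;

class SolrBaseUrl {
    /** Hypothetical helper: turn an admin-UI address into a collection base URL. */
    static String fromAdminUiUrl(String url) {
        return url.replace("/#/", "/");
    }

    /** Returns the fragment part of the URL (the text after '#'), or null if there is none. */
    static String fragmentOf(String url) {
        return URI.create(url).getFragment();
    }

    public static void main(String[] args) {
        String adminUrl = "http://localhost:8983/solr/#/prosp_poc_collection";
        // The collection name sits in the fragment, which HTTP requests do not carry.
        System.out.println("fragment: " + fragmentOf(adminUrl));
        System.out.println("base URL: " + fromAdminUiUrl(adminUrl));
    }
}
```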

Re: HOW DO I UNSUBSCRIBE FROM GROUP?

2017-10-16 Thread Gus Heck
Headers however do not display in many mail clients/webUIs... On Mon, Oct 16, 2017 at 9:23 AM, Richard wrote: > The list help/unsubscribe/post/etc. details are, as is not uncommon, > in the message header: > > List-Help:

Re: spell-check does not return collations when using search query with filter

2017-10-16 Thread Arnold Bronley
With q instead of spellcheck.q I get following response: { "responseHeader": { "status": 0, "QTime": 23, "params": { "q": "tag:polt", "spellcheck.collateExtendedResults": "true", "indent": "true", "spellcheck": "true", "spellcheck.accuracy": "0.72",

Re: spell-check does not return collations when using search query with filter

2017-10-16 Thread Arnold Bronley
with spellcheck.q I don't get anything back at all. { "responseHeader": { "status": 0, "QTime": 10, "params": { "spellcheck.collateExtendedResults": "true", "spellcheck.q": "tag:polt", "indent": "true", "spellcheck": "true", "spellcheck.accuracy":

Re: Solr JDBC with Core (vs Collection)

2017-10-16 Thread OTH
Hello, Sorry for continuing this thread after such a long time. I just wanted to check, whether streaming expressions / SQL are now working in non-SolrCloud mode, in the latest Solr release? Much thanks Omer On Thu, Mar 9, 2017 at 1:27 AM, Joel Bernstein wrote: > Getting

Re: Strange Behavior When Extracting Features

2017-10-16 Thread Michael Alcorn
If anyone else is following this thread, I replied on the Jira. On Mon, Oct 16, 2017 at 4:07 AM, alessandro.benedetti wrote: > This is interesting, the EFI parameter resolution should work using the > quotes independently of the query parser. > At that point, the query

Re: is there a way to remove deleted documents from index without optimize

2017-10-16 Thread Shawn Heisey
On 10/12/2017 10:01 PM, Erick Erickson wrote: > You can use the IndexUpgradeTool that ships with each version of Solr > (well, actually Lucene) to, well, upgrade your index. So you can use > the IndexUpgradeTool that ships with 5x to upgrade from 4x. And the > one that ships with 6x to upgrade

RE: solrcloud dead-lock

2017-10-16 Thread Younge, Kent A - Norman, OK - Contractor
Jack, No I still have the issue on one box only. I have re-requested certificates several times and still come back with the same issue. If I put a working certificate on the box everything works the way it should. Also if I browse the https: to the server name instead of the registered

Re: solrcloud dead-lock

2017-10-16 Thread SOLR6931
Hey Kent, Have you managed to find a solution to your problem? I'm currently encountering the exact same issue. Jack -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

unsubscribe please

2017-10-16 Thread Horace
The Westfield Leader and The Times www.goleader.com

Re: Unbalanced CPU no SolrCloud

2017-10-16 Thread Mahmoud Almokadem
It continues for some time after I stop the indexing. At first the load was on the first node; after I restarted the indexing process, the load moved to the second node and the first node worked properly. Thanks, Mahmoud On Mon, Oct 16, 2017 at 5:29 PM, Emir Arnautović <

Re: Unbalanced CPU no SolrCloud

2017-10-16 Thread Emir Arnautović
Does the load stop when you stop indexing, or does it last for some more time? Is it always one node that behaves like this, and does it start as soon as you start indexing? Is the load different between nodes when you are doing lighter indexing? -- Monitoring - Log Management - Alerting - Anomaly Detection

Re: Parallel SQL: GROUP BY throws exception

2017-10-16 Thread Joel Bernstein
Ok, I just read the query again. Try the failing query like this: SELECT people_person_id, sum(amount) as total FROM donation GROUP BY people_person_id That is the correct syntax for the SQL group by aggregation. It looks like you found a null pointer, though, where a proper error message is
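
To make the point about the select list concrete, here is a small in-memory sketch of what `SELECT people_person_id, sum(amount) as total FROM donation GROUP BY people_person_id` computes: one row per grouping key, with the key itself in the output. It does not talk to Solr; the `Donation` class, the ids, and the amounts are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupBySketch {
    /** Hypothetical stand-in for one row of the donation collection. */
    static final class Donation {
        final String peoplePersonId;
        final long amount;

        Donation(String peoplePersonId, long amount) {
            this.peoplePersonId = peoplePersonId;
            this.amount = amount;
        }
    }

    /** Sum of amount per people_person_id, keyed by the grouping field. */
    static Map<String, Long> totalsByPerson(List<Donation> donations) {
        return donations.stream().collect(Collectors.groupingBy(
                d -> d.peoplePersonId,
                TreeMap::new,
                Collectors.summingLong(d -> d.amount)));
    }

    public static void main(String[] args) {
        List<Donation> donations = Arrays.asList(
                new Donation("p1", 10), new Donation("p2", 5), new Donation("p1", 20));
        // Each entry corresponds to one (people_person_id, total) result row.
        totalsByPerson(donations).forEach((id, total) -> System.out.println(id + " " + total));
    }
}
```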

Re: CVE-2017-12629 which versions are vulnerable?

2017-10-16 Thread Uwe Reh
Sorry, I missed the post from Florian Gleixner: >Re: Several critical vulnerabilities discovered in Apache Solr (XXE & RCE) Am 16.10.2017 um 16:52 schrieb Uwe Reh: Hi, I'm still using V4.10. Is this version also vulnerable by http://openwall.com/lists/oss-security/2017/10/13/1 ? Uwe

CVE-2017-12629 which versions are vulnerable?

2017-10-16 Thread Uwe Reh
Hi, I'm still using V4.10. Is this version also vulnerable by http://openwall.com/lists/oss-security/2017/10/13/1 ? Uwe

Re: zero-day exploit security issue

2017-10-16 Thread Keith L
Additionally, it looks like the commits are public on github. Is this backported to 5.5.x too? Users that are still on 5x might want to backport some of the issues themselves since it is not officially supported anymore. On Mon, Oct 16, 2017 at 10:11 AM Mike Drob wrote: > Given

Re: zero-day exploit security issue

2017-10-16 Thread Mike Drob
Given the already public nature of the disclosure, does it make sense to make the work being done public prior to release as well? Normally security fixes are kept private while the vulnerabilities are private, but that's not the case here... On Mon, Oct 16, 2017 at 1:20 AM, Shalin Shekhar

Re: SOLR cores are getting locked

2017-10-16 Thread Erick Erickson
bin/solr start -help will give you a lot of info. But yes, the -s option is what you should use. Here's one of my batch files I used to start various cloud examples: bin/solr start -c -z localhost:2181 -p 898 -s example/cloud/node1/solr On Sun, Oct 15, 2017 at 11:48 PM, Gunalan V

Re: Parallel SQL: GROUP BY throws exception

2017-10-16 Thread Joel Bernstein
Also what version are you using? Joel Bernstein http://joelsolr.blogspot.com/ On Mon, Oct 16, 2017 at 9:49 AM, Joel Bernstein wrote: > Can you provide the stack trace? > > Are you in SolrCloud mode? > > > > Joel Bernstein > http://joelsolr.blogspot.com/ > > On Mon, Oct 16,

Re: Parallel SQL: GROUP BY throws exception

2017-10-16 Thread Joel Bernstein
Can you provide the stack trace? Are you in SolrCloud mode? Joel Bernstein http://joelsolr.blogspot.com/ On Mon, Oct 16, 2017 at 9:20 AM, Dmitry Gerasimov wrote: > Hi all! > > This query works as expected: > SELECT sum(amount) as total FROM donation > > Adding

Re: HOW DO I UNSUBSCRIBE FROM GROUP?

2017-10-16 Thread Richard
The list help/unsubscribe/post/etc. details are, as is not uncommon, in the message header: List-Help: List-Unsubscribe: List-Post: of all messages posted to the

Parallel SQL: GROUP BY throws exception

2017-10-16 Thread Dmitry Gerasimov
Hi all! This query works as expected: SELECT sum(amount) as total FROM donation Adding GROUP BY: SELECT sum(amount) as total FROM donation GROUP BY people_person_id Now I get response: { "result-set":{ "docs":[{ "EXCEPTION":"Failed to execute sqlQuery 'SELECT sum(amount) as total

Re: HOW DO I UNSUBSCRIBE FROM GROUP?

2017-10-16 Thread Gus Heck
While this has been the traditional response, and it's accurate and helpful, the user that complained about no unsubscribe link has a point. This is the normal expectation in this day and age. Maybe Apache should consider appending a "You are receiving this because you are subscribed to (list)

Re: Unbalanced CPU no SolrCloud

2017-10-16 Thread Mahmoud Almokadem
The transition of the load happened after I restarted the bulk insert process. The size of the index on each server is about 500GB. There are about 8 warnings on each server for "Not found segment file" like this: Error getting file length for [segments_2s4] java.nio.file.NoSuchFileException:

Re: Unbalanced CPU no SolrCloud

2017-10-16 Thread Emir Arnautović
I did not look at the graph details - now I see that it is over a 3h time span. It seems that there was a load on the other server before this one, and it ended with a 14GB read spike and a 10GB write spike, just before the load started on this server. Do you see any errors or suspicious log lines? How big is

Re: Unbalanced CPU no SolrCloud

2017-10-16 Thread Mahmoud Almokadem
Yes, it has been constant since I started this bulk indexing process. As you can see, the write operations on the loaded server are 3x those on the normal server, even though disk writes are not 3x. Mahmoud On Mon, Oct 16, 2017 at 12:32 PM, Emir Arnautović < emir.arnauto...@sematext.com> wrote: > Hi Mahmoud, > Is

Re: Unbalanced CPU no SolrCloud

2017-10-16 Thread Emir Arnautović
Hi Mahmoud, Is this something that you see constantly? Network charts suggest that your servers are loaded equally and, as you said, you are not using routing, so that is expected. Disk read/write and CPU are not equal, and it is expected that they are not equal during heavy indexing since it also triggers

Re: E-Commerce Search: tf-idf, tie-break and boolean model

2017-10-16 Thread alessandro.benedetti
I was having this discussion with a colleague of mine recently, about e-commerce search. Of course there are tons of things you can do to improve relevancy: Custom similarity - edismax tuning - basic user events processing - machine learning integrations - semantic search, etc. The more you do,

Re: Unbalanced CPU no SolrCloud

2017-10-16 Thread Mahmoud Almokadem
Here are the screenshots for the two server metrics on Amazon https://ibb.co/kxBQam https://ibb.co/fn0Jvm https://ibb.co/kUpYT6 On Mon, Oct 16, 2017 at 11:37 AM, Mahmoud Almokadem wrote: > Hi Emir, > > We don't use routing. > > The servers are already balanced and the

Re: Unbalanced CPU no SolrCloud

2017-10-16 Thread Mahmoud Almokadem
Hi Emir, We don't use routing. The servers are already balanced and the number of documents on each shard is approximately the same. Nothing is running on the servers except Solr and ZooKeeper. I initialized the client as String zkHost = "192.168.1.89:2181,192.168.1.99:2181"; CloudSolrClient

Re: HOW DO I UNSUBSCRIBE FROM GROUP?

2017-10-16 Thread alessandro.benedetti
The Terms component[1] should do the trick for you. Just use the regular expression or prefix filtering and you should be able to get the stats you want. If you were interested in extracting the DV when returning docs you may be interested in function queries and specifically this one :

Re: Strange Behavior When Extracting Features

2017-10-16 Thread alessandro.benedetti
This is interesting, the EFI parameter resolution should work using the quotes independently of the query parser. At that point, the query parsers (both) receive a multi term text. Both of them should work the same. At the time I saw the mail I tried to reproduce it through the LTR module tests

Re: spell-check does not return collations when using search query with filter

2017-10-16 Thread alessandro.benedetti
Interesting, what happens when you pass it as spellcheck.q=polt ? What is the behavior you get ? -- Alessandro Benedetti Search Consultant, R&D Software Engineer, Director Sease Ltd. - www.sease.io -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Unbalanced CPU no SolrCloud

2017-10-16 Thread Emir Arnautović
Hi Mahmoud, Do you use routing? Are your servers equally balanced - do you end up having approximately the same number of documents hosted on both servers (counted all shards)? Do you have anything else running on those servers? How do you initialise your SolrJ client? Are documents of similar

Re: E-Commerce Search: tf-idf, tie-break and boolean model

2017-10-16 Thread Emir Arnautović
Hi Vincenzo, Unless you have really specific ranking requirements, I would not suggest that you start with your own proprietary similarity implementation. In most cases edismax will be good enough to cover your requirements. It is not an easy task to tune edismax since it has a lot of knobs that you can

Unbalanced CPU no SolrCloud

2017-10-16 Thread Mahmoud Almokadem
We've installed SolrCloud 7.0.1 with two nodes and 8 shards per node. The configurations and the specs of the two servers are identical. When running bulk indexing using SolrJ we see one of the servers is fully loaded as you see on the images and the other is normal. Images URLs:

E-Commerce Search: tf-idf, tie-break and boolean model

2017-10-16 Thread Vincenzo D'Amore
Hi all, I'm trying to figure out how to tune Solr for an e-commerce search. I want to share with you what I did in the hope to understand if I was right and, if there, I could also improve my configuration. I also read that the boolean model has to be preferred in this case.

Re: HOW DO I UNSUBSCRIBE FROM GROUP?

2017-10-16 Thread Amrit Sarkar
Hi, If you wish the emails to "stop", kindly "UNSUBSCRIBE" by following the instructions on the http://lucene.apache.org/solr/community.html. Hope this helps. Amrit Sarkar Search Engineer Lucidworks, Inc. 415-589-9269 www.lucidworks.com Twitter http://twitter.com/lucidworks LinkedIn:

Re: SOLR cores are getting locked

2017-10-16 Thread Gunalan V
Thanks Erick, I'm using one VM where all SolrCloud and ZooKeeper nodes are running. I have two Solr nodes in SolrCloud. I just wanted to check: do I need to create a different Solr home directory, using the -s param, for each Solr node? If yes, kindly share some documentation to configure separate

Re: zero-day exploit security issue

2017-10-16 Thread Shalin Shekhar Mangar
Yes, there is, but it is private, i.e. only the Apache Lucene PMC members can see it. This is standard for all security issues in Apache land. The fixes for this issue have been applied to the release branches and the Solr 7.1.0 release candidate is already up for vote. Barring any unforeseen