CDCR with SSL enabled

2017-05-01 Thread Xie, Sean
Does CDCR support SSL encrypted SolrCloud? I have two clusters started with SSL, and CDCR setup instruction is followed on source and target. However, from the solr.log, I’m not able to see CDCR is occurring. Not sure what has been setup incorrectly. From the solr.log, I can’t find useful info

Suggester uses lots of 'Page cache' memory

2017-05-01 Thread Damien Kamerman
Hi all, I have a Solr v6.4.2 collection with 12 shards and 2 replicas. Each replica uses about 14GB disk usage. I'm using Solaris 11 and I see the 'Page cache' grow by about 7GB for each suggester replica I build. The suggester index itself is very small. The 'Page cache' memory is freed when the

IndexFormatTooNewException - MapReduceIndexerTool for PDF files

2017-05-01 Thread ecos
Hi I'm getting the following error when trying to index PDF documents using the MapReduceIndexerTool in Cloudera: The cause of the error is: org.apache.lucene.index.IndexFormatTooNewException: Format

Re: Solr performance on EC2 linux

2017-05-01 Thread Jeff Wartes
Yes, that’s the Xenial I tried. Ubuntu 16.04.2 LTS. On 5/1/17, 7:22 PM, "Will Martin" wrote: Ubuntu 16.04 LTS - Xenial (HVM) Is this your Xenial version? On 5/1/2017 6:37 PM, Jeff Wartes wrote: > I tried a few variations of

Re: Solr performance on EC2 linux

2017-05-01 Thread Jeff Wartes
I started with the same three-node 15-shard configuration I’d been used to, in an RF1 cluster. (the index is almost 700G so this takes three r4.8xlarge’s if I want to be entirely memory-resident) I eventually dropped down to a 1/3rd size index on a single node (so 5 shards, 100M docs each) so I

Re: Solr performance on EC2 linux

2017-05-01 Thread Will Martin
Ubuntu 16.04 LTS - Xenial (HVM) Is this your Xenial version? On 5/1/2017 6:37 PM, Jeff Wartes wrote: > I tried a few variations of various things before we found and tried that > linux/EC2 tuning page, including: >- EC2 instance type: r4, c4, and i3 >- Ubuntu version: Xenial and

Re: CDCR & firewall holes

2017-05-01 Thread Susheel Kumar
I believe you need to open a) ports from source cluster to target zookeepers (usually 2181 unless you change it) b) ports from source to target solr ports (usually 8983 unless you change it) Thanks, Susheel On Mon, May 1, 2017 at 2:17 PM, Oakley, Craig (NIH/NLM/NCBI) [C] < craig.oak...@nih.gov>
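The two openings named in this reply (source to target ZooKeeper, source to target Solr) can be checked from a source node with a small reachability probe. This is only a sketch: the function names are made up for illustration, the host names in the usage note are placeholders, and 2181 and 8983 are simply the defaults mentioned above.

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

def check_targets(targets):
    """Map each (host, port) pair to whether it is reachable from this machine."""
    return {(host, port): port_open(host, port) for host, port in targets}
```

Run from each source node, something like check_targets([("target-zk1.example.com", 2181), ("target-solr1.example.com", 8983)]) would show which openings are still missing. A successful TCP connect only shows that the firewall lets the connection through; it does not confirm that CDCR itself is configured correctly.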

Re: Both main and replica are trying to access solr_gc.log.0.current file

2017-05-01 Thread Zheng Lin Edwin Yeo
Is this the correct way to start both of the replicas? bin\solr.cmd start -cloud -p 8983 -s solr\node1\solr -m 8g -z "localhost:9981,localhost:9982,localhost:9983" bin\solr.cmd start -cloud -p 8984 -s solr\node2\solr -m 8g -z "localhost:9981,localhost:9982,localhost:9983" Regards, Edwin On 30

Joining more than 2 collections

2017-05-01 Thread Zheng Lin Edwin Yeo
Hi, Is it possible to join more than 2 collections using one of the streaming expressions (Eg: innerJoin)? If not, are there other ways we can do it? Currently, I may need to join 3 or 4 collections together, and to output selected fields from all these collections together. I'm using Solr

Re: Solr performance on EC2 linux

2017-05-01 Thread Walter Underwood
Might want to measure the single CPU performance of your EC2 instance. The last time I checked, my MacBook was twice as fast as the EC2 instance I was using. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On May 1, 2017, at 6:24 PM, Chris Hostetter

Re: Solr performance on EC2 linux

2017-05-01 Thread Chris Hostetter
: tldr: Recently, I tried moving an existing solrcloud configuration from : a local datacenter to EC2. Performance was roughly 1/10th what I’d : expected, until I applied a bunch of linux tweaks. How many total nodes in your cluster? How many of them running ZooKeeper? Did you observe the

Re: Solr performance on EC2 linux

2017-05-01 Thread Jeff Wartes
I tried a few variations of various things before we found and tried that linux/EC2 tuning page, including: - EC2 instance type: r4, c4, and i3 - Ubuntu version: Xenial and Trusty - EBS vs local storage - Stock openjdk vs Zulu openjdk (Recent java8 in both cases - I’m aware of the issues

Re: Building Solr greater than 6.2.1

2017-05-01 Thread Alexandre Rafalovitch
There was a Java compiler bug I think that was introduced and then fixed. Took me two days to figure out when I hit that a while ago. Regards, Alex On 1 May 2017 1:00 PM, "Ryan Yacyshyn" wrote: I was using Java 8 all along but more specifically, it was 1.8.0_25

CDCR & firewall holes

2017-05-01 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
We are considering using Cross Data Center Replication between SolrClouds in different domains which have a firewall between them. Is it documented anywhere how many firewall holes will be needed? From each source SolrCloud node to each target SolrCloud node? From each target SolrCloud node to

choosing placement upon RESTORE

2017-05-01 Thread xavier jmlucjav
hi, I am facing this situation: - I have a 3 node Solr 6.1 with some 1 shard, 1 node collections (it's just for dev work) - the collections were created with: action=CREATE&...=EMPTY" then action=ADDREPLICA&...=$NODEA=$DATADIR" - I have taken a BACKUP of the collections - Solr is upgraded

Re: Step By Step guide to create Solr Cloud in Solr 6.x

2017-05-01 Thread Erick Erickson
First, you should not have to restart Solr. Second, generally Solr will distribute replicas fairly evenly, just use the Collections API, CREATE command and optionally supply a "nodeSet" parameter. If you really require exact placement of replicas on exact machines (which I contend you probably do
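As a rough illustration of the Collections API call described here, the sketch below only builds the request URL. Note that the Collections API spells the node-list parameter createNodeSet. The base URL, collection name, and node names are placeholders, and the helper function is hypothetical.

```python
from urllib.parse import urlencode

def create_collection_url(base_url, name, num_shards, replication_factor, nodes=None):
    """Build a Collections API CREATE URL, optionally limited to specific nodes."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    if nodes:
        # createNodeSet takes a comma-separated list of node names (host:port_context).
        params["createNodeSet"] = ",".join(nodes)
    return f"{base_url}/admin/collections?{urlencode(params)}"
```

Leaving nodes unset lets Solr pick the placement itself, which is the behavior the reply recommends for most setups.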

Re: Slow indexing speed when collection size is large

2017-05-01 Thread Zheng Lin Edwin Yeo
Hi Rick, I'm using Solrj for the indexing, not using curl. Normally I bundle about 1000 documents for each POST. There's more than 300GB of RAM for that server, and I do not use any sharding at the moment. Regards, Edwin On 1 May 2017 at 19:08, Rick Leir wrote: > Zheng, >

Re: After upgrade to Solr 6.5, q.op=AND affects filter query differently than in older version

2017-05-01 Thread Shawn Heisey
On 5/1/2017 9:19 AM, Andy C wrote: > You state that the best performing query that gives the desired results is: >> fq=ctindex:myId OR (*:* -ctindex:[* TO *]) > Is this because there is some sort of optimization invoked when you use [* TO > *], or just because a single range will be more efficient
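The filter quoted in this message matches documents whose field equals a given value, plus documents that lack the field entirely; the leading *:* gives the negated range clause a set of documents to subtract from. A small sketch of building and URL-encoding that filter follows. The helper name is hypothetical, and the q and q.op values are only example parameters taken from the thread's subject.

```python
from urllib.parse import urlencode

def optional_field_filter(field, value):
    """Match docs where field == value, or where the field is absent."""
    # "*:*" supplies all documents for the negated range clause to exclude from.
    return f"{field}:{value} OR (*:* -{field}:[* TO *])"

# Example request parameters; the resulting string is already URL-encoded.
params = urlencode({
    "q": "*:*",
    "q.op": "AND",
    "fq": optional_field_filter("ctindex", "myId"),
})
```

Encoding the filter with urlencode avoids problems with the spaces, brackets, and asterisks when the query is sent over HTTP.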

Re: After upgrade to Solr 6.5, q.op=AND affects filter query differently than in older version

2017-05-01 Thread Andy C
Thanks for the response Shawn. Adding "*:*" in front of my filter query does indeed resolve the issue. It seems odd to me that the fully negated query does work if I don't set q.op=AND. I guess this must be "adding complexity". Actually I just discovered that simply removing the extraneous

Re: Clean checkbox on DIH

2017-05-01 Thread Shawn Heisey
On 4/28/2017 9:01 AM, Mahmoud Almokadem wrote: > We are already using shell scripts to do our import and using the fullimport > command to do our delta import, and everything has been working well for several > years. But the default of the UI is full import with clean and commit. > If I press the Execute button by

Re: Solr performance on EC2 linux

2017-05-01 Thread John Bickerstaff
It's also very important to consider the type of EC2 instance you are using... We settled on the R4.2XL... The R series is labeled "High-Memory" Which instance type did you end up using? On Mon, May 1, 2017 at 8:22 AM, Shawn Heisey wrote: > On 4/28/2017 10:09 AM, Jeff

Re: Solr performance on EC2 linux

2017-05-01 Thread Shawn Heisey
On 4/28/2017 10:09 AM, Jeff Wartes wrote: > tldr: Recently, I tried moving an existing solrcloud configuration from a > local datacenter to EC2. Performance was roughly 1/10th what I’d expected, > until I applied a bunch of linux tweaks. How very strange. I knew virtualization would have

Re: After upgrade to Solr 6.5, q.op=AND affects filter query differently than in older version

2017-05-01 Thread Shawn Heisey
On 4/26/2017 1:04 PM, Andy C wrote: > I'm looking at upgrading the version of Solr used with our application from > 5.3 to 6.5. > > Having an issue with a change in the behavior of one of the filter queries > we generate. > > The field "ctindex" is only present in a subset of documents. It

Re: recommended zookeeper version for solr cloud

2017-05-01 Thread Shawn Heisey
On 4/26/2017 3:44 AM, David Michael Gang wrote: > Which version of external ZooKeeper is recommended to use in production > environments? 3.4.6 which is the version shipped with solr or 3.4.10 > which is the latest stable? If it were me, I would use the latest. The list of bugs fixed in each ZK

Re: Troubleshooting solr errors

2017-05-01 Thread Shawn Heisey
On 4/25/2017 12:05 PM, Daniel Miller wrote: > The problem isn't a particular email message - I get a cascade of > those errors (every time a new message is received) once the server > "breaks". The fix is to restart the server. I did find a Java heap > error in the log - so I've increased the

Re: Building Solr greater than 6.2.1

2017-05-01 Thread Ryan Yacyshyn
I was using Java 8 all along but more specifically, it was 1.8.0_25 (full details below). java version "1.8.0_25" Java(TM) SE Runtime Environment (build 1.8.0_25-b17) Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode) I initially didn't think it was my Java version so I just cleared

Re: Is it expected for Synonyms to work vice-versa

2017-05-01 Thread ravi432
I am also getting the same results with the following scenario: Anderson window => american craftsman. When I type Anderson window I want Solr to return results for Anderson window and american craftsman, but it is giving only Anderson window. But when I type american craftsman Solr is returning

Re: Building Solr greater than 6.2.1

2017-05-01 Thread Shawn Heisey
On 5/1/2017 6:34 AM, Ryan Yacyshyn wrote: > Thanks Alex, it's working now. I had to update Java. What version were you using? Lucene/Solr 6 requires Java 8. I don't think that building 6.2.1 would have been successful if it weren't Java 8. I'm not familiar with any specific Java release

Re: Building Solr greater than 6.2.1

2017-05-01 Thread Ryan Yacyshyn
Thanks Alex, it's working now. I had to update Java. Regards, Ryan On Mon, 1 May 2017 at 14:48 Alexandre Rafalovitch wrote: > Make sure your Java is latest update. Seriously > > Also, if still failing, try blowing away your Ivy cache. > > Regards, > Alex > > On 1

Re: BooleanQuery and WordDelimiterFilter

2017-05-01 Thread Rick Leir
Avi, Tell us the relevant field types you have in schema.xml. You can also solve this all for yourself in the Solr Admin Analysis panel. Cheers -- Rick On May 1, 2017 2:34:31 AM EDT, Avi Steiner wrote: >Hi > >I have a question regarding the use of query parser and

Re: Slow indexing speed when collection size is large

2017-05-01 Thread Rick Leir
Zheng, Are you POSTing using curl? Get several processes working in parallel to get a small boost. Solrj should speed you up a bit too (numbers anyone?). How many documents do you bundle in a POST? Do you have lots of RAM? Sharding? Cheers -- Rick On April 30, 2017 10:39:29 PM EDT, Zheng Lin

RE: Term no longer matches if PositionLengthAttr is set to two

2017-05-01 Thread Markus Jelsma
Hello again, apologies for cross-posting and having to get back to this unsolved problem. Initially I thought this was a problem I have with, or in, Lucene. Maybe not, so is this problem in Solr? Is there anyone who has seen this problem before? Many thanks, Markus -Original message- >

Is it expected for Synonyms to work vice-versa

2017-05-01 Thread Atita Arora
Hi, I have this strange issue happening today where I specified a certain keyword to match as a synonym: (^|[^a-zA-Z0-9])[cC][#]([^a-zA-Z0-9]|$)=>$1csharp$2 Which essentially means anyone searching for "C#" should be matched with a document containing "csharp" too. Now I have run into
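To see what the pattern in this message does to a piece of text, it can be tried outside Solr. The sketch below uses Python's re module purely as an illustration and is not Solr's analysis chain: the $1 and $2 references become \1 and \2, and the function name is made up.

```python
import re

# Same pattern as in the message: "c#" or "C#" bounded by start/end of text
# or by a non-alphanumeric character on each side.
PATTERN = re.compile(r"(^|[^a-zA-Z0-9])[cC][#]([^a-zA-Z0-9]|$)")

def normalize_csharp(text):
    """Rewrite a standalone "C#" token to "csharp", keeping its neighbors."""
    return PATTERN.sub(r"\1csharp\2", text)
```

With this rule normalize_csharp("I know C# well") gives "I know csharp well", while a string such as "abc# x" is left alone because the "c" is preceded by a letter. The rewrite is one-directional: text containing "C#" becomes "csharp", but nothing maps "csharp" back to "C#".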

RE: pagination of results of grouping by more than one field

2017-05-01 Thread Mikhail Ibraheem
Hi, Any clue? Thanks -Original Message- From: Mikhail Ibraheem Sent: Sunday, April 30, 2017 10:09 AM To: solr-user@lucene.apache.org Subject: pagination of results of grouping by more than one field Hi, I have a problem that I need to group by X and Y and aggregator on Z and I need

Re: Building Solr greater than 6.2.1

2017-05-01 Thread Alexandre Rafalovitch
Make sure your Java is latest update. Seriously Also, if still failing, try blowing away your Ivy cache. Regards, Alex On 1 May 2017 6:34 AM, "Ryan Yacyshyn" wrote: > Hi all, > > I'm trying to build Solr 6.5.1 but it is failing. I'm able to > successfully

BooleanQuery and WordDelimiterFilter

2017-05-01 Thread Avi Steiner
Hi I have a question regarding the use of query parser and BooleanQuery. I have 3 documents indexed. Doc1 contains the words huntman's and huntman Doc2 contains the word huntman's Doc3 contains the word huntman When I search for huntman's I get Doc1 and Doc2 When I search for +huntman's I get