Re: Joining more than 2 collections

2017-05-02 Thread Zheng Lin Edwin Yeo
Hi Joel, Thanks for the info. Regards, Edwin On 3 May 2017 at 02:04, Joel Bernstein wrote: > Also take a look at the documentation for the "fetch" streaming expression. > > Joel Bernstein > http://joelsolr.blogspot.com/ > > On Tue, May 2, 2017 at 2:03 PM, Joel Bernstein

Re: Suggester uses lots of 'Page cache' memory

2017-05-02 Thread Damien Kamerman
Thanks Shawn, I'll have to look closer into this. On 3 May 2017 at 12:10, Shawn Heisey wrote: > On 5/2/2017 6:46 PM, Damien Kamerman wrote: > > Shalin, yes I think it's a case of the Suggester build hitting the index > > all at once. I'm thinking it's hitting all docs, even

Re: Suggester uses lots of 'Page cache' memory

2017-05-02 Thread Shawn Heisey
On 5/2/2017 6:46 PM, Damien Kamerman wrote: > Shalin, yes I think it's a case of the Suggester build hitting the index > all at once. I'm thinking it's hitting all docs, even the ones without > fields relevant to the suggester. > > Shawn, I am using ZFS, though I think it's comparable to other

Re: Suggester uses lots of 'Page cache' memory

2017-05-02 Thread Damien Kamerman
Shalin, yes I think it's a case of the Suggester build hitting the index all at once. I'm thinking it's hitting all docs, even the ones without fields relevant to the suggester. Shawn, I am using ZFS, though I think it's comparable to other setups. mmap() should still be faster, while the ZFS ARC

Re: Reload an unloaded core

2017-05-02 Thread David Lee
I have similar needs but for a slightly different use-case. In my case, I am breaking up cores / indexes based on the month and year so that I can add an alias that always points to the last few months, but beyond that I want to simply unload the other indexes once they get past a few months

Re: Reload an unloaded core

2017-05-02 Thread Shashank Pedamallu
Thank you Simon, Erick and Shawn for your replies. Unfortunately, restarting Solr is not an option for me. So, I’ll try to follow the steps given by Shawn to see where I’m standing. Btw, I’m using Solr 6.4.2. Shawn, once again thank you very much for the detailed reply. Thanks, Shashank

Re: Reload an unloaded core

2017-05-02 Thread Shawn Heisey
On 5/2/2017 10:53 AM, Shashank Pedamallu wrote: > I want to unload a core from Solr without deleting data-dir or instance-dir. > I’m performing some operations on the data-dir after this and then I would > like to reload the core from the same data-dir. These are the things I tried: > > 1.

Re: solr-6.3.0 error port is running already

2017-05-02 Thread Rick Leir
Satya Say netstat --inet -lP You might need to add -ipv4 to that command. The P might be lower case (I am on the bus!). And the output might show misleading service names, see /etc/services. Cheers-- Rick On May 2, 2017 3:10:30 PM EDT, Satya Marivada wrote: >Hi, > >I

Re: Reload an unloaded core

2017-05-02 Thread simon
the core properties definitely disappears if you use a configset, as in
#
#Written by CorePropertiesLocator
#Tue May 02 20:19:40 UTC 2017
name=testcore
dataDir=/indexes/solrindexes/testcore
configSet=myconf
Using a conf directory, as in
#Written by CorePropertiesLocator
#Tue May 02 20:30:44

Re: Solr performance on EC2 linux

2017-05-02 Thread Tomás Fernández Löbbe
I remember seeing some performance impact (even when not using it) and it was attributed to the calls to System.nanoTime. See SOLR-7875 and SOLR-7876 (fixed for 5.3 and 5.4). Those two Jiras fix the impact when timeAllowed is not used, but I don't know if there were more changes to improve the

Re: Reload an unloaded core

2017-05-02 Thread Erick Erickson
IIRC, the core.properties file _is_ renamed to core.properties.unloaded or something like that. Yeah, this is something of a pain. The inverse of "unload" is "create" but you have to know exactly how to create a core, and in SolrCloud mode that's...interesting. It's much safer to bring the Solr

Re: solr-6.3.0 error port is running already

2017-05-02 Thread Erick Erickson
Well, if an ephemeral node exists, restarting your ZooKeeper ensemble will delete it. Not sure what the precursor here is. Are you absolutely and totally sure you don't have a Solr process still running on the node you try to start that shows this error? 'ps aux | grep solr' will show you all of

Re: solr-6.3.0 error port is running already

2017-05-02 Thread Satya Marivada
Any ideas? "null:org.apache.solr.common.SolrException: A previous ephemeral live node still exists. Solr cannot continue. Please ensure that no other Solr process using the same port is running already." Not sure if enabling JMX has caused this. Thanks, Satya On Tue, May 2, 2017 at 3:10 PM

solr-6.3.0 error port is running already

2017-05-02 Thread Satya Marivada
Hi, I am getting the below exception all of a sudden with solr-6.3.0. "null:org.apache.solr.common.SolrException: A previous ephemeral live node still exists. Solr cannot continue. Please ensure that no other Solr process using the same port is running already." We are using external zookeeper

Re: Joining more than 2 collections

2017-05-02 Thread Joel Bernstein
Also take a look at the documentation for the "fetch" streaming expression. Joel Bernstein http://joelsolr.blogspot.com/ On Tue, May 2, 2017 at 2:03 PM, Joel Bernstein wrote: > Yes, you can join more than one collection with Streaming Expressions. Here are > a few things to keep

Re: Joining more than 2 collections

2017-05-02 Thread Joel Bernstein
Yes, you can join more than one collection with Streaming Expressions. Here are a few things to keep in mind.
* You'll likely want to use the parallel function around the largest join. You'll need to use the join keys as the partitionKeys.
* innerJoin: requires that the streams be sorted on the join

Re: Reload an unloaded core

2017-05-02 Thread simon
I ran into the exact same situation recently. I unloaded from the browser GUI which does not delete the data or instance dirs, but does delete core.properties. I couldn't find any API either so I eventually manually recreated core.properties and restarted Solr. Would be nice if the

Re: Solr performance on EC2 linux

2017-05-02 Thread Walter Underwood
Hmm, has anyone measured the overhead of timeAllowed? We use it all the time. If nobody has, I’ll run a benchmark with and without it. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On May 2, 2017, at 9:52 AM, Chris Hostetter

Re: Clean checkbox on DIH

2017-05-02 Thread Mahmoud Almokadem
Thanks Shawn, We already use the admin UI for testing and bulk uploads. We are using curl scripts for automation process. I'll report the issues regarding the new UI on JIRA. Thanks, Mahmoud On Tuesday, May 2, 2017, Shawn Heisey wrote: > On 5/2/2017 6:53 AM, Mahmoud

Re: Solr performance on EC2 linux

2017-05-02 Thread Chris Hostetter
: I specify a timeout on all queries,
Ah -- ok, yeah -- you mean using "timeAllowed", correct? If the root issue you were seeing is in fact clocksource related, then using timeAllowed would probably be a significant compounding factor there since it would involve a lot of time checks in a

Reload an unloaded core

2017-05-02 Thread Shashank Pedamallu
Hi all, I want to unload a core from Solr without deleting data-dir or instance-dir. I’m performing some operations on the data-dir after this and then I would like to reload the core from the same data-dir. These are the things I tried: 1. Reload api – throws an exception saying no such

Re: Clean checkbox on DIH

2017-05-02 Thread Rick Leir
Mahmoud, Would it help to have field validation? If the DIH fields are still default when you press execute, then field validation puts out a message and blocks any clearing. Just an idea, please excuse if I am off track. -- Rick On May 2, 2017 8:53:12 AM EDT, Mahmoud Almokadem

Re: choosing placement upon RESTORE

2017-05-02 Thread xavier jmlucjav
Thanks Mikhail, that sounds like it would help me as it allows you to set createNodeSet on RESTORE calls. On Tue, May 2, 2017 at 2:50 PM, Mikhail Khludnev wrote: > This sounds relevant, but different to https://issues.apache.org/jira/browse/SOLR-9527 > You may want to follow

Re: Add new Solr Node to existing Solr setup

2017-05-02 Thread Erick Erickson
Along with Shawn's comments, if you create a new collection, consider "oversharding". Say you calculate (more later) that you can fit your collection in N shards, but you expect your collection to triple over time. _Start out_ with 3N shards; many of them will be co-located. As you get more docs

Re: IndexFormatTooNewException - MapReduceIndexerTool for PDF files

2017-05-02 Thread Shawn Heisey
On 5/1/2017 10:48 PM, ecos wrote: > The cause of the error is: > org.apache.lucene.index.IndexFormatTooNewException: Format version is not > supported (resource: BufferedChecksumIndexInput (segments_1)): 4 (needs to > be between 0 and 3). > > Reading out there I found the exception is thrown when

Re: Clean checkbox on DIH

2017-05-02 Thread Shawn Heisey
On 5/2/2017 6:53 AM, Mahmoud Almokadem wrote: > And for the dataimport I always use the old UI because the new UI > doesn't show the live update and sometimes doesn't show the > configuration. I think there are many bugs on the new UI. Do you know if these problems have been reported in the Jira

Re: Suggester uses lots of 'Page cache' memory

2017-05-02 Thread Shawn Heisey
On 5/1/2017 10:52 PM, Damien Kamerman wrote: > I have a Solr v6.4.2 collection with 12 shards and 2 replicas. Each > replica uses about 14GB disk usage. I'm using Solaris 11 and I see the > 'Page cache' grow by about 7GB for each suggester replica I build. The > suggester index itself is very

Re: Clean checkbox on DIH

2017-05-02 Thread Mahmoud Almokadem
Thanks Shawn for your clarifications, I think showing a confirmation message saying that "The whole index will be cleaned" when the clean option is checked will be good. I always remove the check from the file /opt/solr/server/solr-webapp/webapp/tpl/dataimport.html after installing solr but when

Re: IndexFormatTooNewException - MapReduceIndexerTool for PDF files

2017-05-02 Thread ravi432
Hi ecos, Does it produce Solr documents when you run the MapReduceIndexerTool in debug mode? If not, can you run it in debug mode and send any errors?

Re: Add new Solr Node to existing Solr setup

2017-05-02 Thread Shawn Heisey
On 5/2/2017 4:24 AM, Venkateswarlu Bommineni wrote: > We have a Solr setup with the configuration below. > > 1) 1 collection with one shard > 2) 4 Solr nodes > 3) a replication factor of 4, with one replica on each Solr node. > > As of now, it's working fine. But going forward its size may reach

Add new Solr Node to existing Solr setup

2017-05-02 Thread Venkateswarlu Bommineni
Hello Team, We have a Solr setup with the configuration below. 1) 1 collection with one shard 2) 4 Solr nodes 3) a replication factor of 4, with one replica on each Solr node. As of now, it's working fine. But going forward its size may grow large and we would need to add a new node. Could you guys

Re: Suggester uses lots of 'Page cache' memory

2017-05-02 Thread Shalin Shekhar Mangar
On Tue, May 2, 2017 at 10:22 AM, Damien Kamerman wrote: > Hi all, > > I have a Solr v6.4.2 collection with 12 shards and 2 replicas. Each replica > uses about 14GB disk usage. I'm using Solaris 11 and I see the 'Page cache' > grow by about 7GB for each suggester replica I

Re: CDCR with SSL enabled

2017-05-02 Thread Xie, Sean
From the QUEUE action, the output is: 0 0 34741356 2 stopped On 5/2/17, 1:43 AM, "Xie, Sean" wrote: Does CDCR support SSL encrypted SolrCloud? I have two clusters started with SSL, and CDCR setup instruction is followed on source and target. However,