Re: Finding out optimal hash ranges for shard split

2015-05-06 Thread anand.mahajan
Okay - Thanks for the confirmation Shalin. Could this be a feature request in the Collections API - that we have a Split shard dry run API that accepts sub-shards count as a request param and returns the optimal shard ranges for the number of sub-shards requested to be created along with the

Re: Finding out optimal hash ranges for shard split

2015-05-06 Thread anand.mahajan
Yes - I'm using 2 level composite ids and that has caused the imbalance for some shards. It's cars data and the composite ids are of the form year-make!model-and a couple of other specifications, e.g. 2013Ford!Edge!123456 - but there are just far too many Ford 2013 or 2011 cars that go and occupy the
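
A rough way to see how lopsided the first-level keys are, before looking at shards at all, is to count documents per prefix. The sketch below is only illustrative: the function names are hypothetical, and the ids in the usage note are made-up samples in the same shape as 2013Ford!Edge!123456.

```python
from collections import Counter


def prefix_counts(doc_ids):
    """Count documents per first-level composite-id key (the text before the first '!')."""
    return Counter(doc_id.split("!", 1)[0] for doc_id in doc_ids)


def skew(counts):
    """Ratio of the largest prefix count to the mean count; 1.0 means perfectly even."""
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean
```

For example, feeding it `["2013Ford!Edge!1", "2013Ford!Edge!2", "2013Ford!Focus!3", "2011Kia!Rio!4"]` (invented ids) puts three of the four documents under `2013Ford`, which is the kind of concentration described above.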

Re: Finding out optimal hash ranges for shard split

2015-05-05 Thread anand.mahajan
Looks like it's not possible to find out the optimal hash ranges for a split before you actually split it. So the only way out is to keep splitting the large sub-shards?

Finding out optimal hash ranges for shard split

2015-05-03 Thread anand.mahajan
Hi all, Before doing a splitshard - Is there a way to figure out optimal hash ranges for the shard that will evenly split the documents on the new sub-shards that get created? Sort of a dry-run to the actual split shard command with ranges parameter specified with it that just shows the number of
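
One possible client-side approximation of such a dry run, sketched under assumptions: if the hash of each document id in the shard were available as a signed 32-bit integer (computing those hashes the same way the router does is not shown here and is assumed), cut points could be taken at quantiles of the sorted hashes and formatted as the hex value for the `ranges` parameter. The function names are hypothetical and this is not an existing Solr API.

```python
def to_solr_hex(value):
    """Format a signed 32-bit hash as unsigned hex, e.g. -1 becomes 'ffffffff'."""
    return format(value & 0xFFFFFFFF, "x")


def balanced_ranges(hashes, start, end, parts):
    """Cut the inclusive range [start, end] so each sub-range holds about the same number of hashes."""
    ordered = sorted(h for h in hashes if start <= h <= end)
    if parts < 1 or len(ordered) < parts:
        raise ValueError("need at least one sampled hash per sub-range")
    ranges = []
    lo = start
    for i in range(1, parts):
        # Last sampled hash that should fall into the i-th sub-range.
        cut = ordered[i * len(ordered) // parts - 1]
        if cut < lo or cut >= end:
            raise ValueError("too few distinct hashes to cut this range")
        ranges.append((lo, cut))
        lo = cut + 1
    ranges.append((lo, end))
    return ranges


def ranges_param(sub_ranges):
    """Join sub-ranges as comma-separated hex pairs, the shape SPLITSHARD's ranges parameter takes."""
    return ",".join(f"{to_solr_hex(lo)}-{to_solr_hex(hi)}" for lo, hi in sub_ranges)
```

With sampled hashes `[0, 1, 2, 3, 100, 200, 300, 400]` in the range 0 to 1000 and two parts, this yields `0-3,4-3e8`: four documents on each side, even though the two ranges are very different in width.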

Re: Leaders in Recovery Failed state

2015-02-09 Thread anand.mahajan
Erick Erickson erickerickson at gmail.com writes: What version of Solr? On Tue, Jan 20, 2015 at 7:07 AM, anand.mahajan anand at zerebral.co.in wrote: Hi all, I have a cluster with 36 Shards and 3 replica per shard. I had to recently restart the entire cluster - most

Delete Replica API Async Calls not being processed

2015-02-09 Thread anand.mahajan
Hi, I needed to delete a couple replica for a shard and used the Async Collections API calls to do that. I see all my requests in the 'submitted' state but none have been processed yet. (been 4 hours or so) How do I know whether these requests are under process at all? And if required how could
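
For checking on a submitted async call, the Collections API has a REQUESTSTATUS action that takes the `requestid` used at submission. A minimal sketch that only builds the URL; the base URL and request id in the usage note are placeholders, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def request_status_url(base_url, request_id):
    """Build a Collections API REQUESTSTATUS URL for a previously submitted async request id."""
    params = {"action": "REQUESTSTATUS", "requestid": request_id, "wt": "json"}
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"
```

Calling `request_status_url("http://localhost:8983/solr", "1000")` gives a URL that can be opened in a browser or fetched with any HTTP client to see the state reported for that id.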

Leaders in Recovery Failed state

2015-01-20 Thread anand.mahajan
Hi all, I have a cluster with 36 Shards and 3 replica per shard. I had to recently restart the entire cluster - most of the shards replica are back up - but a few shards have not had any leaders for a long long time (close to 18 hours now) - I tried reloading these cores and even the servlet

Leaders in Recovery Failed state

2015-01-20 Thread anand.mahajan
Hi all, I have a cluster with 36 Shards and 3 replica per shard. I had to recently restart the entire cluster - most of the shards replica are back up - but a few shards have not had any leaders for a long long time (close to 18 hours now) - I tried reloading these cores and even the servlet

SolrCloud Slow to boot up

2014-09-25 Thread anand.mahajan
Hello all, Hosted a SolrCloud - 6 Nodes - 36 Shards x 3 Replica each - 108 cores across 6 servers. Moved in about 250M documents in this cluster. When I restart this cluster - only the leaders per shard come up live instantly (within a minute) and all the replicas are shown as Recovering on the

Re: SolrCloud Slow to boot up

2014-09-25 Thread anand.mahajan
1. I've hosted it with Helios v 0.07 that ships with Solr 4.10 2. Change to solrconfig.xml - a. commits every 10 mins b. soft commits every 10 secs c. disabled all caches as the usage is very random (no end users only services doing the searches) and mostly single requests d. use cold

Re: SolrCloud Scale Struggle

2014-08-10 Thread anand.mahajan
Hello all, Thank you for your suggestions. With the autoCommit (every 10 mins) and softCommit (every 10 secs) frequencies reduced things work much better now. The CPU usage has gone down considerably too (by about 60%) and the read/write throughput is showing considerable improvements too.

Re: SolrCloud Scale Struggle

2014-08-02 Thread anand.mahajan
Thank you everyone for your responses. Increased the hard commit to 10 mins and autoSoftCommit to 10 secs. (I won't really need a real time get - tweaked the app code to cache the doc and use the app side cached version instead of fetching it from Solr) Will watch it for a day or two and clock the

Re: SolrCloud Scale Struggle

2014-08-02 Thread anand.mahajan
Thanks Shawn. I'm using 2 level composite id routing right now. These are all Used Cars listings and all search queries always have car year and make in the search criteria - hence that made sense to have Year+Make as level 1 in the composite id. Beyond that the second level composite id is based

SolrCloud Scale Struggle

2014-08-01 Thread anand.mahajan
Hello all, Struggling to get this going with SolrCloud - Requirement in brief : - Ingest about 4M Used Cars listings a day and track all unique cars for changes - 4M automated searches a day (during the ingestion phase to check if a doc exists in the index (based on values of 4-5 key fields)

Re: SolrCloud Scale Struggle

2014-08-01 Thread anand.mahajan
Oops - my bad - It's autoSoftCommit that is set after every doc and not an autoCommit. Following snippet from the solrconfig - <autoCommit> <maxTime>1</maxTime> <openSearcher>true</openSearcher> </autoCommit> <autoSoftCommit> <maxDocs>1</maxDocs> </autoSoftCommit> Shall I increase

Re: SolrCloud Scale Struggle

2014-08-01 Thread anand.mahajan
Thanks for the reply Shalin. 1. I'll try increasing the softCommit interval and the autoSoftCommit too. One mistake I made that I realized just now is that I am using /solr/select and expecting it to do an NRT - for NRT search it's got to be /select/get handler that needs to be used. Please