On Mon, Jun 24, 2013 at 12:16 AM, Utkarsh Sengar <utkarsh2...@gmail.com> wrote:
> Thanks!
>
> 1. "shards.tolerant=true" works, shouldn't this parameter be default?
A whole shard being unavailable is a big deal. The default behavior
should not hide such a condition. Some people may be willing to take a
hit on coverage to preserve availability, and for them Solr has such an
option.

> 2. Regarding zk, yes it should be outside the solr nodes and I am
> evaluating what difference it makes.
>
> 3. Regarding usecase: Daily queries will be about 100k to 200k, not much.
> The total data to be indexed is about 45M documents with a total size of
> 20GB. 3 nodes (sharded and RAM of 30GB each) with 3 replicas sounds like
> overkill for this?

So three shards with two replicas each? That sounds alright considering
your boxes have a lot of RAM, YMMV. Your best bet is to run performance
tests on your data with the kind of queries and traffic you expect.

> Thanks,
> -Utkarsh
>
>
> On Sat, Jun 22, 2013 at 8:53 PM, Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
>
>> Use shards.tolerant=true to return documents that are available in the
>> shards that are still alive.
>>
>> Typically people set up ZooKeeper outside of Solr so that Solr nodes
>> can be added/removed easily independent of ZooKeeper, plus it isolates
>> ZK from large GC pauses due to Solr's garbage. See
>> http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A7
>>
>> Depending on your use-case, 2-3 replicas might be okay. We don't have
>> enough information to answer that question.
>>
>>
>>
>> On Sat, Jun 22, 2013 at 10:40 PM, Utkarsh Sengar <utkarsh2...@gmail.com>
>> wrote:
>> > Thanks Anshum.
>> >
>> > Sure, creating a replica will make it failure resistant, but the death
>> > of one shard should not make the whole cluster unusable.
>> >
>> > 1/3rd of the keys hosted in the killed shard should be unavailable but
>> > others should be available. Right?
>> >
>> > Also, any suggestions on the recommended size of zk and solr cluster
>> > size and configuration?
>> >
>> > Example: 3 shards with 3 replicas and 3 zk processes running on the
>> > same solr node sounds acceptable?
>> > (Total of 6 VMs)
>> >
>> > Thanks,
>> > -Utkarsh
>> >
>> > On Jun 22, 2013, at 4:20 AM, Anshum Gupta <ans...@anshumgupta.net>
>> > wrote:
>> >
>> >> You need to have at least 1 replica from each shard for the SolrCloud
>> >> setup to work for you.
>> >> When you kill 1 shard, you essentially are taking away 1/3 of the
>> >> range of the shard key.
>> >>
>> >>
>> >> On Sat, Jun 22, 2013 at 4:31 PM, Utkarsh Sengar <utkarsh2...@gmail.com
>> >> >wrote:
>> >>
>> >>> Hello,
>> >>>
>> >>> I am testing a 3-node SolrCloud cluster with 3 shards. 3 zk nodes are
>> >>> running in a different process on the same machines.
>> >>>
>> >>> I wanted to know the recommended size of a SolrCloud cluster (min zk
>> >>> nodes?)
>> >>>
>> >>> This is the SolrCloud dump: https://gist.github.com/utkarsh2012/5840455
>> >>>
>> >>> And, I am not sure if I am hitting this frustrating bug or this is
>> >>> just a configuration error on my side. When I kill any *one* of the
>> >>> nodes, the whole cluster stops responding and I get this response
>> >>> when I query either of the two alive nodes.
>> >>>
>> >>> {
>> >>>   "responseHeader":{
>> >>>     "status":503,
>> >>>     "QTime":2,
>> >>>     "params":{
>> >>>       "indent":"true",
>> >>>       "q":"*:*",
>> >>>       "wt":"json"}},
>> >>>   "error":{
>> >>>     "msg":"no servers hosting shard: ",
>> >>>     "code":503}}
>> >>>
>> >>> I see this exception:
>> >>> 952399 [qtp516992923-74] ERROR org.apache.solr.core.SolrCore –
>> >>> org.apache.solr.common.SolrException: no servers hosting shard:
>> >>>   at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:149)
>> >>>   at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
>> >>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>> >>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>> >>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>> >>>   at java.lang.Thread.run(Thread.java:662)
>> >>>
>> >>> --
>> >>> Thanks,
>> >>> -Utkarsh
>> >>
>> >>
>> >> --
>> >> Anshum Gupta
>> >> http://www.anshumgupta.net
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>
> --
> Thanks,
> -Utkarsh

--
Regards,
Shalin Shekhar Mangar.
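
[Editor's note: a minimal sketch of the shards.tolerant=true workaround discussed in this thread. The host, port, and collection name below are hypothetical placeholders, not values from the thread; substitute your own cluster's.]

```shell
# Hypothetical endpoint and collection name; adjust for your cluster.
SOLR_HOST="localhost:8983"
COLLECTION="collection1"

# shards.tolerant=true asks the distributed query to return whatever the
# live shards can serve, instead of failing the whole request with the
# 503 "no servers hosting shard" error when one shard has no live replicas.
QUERY_URL="http://${SOLR_HOST}/solr/${COLLECTION}/select?q=*:*&wt=json&shards.tolerant=true"
echo "${QUERY_URL}"

# To run it against a live cluster:
# curl -s "${QUERY_URL}"
```

Note that results returned this way may be missing the killed shard's portion of the documents, which is exactly the coverage-for-availability trade-off described above.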