[ https://issues.apache.org/jira/browse/SOLR-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788924#action_12788924 ]

Patrick Hunt commented on SOLR-1277:
------------------------------------

I'm not familiar with Solr's requirements, but at a higher level I wanted to
point out that when designing your ZooKeeper model you should keep scaling
issues in mind; identifying "patterns" is also very useful. See this link for
some background (discussions we are having with HBase in a similar vein):
http://wiki.apache.org/hadoop/ZooKeeper/HBaseUseCases

The basic concerns users should think about:

1) The number of sessions (clients) the ZK service will maintain (for Solr I 
think this is small -- tens or hundreds of sessions, right?)

2) The number and size of znodes.
a) Memory requirements are the main concern, along with GC effects, since 
current Sun VMs are poor with respect to GC pauses.
b) We generally discourage large data sizes on znodes (we generally suggest 
< 10k, and < 1k is even better) because the ZK service copies this data from 
server->leader->followers as part of a write -- so what I'm saying is that 
large data can slow your write performance.
c) A second issue regarding znode data size: we don't have partial read/write 
operations, so you want to partition your data across multiple znodes rather 
than having one large znode.
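Point (c) above can be sketched as follows. This is a hypothetical helper, not
part of any Solr or ZooKeeper API: it splits a large config blob into chunks
that each stay under the suggested 10k limit, and computes an illustrative
per-chunk znode path (e.g. /solr/config/part-0000, part-0001, ...). The chunk
size, path scheme, and class name are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: partition one large blob across several small znodes
// instead of storing it in a single large znode. Each chunk stays under the
// ~10k guideline discussed above; the path scheme is made up for this example.
public class ZnodeChunker {
    static final int MAX_CHUNK = 10 * 1024; // stay under the suggested 10k

    // Split data into chunk-sized byte arrays, one per znode.
    public static List<byte[]> chunk(byte[] data) {
        List<byte[]> parts = new ArrayList<>();
        for (int off = 0; off < data.length; off += MAX_CHUNK) {
            int len = Math.min(MAX_CHUNK, data.length - off);
            byte[] part = new byte[len];
            System.arraycopy(data, off, part, 0, len);
            parts.add(part);
        }
        return parts;
    }

    // Path for the i-th chunk under a parent znode (illustrative naming).
    public static String chunkPath(String parent, int i) {
        return String.format("%s/part-%04d", parent, i);
    }
}
```

A writer would then create one znode per chunk under the parent; a reader
lists the children and reassembles them in order, which also keeps each
individual read/write small.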

3) The number of watches. Typically you'll be using watches to dynamically 
update Solr based on changes to the system. You want to think carefully about 
the watches you are setting (in particular, you want to limit the "herd" 
effect).
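To make the herd-effect point concrete, here is a sketch of the standard
avoidance pattern used with sequential ephemeral znodes (e.g. in lock or
leader-election recipes): each client watches only its immediate predecessor
rather than the parent znode, so when a node disappears exactly one client
wakes up instead of all of them. The class and method names are hypothetical;
only the predecessor-selection logic is shown, not the actual watch calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of herd-effect avoidance with sequential znodes:
// pick the single predecessor node to watch, instead of having every
// client watch the parent and stampede on each change.
public class PredecessorWatch {
    // Given the children of the lock/election znode and this client's own
    // node name, return the node to watch, or null if this client holds the
    // lowest sequence number (i.e. it already holds the lock / leadership).
    public static String nodeToWatch(List<String> children, String me) {
        List<String> sorted = new ArrayList<>(children);
        // ZooKeeper's zero-padded sequence suffixes sort lexicographically.
        Collections.sort(sorted);
        int idx = sorted.indexOf(me);
        return idx <= 0 ? null : sorted.get(idx - 1);
    }
}
```

With this pattern, a failed or released node triggers a single watch
notification, and the woken client re-checks the children list before acting.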


> Implement a Solr specific naming service (using Zookeeper)
> ----------------------------------------------------------
>
>                 Key: SOLR-1277
>                 URL: https://issues.apache.org/jira/browse/SOLR-1277
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: log4j-1.2.15.jar, SOLR-1277.patch, SOLR-1277.patch, 
> SOLR-1277.patch, SOLR-1277.patch, zookeeper-3.2.1.jar
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> The goal is to give Solr server clusters self-healing attributes
> where if a server fails, indexing and searching don't stop and
> all of the partitions remain searchable. For configuration, the
> ability to centrally deploy a new configuration without servers
> going offline.
> We can start with basic failover and go from there.
> Features:
> * Automatic failover (i.e. when a server fails, clients stop
> trying to index to or search it)
> * Centralized configuration management (i.e. new solrconfig.xml
> or schema.xml propagates to a live Solr cluster)
> * Optionally allow shards of a partition to be moved to another
> server (i.e. if a server gets hot, move the hot segments out to
> cooler servers). Ideally we'd have a way to detect hot segments
> and move them seamlessly. With NRT this becomes somewhat more
> difficult but not impossible?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
