[ 
https://issues.apache.org/jira/browse/SOLR-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791557#action_12791557
 ] 

Mark Miller commented on SOLR-1277:
-----------------------------------

bq. maybe this is already in the spec

Nothing is completely nailed down in the spec - Yonik has done a bunch of work 
on the SolrCloud page, but a lot of that is: we could do this, or we could do 
that, or we might do this. We haven't really nailed much down firmly. Still 
pretty high level at the moment.

bq. How are we addressing a failed connection to a slave server, and instead of 
failing the request, re-making the request to an adjacent slave?

We haven't really gotten there. But we want to cover that. What do you propose?

The more we get these discussions going, the faster things will start getting 
nailed down ...
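To get that discussion going, here is one possible shape for the adjacent-slave retry, sketched as plain Java with no Solr or ZooKeeper dependency. Everything here is hypothetical (the class and method names are not from any patch): the client walks an ordered list of replica URLs and only fails the request once every replica has failed.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: route a request to the first healthy replica,
// falling through to the next ("adjacent") one on a connection failure.
public class ReplicaFailover {

    /** Thrown once every replica in the list has been tried and failed. */
    public static class AllReplicasFailed extends RuntimeException {
        public AllReplicasFailed(String msg) { super(msg); }
    }

    /**
     * Apply {@code request} to each replica URL in order; a RuntimeException
     * from one replica causes a retry against the next one in the list.
     */
    public static <R> R requestWithFailover(List<String> replicaUrls,
                                            Function<String, R> request) {
        RuntimeException last = null;
        for (String url : replicaUrls) {
            try {
                return request.apply(url);
            } catch (RuntimeException e) {
                last = e;  // remember the failure, move on to the next replica
            }
        }
        throw new AllReplicasFailed("all " + replicaUrls.size()
                + " replicas failed; last error: " + last);
    }

    public static void main(String[] args) {
        // Simulated cluster: the first replica is down, the second answers.
        String result = requestWithFailover(
                List.of("http://solr1:8983", "http://solr2:8983"),
                url -> {
                    if (url.contains("solr1"))
                        throw new RuntimeException("connection refused");
                    return "ok from " + url;
                });
        System.out.println(result);  // ok from http://solr2:8983
    }
}
```

An open question in this sketch is where the replica list comes from and how stale it can be; presumably the naming service (ZK) would supply it, which ties back to the liveness discussion below.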

bq. A failure is a failure and whether it's the GC or something else, it's 
really the same thing.

It's a kind of arbitrary distinction. You're saying we would call a GC pause of 4 
seconds (under the ZK client timeout) not a failure, and a GC pause of 6 
seconds (over the ZK client timeout) a failure. I'm not claiming any 
distinction is better than another though - just trying to work out the 
directions we want to go so I can start paddling.

I can code till the cows come home with no input, but you might not like the 
results :)
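To make that distinction concrete: in an ephemeral-node scheme, "failure" is simply whatever outlasts the ZK session timeout. A minimal simulation (plain Java, no ZooKeeper dependency; the 5-second timeout is an arbitrary example value, not a proposed default):

```java
// Toy model of session-timeout-based liveness: a node is "live" as long as
// its last heartbeat falls within the session timeout, so a GC pause shorter
// than the timeout is invisible and a longer one reads as a failure.
public class SessionLiveness {
    static final long SESSION_TIMEOUT_MS = 5_000;  // example value only

    /** True if the node's heartbeat gap exceeds the session timeout. */
    static boolean isConsideredFailed(long lastHeartbeatMs, long nowMs) {
        return nowMs - lastHeartbeatMs > SESSION_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        long heartbeat = 0;
        // A 4-second pause: under the timeout, so not a failure.
        System.out.println(isConsideredFailed(heartbeat, 4_000));  // false
        // A 6-second pause: over the timeout, so treated as a failure.
        System.out.println(isConsideredFailed(heartbeat, 6_000));  // true
    }
}
```

The line really is that arbitrary - the only knob is the timeout itself, which trades false positives (long GC pauses expiring a healthy node) against detection latency for real failures.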

> Implement a Solr specific naming service (using Zookeeper)
> ----------------------------------------------------------
>
>                 Key: SOLR-1277
>                 URL: https://issues.apache.org/jira/browse/SOLR-1277
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: log4j-1.2.15.jar, SOLR-1277.patch, SOLR-1277.patch, 
> SOLR-1277.patch, SOLR-1277.patch, zookeeper-3.2.1.jar
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> The goal is to give Solr server clusters self-healing attributes
> where if a server fails, indexing and searching don't stop and
> all of the partitions remain searchable. For configuration, the
> ability to centrally deploy a new configuration without servers
> going offline.
> We can start with basic failover and go from there?
> Features:
> * Automatic failover (i.e. when a server fails, clients stop
> trying to index to or search it)
> * Centralized configuration management (i.e. new solrconfig.xml
> or schema.xml propagates to a live Solr cluster)
> * Optionally allow shards of a partition to be moved to another
> server (i.e. if a server gets hot, move the hot segments out to
> cooler servers). Ideally we'd have a way to detect hot segments
> and move them seamlessly. With NRT this becomes somewhat more
> difficult but not impossible?
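The centralized-configuration feature above could be sketched like this (plain Java, standing in for a watched ZooKeeper znode; the names here are hypothetical, nothing is in the patch): a config registry notifies every subscribed node when a new solrconfig.xml version is published, so nodes pick it up without going offline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model of centralized config propagation: a single registry
// (standing in for a watched znode) pushes each newly published
// config version to every live, subscribed node.
public class ConfigRegistry {
    private final List<Consumer<String>> watchers = new ArrayList<>();
    private String current;

    /** A node subscribes to config changes, like setting a ZK watch. */
    public void watch(Consumer<String> onChange) {
        watchers.add(onChange);
    }

    /** Publish a new config; all watchers see it without restarting. */
    public void publish(String config) {
        current = config;
        for (Consumer<String> w : watchers) {
            w.accept(config);
        }
    }

    public String current() {
        return current;
    }

    public static void main(String[] args) {
        ConfigRegistry registry = new ConfigRegistry();
        List<String> applied = new ArrayList<>();
        registry.watch(cfg -> applied.add("node1 applied " + cfg));
        registry.watch(cfg -> applied.add("node2 applied " + cfg));
        registry.publish("solrconfig.xml v2");
        System.out.println(applied);
    }
}
```

Note that real ZK watches are one-shot and must be re-registered after each notification, a wrinkle this sketch glosses over.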

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
