Hello Dong,

Thanks for the very well written KIP. I had a general thought on the ZK
path management, wondering if the following alternative would work:

1. Bump up versions in "/brokers/topics/[topic]" and
"/brokers/topics/[topic]/partitions/[partitionId]/state"
to 2, in which the replica id is no longer an int but a string.

2. Bump up versions in "/brokers/ids/[brokerId]" to add another field:

{ "fields":
    [ {"name": "version", "type": "int", "doc": "version id"},
      {"name": "host", "type": "string", "doc": "ip address or host name of the broker"},
      {"name": "port", "type": "int", "doc": "port of the broker"},
      {"name": "jmx_port", "type": "int", "doc": "port for jmx"},
      {"name": "log_dirs",
       "type": {"type": "array",
                "items": "int",
                "doc": "an array of the ids of the log dirs in the broker"}
      }
    ]
}

3. The replica id can now be either a string-typed integer, indicating that
all partitions on the broker are still treated as failed or not as a whole,
i.e. no JBOD support is needed; or a string of the form "[brokerID]-[dirID]",
which brokers / controllers can still parse to determine which broker is
hosting this replica. In the latter case the management of replicas is finer
grained: no longer at the broker level (i.e. if the broker dies all its
replicas go offline) but at the broker-dir level.

4. When one of a broker's dirs fails, the broker can modify its
"/brokers/ids/[brokerId]" registry and remove the dir id. The controller,
already listening on this path, is then notified and can run the replica
assignment accordingly, with the replica id computed as above.
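
To make points 3 and 4 a bit more concrete, here is a rough Python sketch of
how a controller might parse the two replica id forms and work out which
replicas went offline after a broker drops a dir id from its "log_dirs"
field. The names ReplicaId, parse_replica_id and offline_replicas are made up
for illustration; none of them come from the KIP or from the Kafka code base.

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class ReplicaId(NamedTuple):
    broker_id: int
    dir_id: Optional[int]  # None means a broker-level replica (no JBOD)


def parse_replica_id(replica_id: str) -> ReplicaId:
    """Parse "5" (broker level) or "5-2" (broker-dir level)."""
    broker_part, sep, dir_part = replica_id.partition("-")
    # int() raises ValueError on malformed input such as "5-" or "a-1".
    return ReplicaId(int(broker_part), int(dir_part) if sep else None)


def offline_replicas(replicas: Iterable[str], broker_id: int,
                     live_dirs: Set[int]) -> List[str]:
    """Replicas on broker_id whose dir id is no longer in its registry.

    live_dirs is assumed to be the "log_dirs" array the controller reads
    from the broker's registry after the change notification.
    """
    offline = []
    for replica in replicas:
        parsed = parse_replica_id(replica)
        if (parsed.broker_id == broker_id
                and parsed.dir_id is not None
                and parsed.dir_id not in live_dirs):
            offline.append(replica)
    return offline
```

For example, if broker 1 removes dir 1 from its registry so that only dir 0
remains, offline_replicas(["1-0", "1-1", "2-0", "1"], 1, {0}) picks out just
"1-1"; the broker-level replica "1" is left to the existing broker-liveness
handling.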


By doing this, the controller can also naturally reassign replicas between
dirs within the same broker.
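
As a small illustration of such an intra-broker move, assuming the
"[brokerID]-[dirID]" string form from point 3, the reassignment amounts to
swapping one replica id in the partition's assignment list. The function name
move_replica_to_dir is hypothetical and only meant as a sketch:

```python
from typing import List


def move_replica_to_dir(assignment: List[str], replica_id: str,
                        new_dir_id: int) -> List[str]:
    """Return a new assignment with replica_id moved to new_dir_id
    on the same broker, e.g. "1-0" -> "1-2"."""
    if replica_id not in assignment:
        raise ValueError(f"{replica_id} is not in the assignment")
    # Everything before the first "-" is the broker id; it stays unchanged.
    broker_id = replica_id.split("-", 1)[0]
    target = f"{broker_id}-{new_dir_id}"
    return [target if r == replica_id else r for r in assignment]
```

Since the broker id part does not change, the other replicas in the list are
untouched and only the dir component of the moved replica differs.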


Guozhang


On Thu, Jan 12, 2017 at 6:25 PM, Ismael Juma <ism...@juma.me.uk> wrote:

> Thanks for the KIP. Just wanted to quickly say that it's great to see
> proposals for improving JBOD (KIP-113 too). More feedback soon, hopefully.
>
> Ismael
>
> On Thu, Jan 12, 2017 at 6:46 PM, Dong Lin <lindon...@gmail.com> wrote:
>
> > Hi all,
> >
> > We created KIP-112: Handle disk failure for JBOD. Please find the KIP wiki
> > in the link https://cwiki.apache.org/confluence/display/KAFKA/KIP-112%3A+Handle+disk+failure+for+JBOD.
> >
> > This KIP is related to KIP-113
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-113%3A+Support+replicas+movement+between+log+directories>:
> > Support replicas movement between log directories. They are needed in order
> > to support JBOD in Kafka. Please help review the KIP. Your feedback is
> > appreciated!
> >
> > Thanks,
> > Dong
> >
>



-- 
-- Guozhang
