Hi Jun,

The inter-broker movement case has two subcases:

1. Where no log dir is supplied. This corresponds to the existing
kafka-reassign-partitions script. This just needs the appropriate JSON to
be written to the reassignment znode.
2. Where the log dir is supplied. This is covered in KIP-113 (in addition
to the intra-broker case), and that KIP defines an algorithm where initial
AlterReplicaDirRequests are sent to each receiving broker, then the
znode gets updated, then there are further AlterReplicaDirRequests.
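
To make the ordering in the second subcase concrete, here is a rough
sketch. It is not taken from KIP-113: the function name and both callables
are hypothetical stand-ins, and modelling the "further" requests as a
single re-send to the same brokers is my simplification.

```python
from typing import Callable, Dict, Tuple

Replica = Tuple[str, int]  # (topic, partition)


def reassign_with_log_dirs(
    dirs_by_broker: Dict[int, Dict[Replica, str]],
    send_alter_replica_dir: Callable[[int, Dict[Replica, str]], None],
    write_reassignment: Callable[[], None],
) -> None:
    """Hypothetical helper showing only the order of the three steps."""
    # Step 1: tell each receiving broker which log dir each replica should use.
    for broker_id, dirs in sorted(dirs_by_broker.items()):
        send_alter_replica_dir(broker_id, dirs)

    # Step 2: update the reassignment znode (the part KIP-179 wants to move
    # out of the script and behind a request to a broker).
    write_reassignment()

    # Step 3: further AlterReplicaDirRequests (assumed here to be one re-send
    # per receiving broker; the real retry policy is not shown).
    for broker_id, dirs in sorted(dirs_by_broker.items()):
        send_alter_replica_dir(broker_id, dirs)
```

The point is only that the znode update sits in the middle of the
sequence, so the with-log-dirs case cannot avoid it.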

In the first case, the JSON lacks any log dir information. In the second
case the JSON includes log dir information. I'm suggesting that a single
PartitionReassignmentRequest class could be used to represent (and be
convertible to) both kinds of JSON. (In fact, one JSON schema is a
subset of the other.)
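
As a rough illustration of what I mean, here is a minimal sketch. The
class names are made up for this email, and the "log_dirs" field name is
my reading of KIP-113 rather than something I have checked against the
final implementation.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PartitionMove:
    """One entry in the reassignment JSON (hypothetical class name)."""
    topic: str
    partition: int
    replicas: List[int]                   # target broker ids
    log_dirs: Optional[List[str]] = None  # one per replica, or None if not supplied

    def to_dict(self) -> dict:
        entry = {
            "topic": self.topic,
            "partition": self.partition,
            "replicas": self.replicas,
        }
        if self.log_dirs is not None:
            if len(self.log_dirs) != len(self.replicas):
                raise ValueError("need exactly one log dir per replica")
            # Assumed field name; only present in the with-log-dirs case.
            entry["log_dirs"] = self.log_dirs
        return entry


@dataclass
class PartitionReassignment:
    """Single representation that can produce either kind of JSON."""
    moves: List[PartitionMove]

    def to_json(self) -> str:
        return json.dumps(
            {"version": 1, "partitions": [m.to_dict() for m in self.moves]}
        )
```

Leaving log_dirs unset yields the JSON the existing script writes today;
setting it yields the same JSON plus one extra field per partition, which
is the subset relationship I was referring to.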

So PartitionReassignmentRequest would indeed only be necessary for
inter-broker movement, but it would be necessary in both the with-log-dir
and without-log-dir cases of that.

While I could have a PartitionReassignmentRequest that only dealt with
inter-broker-without-log-dirs data movement, that wouldn't be enough to
address the needs of KIP-179, because the inter-broker-with-log-dirs case
still needs to update the znode, and KIP-179 is all about the
script/command not talking to ZooKeeper any more.

Does that make sense to you?

Cheers,

Tom


On 11 August 2017 at 16:22, Jun Rao <j...@confluent.io> wrote:

> Hi, Tom,
>
> One approach is to have a PartitionReassignmentRequest that only deals with
> inter broker data movement (i.e., w/o any log dirs in the request). The
> request is directed to any broker, which then just writes the reassignment
> json to ZK. There is a separate AlterReplicaDirRequest that only deals with
> intra broker data movement (i.e., with the log dirs in the request). This
> request is directed to the specific broker whose replicas need to be moved between
> log dirs. This seems to be what's in your original proposal in KIP-179,
> which I think makes sense.
>
> In your earlier email, I thought you were proposing to have
> PartitionReassignmentRequest
> dealing with both inter and intra broker data movement (i.e., include log
> dirs in the request). Then, I am not sure how this request will be
> processed on the broker. So, you were not proposing that?
>
> Thanks,
>
> Jun
>
> On Fri, Aug 11, 2017 at 5:37 AM, Tom Bentley <t.j.bent...@gmail.com>
> wrote:
>
> > Hi Jun and Dong,
> >
> > Thanks for your replies...
> >
> > On 10 August 2017 at 20:43, Dong Lin <lindon...@gmail.com> wrote:
> >
> > > This is a very good idea. I have updated the KIP-113 so that
> > > DescribeDirResponse returns lag instead of LEO.
> >
> >
> > Excellent!
> >
> > On Thu, Aug 10, 2017 at 10:21 AM, Jun Rao <j...@confluent.io> wrote:
> > >
> > > > 2. Tom, note that currently, the LeaderAndIsrRequest doesn't specify the
> > > > log dir. So, I am not sure in your new proposal, how the log dir info is
> > > > communicated to all brokers. Is the broker receiving the
> > > > ReassignPartitionsRequest
> > > > going to forward that to all brokers?
> > >
> >
> > My understanding of KIP-113 is that each broker has its own set of log dirs
> > (even though in practice they might all have the same names, and might all
> > be distributed across the brokers' disks in the same way, and all those
> > disks might be identical), so it doesn't make sense for one broker to be
> > told about the log dirs of another broker.
> >
> > Furthermore, it is the AlterReplicaDirRequest that is sent to the receiving
> > broker which associates the partition with the log dir on that broker. To
> > quote from KIP-113 (specifically, the notes in this section
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-113%3A+Support+replicas+movement+between+log+directories#KIP-113:Supportreplicasmovementbetweenlogdirectories-1%29Howtomovereplicabetweenlogdirectoriesonthesamebroker>):
> >
> > > - If broker doesn't not have already replica created for the specified
> > > topicParition when it receives AlterReplicaDirRequest, it will reply
> > > ReplicaNotAvailableException AND remember (replica, destination log
> > > directory) pair in memory to create the replica in the specified log
> > > directory when it receives LeaderAndIsrRequest later.
> > >
> >
> > I've not proposed anything to change that, really. All I've done is change
> > who creates the znode which causes the LeaderAndIsrRequest. Because KIP-113
> > has been accepted, I've tried to avoid attempting to change it too much.
> >
> > Cheers,
> >
> > Tom
> >
>
