Hi, Tom,

The inter-broker-with-log-dirs case can be split into
inter-broker-w/o-log-dirs and log-dirs change per broker. KIP-113 proposes
to do the split in the tool. I am not sure if we really need to persist
log-dirs changes in ZK. During the discussion in KIP-113, we realized that
there is only a very short window when this information could be lost. In
the rare cases when this info is lost, one can always issue
AlterReplicaDirRequests again. With this, once we add
PartitionReassignmentRequest (w/o log dirs),
the reassignment tool won't need ZK, right?
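
To make the split concrete, here is a minimal sketch of what the tool could
do. It assumes the reassignment JSON carries an optional "log_dirs" list
parallel to "replicas", with "any" meaning no preference (my reading of the
KIP-113 format). The function name and the shape of the return value are
hypothetical, purely for illustration.

```python
from collections import defaultdict


def split_reassignment(reassignment):
    """Split a with-log-dirs reassignment into two parts.

    Returns (without_dirs, dirs_by_broker):
      without_dirs   -- the same reassignment with log dirs stripped, i.e. what
                        a PartitionReassignmentRequest (w/o log dirs) would carry
      dirs_by_broker -- {broker_id: {(topic, partition): log_dir}}, i.e. what
                        would go into one AlterReplicaDirRequest per broker
    The "log_dirs" field name and the "any" wildcard are assumptions.
    """
    without_dirs = {"version": reassignment.get("version", 1), "partitions": []}
    dirs_by_broker = defaultdict(dict)
    for p in reassignment["partitions"]:
        without_dirs["partitions"].append(
            {
                "topic": p["topic"],
                "partition": p["partition"],
                "replicas": list(p["replicas"]),
            }
        )
        # log_dirs, when present, is positional: entry i belongs to replicas[i].
        for broker, log_dir in zip(p["replicas"], p.get("log_dirs", [])):
            if log_dir != "any":
                dirs_by_broker[broker][(p["topic"], p["partition"])] = log_dir
    return without_dirs, dict(dirs_by_broker)
```

The first part never mentions log dirs, so whoever handles it (the tool
today, a broker later) does not need to understand them. The second part is
keyed by broker, so each entry only goes to the broker that owns those dirs.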

If you have a single request for inter-broker-with-log-dirs, the receiving
broker has to do the split. Perhaps you can write down how the receiving
broker processes the merged request. Then, we can see how much additional
complexity is needed. Ideally, it would be useful to avoid adding
additional logic in the controller for it to understand log dirs.

Thanks,

Jun

On Fri, Aug 11, 2017 at 8:49 AM, Tom Bentley <t.j.bent...@gmail.com> wrote:

> Hi Jun,
>
> The inter-broker movement case has two subcases:
>
> 1. Where no log dir is supplied. This corresponds to the existing
> kafka-reassign-partitions script. This just needs the appropriate JSON to
> be written to the reassignment znode.
> 2. Where the log dir is supplied. This is covered in KIP-113 (in addition
> to the intra-broker case) and that KIP defines an algorithm where an
> initial AlterReplicaDirRequest is sent to each receiving broker, then the
> znode gets updated, then there are further AlterReplicaDirRequests.
>
> In the first case, the JSON lacks any log dir information. In the second
> case the JSON includes log dir information. I'm suggesting that a single
> PartitionReassignmentRequest class could be used to represent (and be
> convertible to) both kinds of JSON. (In fact the one JSON schema is a
> subset of the other).
>
> So PartitionReassignmentRequest would indeed only be necessary for
> inter-broker movement, but it would be necessary in both the with- and
> without-log-dir cases of that.
>
> While I could have a PartitionReassignmentRequest that only dealt with
> inter-broker-without-log-dirs data movement, that wouldn't be enough to
> address the needs of KIP-179, because the inter-broker-with-log-dirs case
> still needs to update the znode, and KIP-179 is all about the
> script/command not talking to Zookeeper any more.
>
> Does that make sense to you?
>
> Cheers,
>
> Tom
>
>
> On 11 August 2017 at 16:22, Jun Rao <j...@confluent.io> wrote:
>
> > Hi, Tom,
> >
> > One approach is to have a PartitionReassignmentRequest that only deals
> > with inter-broker data movement (i.e., w/o any log dirs in the request).
> > The request is directed to any broker, which then just writes the
> > reassignment json to ZK. There is a separate AlterReplicaDirRequest that
> > only deals with intra-broker data movement (i.e., with the log dirs in
> > the request). This request is directed to the specific broker whose
> > replicas need to be moved between log dirs. This seems to be what's in
> > your original proposal in KIP-179, which I think makes sense.
> >
> > In your earlier email, I thought you were proposing to have
> > PartitionReassignmentRequest deal with both inter- and intra-broker data
> > movement (i.e., include log dirs in the request). Then, I am not sure how
> > this request will be processed on the broker. So, you were not proposing
> > that?
> >
> > Thanks,
> >
> > Jun
> >
> > On Fri, Aug 11, 2017 at 5:37 AM, Tom Bentley <t.j.bent...@gmail.com>
> > wrote:
> >
> > > Hi Jun and Dong,
> > >
> > > Thanks for your replies...
> > >
> > > On 10 August 2017 at 20:43, Dong Lin <lindon...@gmail.com> wrote:
> > >
> > > > This is a very good idea. I have updated the KIP-113 so that
> > > > DescribeDirResponse returns lag instead of LEO.
> > >
> > >
> > > Excellent!
> > >
> > > On Thu, Aug 10, 2017 at 10:21 AM, Jun Rao <j...@confluent.io> wrote:
> > > >
> > > > > > 2. Tom, note that currently, the LeaderAndIsrRequest doesn't
> > > > > > specify the log dir. So, I am not sure in your new proposal, how
> > > > > > the log dir info is communicated to all brokers. Is the broker
> > > > > > receiving the ReassignPartitionsRequest going to forward that to
> > > > > > all brokers?
> > > >
> > >
> > > My understanding of KIP-113 is that each broker has its own set of log
> > > dirs (even though in practice they might all have the same names, and
> > > might all be distributed across the brokers' disks in the same way, and
> > > all those disks might be identical), so it doesn't make sense for one
> > > broker to be told about the log dirs of another broker.
> > >
> > > Furthermore, it is the AlterReplicaDirRequest that is sent to the
> > > receiving broker which associates the partition with the log dir on that
> > > broker. To quote from KIP-113 (specifically, the notes in this section
> > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-113%3A+Support+replicas+movement+between+log+directories#KIP-113:Supportreplicasmovementbetweenlogdirectories-1%29Howtomovereplicabetweenlogdirectoriesonthesamebroker>
> > > ):
> > >
> > > - If broker doesn't not have already replica created for the specified
> > > > topicParition when it receives AlterReplicaDirRequest, it will reply
> > > > ReplicaNotAvailableException AND remember (replica, destination log
> > > > directory) pair in memory to create the replica in the specified log
> > > > directory when it receives LeaderAndIsrRequest later.
> > > >
> > >
> > > I've not proposed anything to change that, really. All I've done is
> > > change who creates the znode which causes the LeaderAndIsrRequest.
> > > Because KIP-113 has been accepted, I've tried to avoid attempting to
> > > change it too much.
> > >
> > > Cheers,
> > >
> > > Tom
> > >
> >
>
