Guys,

Thank you for your time. A short summary of our discussion.
Answering previous items:

1. 2. I will double check existing error codes to align the list of
errors that needs to be added.

3. We agreed to think again about the batch requests semantics.
The main concern is that users would expect we allow executing
multiple instructions for one topic in one batch.
I will start implementation and check whether there are any impediments
to handle it this way.

The same for AlterTopicRequest - I will try to make request semantics
as easy as possible and allow users change different things at one
time - e.g. change nr of partitions and replicas in one instruction.

4. We agreed not to add to TMR lag information.

5. We discussed preferred replica command and it was pointed out
that generally users shouldn't call this command manually now since
this is automatically handled by the cluster.
If there are no objections (especially from devops people) I will remove
respective request.

6. As discussed AdminClient API is a phase 2 and will go after Wire
Protocol extensions. It will be finalized as java-doc after I complete
patch for phase 1 - Wire Protocol + server-side code handling requests.

Thanks,
Andrii Biletskyi

On Wed, Apr 22, 2015 at 12:36 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Hey Andrii, thanks for all the hard work on this, it has come a long way.
>
> A couple questions and comments on this.
>
> For the errors, can we do the following:
> 1. Remove IllegalArgument from the name, we haven't used that convention
> for other errors.
> 2. Normalize this list with the existing errors. For example, elsewhere
> when you give an invalid topic name we give back an InvalidTopicException
> but this is proposing a new error for that. It would be good that these
> kinds of errors are handled the same way across all requests in the
> protocol.
>
> Other comments:
> 3. I don't understand MultipleInstructionsForOneTopic
> and MultipleTopicInstructionsInOneBatch and the description is quite vague.
> There is some implicit assumption in this proposal about how batching will
> be done that doesn't seem to be explained.
> 4. I think adding replica lag to the metadata request is out of place and
> should not be in the metadata request. Two reasons: a. This is something
> that can only be answered by the leader for that partition. So querying N
> partitions fundamentally mean querying N brokers (roughly). This is
> different from the other properties which are "shared knowledge".
> b. This is a monitoring property not a configuration/metadata property. I
> recommend we remove this here and in the future add an API that gets all
> the monitoring stats from the server including lag. Adding all these to the
> metadata request won't make sense, right?
> 5. This includes a special request for preferred replica leader election. I
> feel that we should not expose an API for this because the user should not
> be in the business of managing leaders. We have gotten this feature to the
> point where preferred leadership election is enabled automatically. I think
> we should go further in that direction and do whatever work is required to
> make this the only option rather than trying to institute public apis for
> manually controlling it.
> 6. The API changes we discussed for the java api still aren't reflected in
> the proposal.
>
> -Jay
>
> On Tue, Apr 21, 2015 at 7:47 AM, Andrii Biletskyi <
> andrii.bilets...@stealth.ly> wrote:
>
> > Hi all,
> >
> > I've updated KIP-4 page to include all previously discussed items such
> as:
> > new error codes, merged alter-topic and reassign-partitions requests,
> added
> > TMR_V1.
> >
> > It'd be great if we concentrate on the Errors+Wire Protocol schema and
> > discuss
> > any remaining issues today, since first patch will include only
> server-side
> > implementation.
> >
> > Thanks,
> > Andrii Biletskyi
> >
> >
> > On Tue, Apr 21, 2015 at 9:46 AM, Andrii Biletskyi <
> > andrii.bilets...@stealth.ly> wrote:
> >
> > > 1. Yes, this will be much easier. Okay, let's add it.
> > >
> > > 2, Okay. This will differ a little bit from the way currently
> > > kafka-topics.sh handles alter-topic command, but I think it's
> > > a reasonable restriction.
> > >
> > > I'll update KIP acordingly to our weekly call.
> > >
> > > Thanks,
> > > Andrii Biletskyi
> > >
> > > On Mon, Apr 20, 2015 at 10:56 PM, Jun Rao <j...@confluent.io> wrote:
> > >
> > >> 1. Yes, lag is probably only going to be useful for the admin client.
> > >> However, so is isr. It seems to me that we should get lag and isr from
> > the
> > >> same request. I was thinking that we can just extend TMR by changing
> > >> replicas from an array of int to an array of (int, lag) pairs. Is that
> > too
> > >> complicated?
> > >>
> > >> 3. I was thinking that we just don't allow the cli to change more than
> > one
> > >> thing at a time. So, you will get an error if you want to change both
> > >> partitions and configs.
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >> On Sun, Apr 19, 2015 at 8:22 AM, Andrii Biletskyi <
> > >> andrii.bilets...@stealth.ly> wrote:
> > >>
> > >> > Jun,
> > >> >
> > >> > 1. Yes, seems we can add lag info to the TMR. But before that I
> wonder
> > >> > whether there are other reasons we need this info except for
> reassign
> > >> > partition command? As we discussed earlier the problem with poor
> > >> > monitoring capabilities for reassign-partitions (as currently we
> only
> > >> > inform
> > >> > users Completed/In Progress per partition) may require separate
> > >> solution.
> > >> > We were thinking about separate Wire protocol request. And I
> actually
> > >> like
> > >> > your
> > >> > idea about adding some sort of BrokerMetadataRequest for these
> > purposes.
> > >> > I actually think we can cover some other items (like rack-awareness)
> > but
> > >> > for
> > >> > me it deserves a separate KIP really.
> > >> > Also, adding Replica->Lag map per partition will make
> > >> TopicMetadataResponse
> > >> > very sophisticated:
> > >> > Map[TopicName, Map[PartitionId, Map[ReplicaId, Lag]].
> > >> > Maybe we need to leave it for a moment and propose new request
> rather
> > >> than
> > >> > making a new step towards one "monster" request.
> > >> >
> > >> > 2. Yes, error per topic.
> > >> > The only question is whether we should execute at least the very
> first
> > >> > alter topic
> > >> > command from the "duplicated" topic set or return error for all ...
> I
> > >> think
> > >> > the more
> > >> > predictable and reasonable option for clients would be returning
> > errors
> > >> for
> > >> > all
> > >> > duplicated topics.
> > >> >
> > >> > 3. Hm, yes. Actually we also have "change topic config" there. But
> it
> > is
> > >> > not
> > >> > related to such "replication" commands as increase replicas or
> change
> > >> > replica
> > >> > assignment.
> > >> > This will make CLI implementation a bit strange: if user specifies
> > >> increase
> > >> > partitions and change topic config in one line - taking into account
> > 2.
> > >> we
> > >> > will have
> > >> > to create two separate alter topic requests, which were designed as
> > >> batch
> > >> > requests :),
> > >> > but probably we can live with it.
> > >> > Okay, I will think about a separate error code to cover such cases.
> > >> >
> > >> > 4. We will need InvalidArgumentTopic (e.g. contains prohibited
> chars),
> > >> > IAPartitions, IAReplicas, IAReplicaAssignment, IATopicConfiguration.
> > >> > A server side implementation will be a little bit messy (like dozens
> > "if
> > >> > this then this
> > >> > error code") but maybe we should think about clients at the first
> > place
> > >> > here.
> > >> >
> > >> > Thanks,
> > >> > Andrii Biletskyi
> > >> >
> > >> > On Fri, Apr 17, 2015 at 1:46 AM, Jun Rao <j...@confluent.io> wrote:
> > >> >
> > >> > > 1. For the lags, we can add a new field "lags" per partition. It
> > will
> > >> > > return for each replica that's not in isr, the replica id and the
> > lag
> > >> in
> > >> > > messages. Also, if TMR is sent to a non-leader, the response can
> > just
> > >> > > include an empty array for isr and lags.
> > >> > >
> > >> > > 2. So, we will just return a topic level error for the duplicated
> > >> topics,
> > >> > > right?
> > >> > >
> > >> > > 3. Yes, it's true that today, one can specify both partitions and
> > >> > > replicaAssignment in the TopicCommand. However, partitions is
> > actually
> > >> > > ignored. So, it will be clearer if we don't allow users to do
> this.
> > >> > >
> > >> > > 4. How many specific error codes like InvalidPartitions and
> > >> > InvalidReplicas
> > >> > > are needed? If it's not that many, giving out more specific error
> > >> will be
> > >> > > useful for non-java clients.
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > Jun
> > >> > >
> > >> > >
> > >> > > On Wed, Apr 15, 2015 at 10:23 AM, Andrii Biletskyi <
> > >> > > andrii.bilets...@stealth.ly> wrote:
> > >> > >
> > >> > > > Guys,
> > >> > > >
> > >> > > > Thanks for the discussion!
> > >> > > >
> > >> > > > Summary:
> > >> > > >
> > >> > > > 1. Q: How KAFKA-1367 (isr is inconsistent in brokers' metadata
> > >> cache)
> > >> > can
> > >> > > >         affect implementation?
> > >> > > >     A: We can fix this issue for the leading broker -
> > ReplicaManager
> > >> > (or
> > >> > > > Partition)
> > >> > > >         component should have accurate isr list, then with
> leading
> > >> > broker
> > >> > > > having correct
> > >> > > >         info, to do a describe-topic we will need to define
> > leading
> > >> > > brokers
> > >> > > > for partitions
> > >> > > >         and ask those for a correct isr list.
> > >> > > >         Also, we should consider adding lag information to TMR
> for
> > >> each
> > >> > > > follower for
> > >> > > >         partition reassignment, as Jun suggested above.
> > >> > > >
> > >> > > > 2. Q: What if user adds different alter commands for the same
> > topic
> > >> in
> > >> > > > scope
> > >> > > >          of one batch request?
> > >> > > >     A: Because of the async nature of AlterTopicRequest it will
> be
> > >> very
> > >> > > > hard then
> > >> > > >         to "assemble" the expected (in terms of checking whether
> > >> > request
> > >> > > is
> > >> > > > complete)
> > >> > > >         result if we let users do this. Also it will be very
> > >> confusing.
> > >> > > It
> > >> > > > was proposed not to
> > >> > > >         let users do this (probably add new Error for such
> cases).
> > >> > > >
> > >> > > > 3. Q: AlterTopicRequest semantics: now when we merged AlterTopic
> > and
> > >> > > >         ReassingPartitons in which order AlterTopic fields
> should
> > be
> > >> > > > resolved?
> > >> > > >     A: This item is not clear. There was a proposal to let user
> > >> change
> > >> > > only
> > >> > > >         one thing at a time, e.g. specify either new Replicas,
> or
> > >> > > > ReplicaAssignment.
> > >> > > >         This can be a simple solution, but it's a very strict
> > rule.
> > >> > E.g.
> > >> > > > currently with
> > >> > > >         TopicCommand user can increase nr of partitions and
> define
> > >> > > replica
> > >> > > > assignment
> > >> > > >         for newly added partitions. Taking into account item 2.
> > this
> > >> > will
> > >> > > > be even harder
> > >> > > >         for user to achieve this.
> > >> > > >
> > >> > > > 4. Q: Do we need such accurate errors returned from the server:
> > >> > > > InvalidArgumentPartitions,
> > >> > > >          InvalidArgumentReplicas etc.
> > >> > > >     A: I started implementation to add proposed error codes and
> > now
> > >> I
> > >> > > think
> > >> > > > probably
> > >> > > >         InvalidArgumentError should be sufficient. We can do
> > simple
> > >> > > > validations on
> > >> > > >         the client side (e.g. AdminClient can ensure nr of
> > >> partitions
> > >> > > > argument is positive),
> > >> > > >         others - which can be covered only on server (probably
> > >> invalid
> > >> > > > topic config,
> > >> > > >         replica assignment includes dead broker etc) - will be
> > done
> > >> on
> > >> > > > server, and in case
> > >> > > >         of invalid argument we will return InvalidArgumentError
> > >> without
> > >> > > > specifying the
> > >> > > >         concrete field.
> > >> > > >
> > >> > > > It'd be great if we could cover these remaining issues, looks
> like
> > >> they
> > >> > > are
> > >> > > > minor,
> > >> > > > at least related to specific messages, not the overall protocol.
> > - I
> > >> > > think
> > >> > > > with that I can
> > >> > > > update confluence page and update patch to reflect all discussed
> > >> items.
> > >> > > > This patch
> > >> > > > will probably include Wire protocol messages and server-side
> code
> > to
> > >> > > handle
> > >> > > > new
> > >> > > > requests. AdminClient and cli-tool implementation can be the
> next
> > >> step.
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Andrii Biletskyi
> > >> > > >
> > >> > > > On Wed, Apr 15, 2015 at 7:26 PM, Jun Rao <j...@confluent.io>
> > wrote:
> > >> > > >
> > >> > > > > Andrii,
> > >> > > > >
> > >> > > > > 500. I think what you suggested also sounds reasonable. Since
> > ISR
> > >> is
> > >> > > only
> > >> > > > > maintained accurately at the leader, TMR can return ISR if the
> > >> broker
> > >> > > is
> > >> > > > > the leader of a partition. Otherwise, we can return an empty
> > ISR.
> > >> For
> > >> > > > > partition reassignment, it would be useful to know the lag of
> > each
> > >> > > > > follower. Again, the leader knows this info. We can probably
> > >> include
> > >> > > that
> > >> > > > > info in TMR as well.
> > >> > > > >
> > >> > > > > 300. I think it's probably reasonable to restrict
> > >> AlterTopicRequest
> > >> > to
> > >> > > > > change only one thing at a time, i.e., either partitions,
> > >> replicas,
> > >> > > > replica
> > >> > > > > assignment or config.
> > >> > > > >
> > >> > > > > Thanks,
> > >> > > > >
> > >> > > > > Jun
> > >> > > > >
> > >> > > > > On Mon, Apr 13, 2015 at 10:56 AM, Andrii Biletskyi <
> > >> > > > > andrii.bilets...@stealth.ly> wrote:
> > >> > > > >
> > >> > > > > > Jun,
> > >> > > > > >
> > >> > > > > > 404. Great, thanks!
> > >> > > > > >
> > >> > > > > > 500. If I understand correctly KAFKA-1367 says ISR part of
> TMR
> > >> may
> > >> > > > > > be inconsistent. If so, then I believe all admin commands
> but
> > >> > > > > describeTopic
> > >> > > > > > are not affected. Let me emphasize that it's about
> AdminClient
> > >> > > > > operations,
> > >> > > > > > not about Wire Protocol requests. What I mean:
> > >> > > > > > To verify AdminClient.createTopic we will need (consistent)
> > >> > 'topics'
> > >> > > > set
> > >> > > > > > from TMR (we don't need isr)
> > >> > > > > > To verify alterTopic - again, probably 'topics' and
> 'assigned
> > >> > > > replicas' +
> > >> > > > > > configs
> > >> > > > > > To verify deleteTopic - only 'topics'
> > >> > > > > > To verify preferredReplica - 'leader', 'assigned replicas'
> > >> > > > > > To verify reassignPartitions - 'assigned replicas' ? (I'm
> not
> > >> sure
> > >> > > > about
> > >> > > > > > this one)
> > >> > > > > > If everything above is correct, then
> AdminClient.describeTopic
> > >> is
> > >> > the
> > >> > > > > only
> > >> > > > > > command under risk. We can actually workaround it - find out
> > the
> > >> > > leader
> > >> > > > > > broker
> > >> > > > > > and ask TMR that leading broker to get up-to-date isr list.
> > >> > > > > > Bottom line: looks like 1367 is a separate issue, and is
> not a
> > >> > > blocker
> > >> > > > > for
> > >> > > > > > this
> > >> > > > > > KIP. I'm a bit concerned about adding new requests as a
> > >> must-have
> > >> > > part
> > >> > > > > > of this KIP when we don't know what we want to include to
> > those
> > >> > > > requests.
> > >> > > > > >
> > >> > > > > > Also, I'd like to write down the new AlterTopicRequest
> > semantics
> > >> > (if
> > >> > > we
> > >> > > > > > decide
> > >> > > > > > to include replicas there and merge it with
> > >> > > ReassignPartitionsRequest)
> > >> > > > > > 300. AlterTopicRequest => [TopicName Partitions Replicas
> > >> > > > > ReplicaAssignment
> > >> > > > > > [AddedConfigEntry] [DeletedConfig]]
> > >> > > > > > The fields are resolved in this sequence:
> > >> > > > > > 1. Either partition or replicas is defined:
> > >> > > > > > ---1.1. ReplicaAssignment is not defined - generate
> automatic
> > >> > replica
> > >> > > > > > assignment
> > >> > > > > >           for newly added partitions or for replicas
> parameter
> > >> > > > increased
> > >> > > > > > ---1.2. ReplicaAssignment is defined - increase topic
> > >> partitions if
> > >> > > > > > 'partitions' defined,
> > >> > > > > >           reassign partitions according to ReplicaAssignment
> > >> > > > > > 2. Neither partition nor replicas is defined:
> > >> > > > > > ---2.1. ReplicaAssignment is defined - it's a reassign
> > replicas
> > >> > > request
> > >> > > > > > ---2.2. ReplicaAssignment is not defined - just change topic
> > >> > configs
> > >> > > > > > 3. Config fields are handled always and independently from
> > >> > > > > >     partitions+replicas/replicaAssingment
> > >> > > > > > A bit sophisticated, but should cover all cases. Another
> > option
> > >> -
> > >> > we
> > >> > > > can
> > >> > > > > > say you can define either partitions+replicas or
> > >> replicaAssignment.
> > >> > > > > >
> > >> > > > > > 300.5. There is also a new question related to
> > >> AlterTopicRequest -
> > >> > > > should
> > >> > > > > > we
> > >> > > > > > allow users multiple alter-topic instructions for one topic
> in
> > >> one
> > >> > > > batch?
> > >> > > > > > I think if we go this way, user will expect we optimize and
> > >> group
> > >> > > > > requests
> > >> > > > > > for one topic, but it will add a lot of burden, especially
> > taken
> > >> > into
> > >> > > > > > account
> > >> > > > > > async semantics of the AlterTopicRequest. I'd rather return
> > some
> > >> > > error
> > >> > > > > > code,
> > >> > > > > > or ignore all but first. Thoughts?
> > >> > > > > >
> > >> > > > > > Thanks,
> > >> > > > > > Andrii Biletskyi
> > >> > > > > >
> > >> > > > > > On Mon, Apr 13, 2015 at 6:34 AM, Jun Rao <j...@confluent.io>
> > >> wrote:
> > >> > > > > >
> > >> > > > > > > Andrii,
> > >> > > > > > >
> > >> > > > > > > 404. Jay and I chatted a bit. We agreed to leave
> > >> > createTopicRequest
> > >> > > > as
> > >> > > > > > > async for now.
> > >> > > > > > >
> > >> > > > > > > There is another thing.
> > >> > > > > > >
> > >> > > > > > > 500. Currently, we have this issue (KAFKA-1367) that the
> ISR
> > >> in
> > >> > the
> > >> > > > > > > metadata cache can be out of sync. The reason is that ISR
> is
> > >> > really
> > >> > > > > > > maintained at the leader. We can potentially add a new
> > >> > > > > BrokerMetaRequest,
> > >> > > > > > > which will return useful stats specific to a broker. Such
> > >> stats
> > >> > can
> > >> > > > > > include
> > >> > > > > > > (1) for each partition whose leader is on this broker, the
> > ISR
> > >> > and
> > >> > > > the
> > >> > > > > > lag
> > >> > > > > > > (in messages) for each of the followers, (2) space used
> per
> > >> > > > partition,
> > >> > > > > > (3)
> > >> > > > > > > remaining space per log dir (not sure how easy it is to
> get
> > >> this
> > >> > > > info).
> > >> > > > > > If
> > >> > > > > > > we have this new request, we can probably remove the ISR
> > part
> > >> > from
> > >> > > > TMR
> > >> > > > > > v1.
> > >> > > > > > > Currently, the producer/consumer client don't really care
> > >> about
> > >> > > ISR.
> > >> > > > > The
> > >> > > > > > > admin client will then issue BrokerMetaRequest to find out
> > ISR
> > >> > and
> > >> > > > > other
> > >> > > > > > > stats.
> > >> > > > > > >
> > >> > > > > > > Thanks,
> > >> > > > > > >
> > >> > > > > > > Jun
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > On Tue, Apr 7, 2015 at 12:10 PM, Andrii Biletskyi <
> > >> > > > > > > andrii.bilets...@stealth.ly> wrote:
> > >> > > > > > >
> > >> > > > > > > > Hi all,
> > >> > > > > > > >
> > >> > > > > > > > A summary of our discussion:
> > >> > > > > > > >
> > >> > > > > > > > 201. Q: Cluster updates in backward compatible way.
> > >> > > > > > > >         A: Add topicConfigs map property and change
> > >> > constructor,
> > >> > > > this
> > >> > > > > > > > shouldn't break Consumer/Producer since TMR is used in
> > >> > > > NetworkClient,
> > >> > > > > > > > not directly by Consumer/Producer.
> > >> > > > > > > >
> > >> > > > > > > > 300. Q: Can we merge AlterTopic and ReassignPartitions
> > >> > requests?
> > >> > > > > > > >         A: It looks like in terms of Wire Protocol
> > partition
> > >> > > > > > reassignment
> > >> > > > > > > > can
> > >> > > > > > > > be just an application of AlterTopicRequest. On the
> > >> AdminClient
> > >> > > > side
> > >> > > > > we
> > >> > > > > > > can
> > >> > > > > > > > split this into two separate methods, if needed.
> > >> > > > > > > >
> > >> > > > > > > > Some additional items that were added today:
> > >> > > > > > > > 400. Q: Do we need ListTopicsRequest, we can use TMR for
> > >> this
> > >> > > > > purpose.
> > >> > > > > > > >         A: The answer depends on whether we can leverage
> > >> > > ListTopics
> > >> > > > > in
> > >> > > > > > > > consumer/producer, because the only benefit of the
> > >> ListTopics
> > >> > is
> > >> > > > > > > > performance
> > >> > > > > > > > optimization, otherwise it doesn't worth it.
> > >> > > > > > > >
> > >> > > > > > > > 401. Q: AdminClient.topicExists - do we need it?
> > >> > > > > > > >         A: AdminClient.listTopics should be sufficient.
> > >> > > > > > > >
> > >> > > > > > > > 402. Review AdminClient API and use separate objects
> > >> instead of
> > >> > > > > > > collections
> > >> > > > > > > > for methods arguments / return results (e.g.
> > >> preferredReplica
> > >> > > > accepts
> > >> > > > > > > > Map<String, List<Int>>
> > >> > > > > > > > might be better to add separate java object)
> > >> > > > > > > >
> > >> > > > > > > > 403. Error number in KIP-4 (100x). Currently there are
> no
> > >> > > dedicated
> > >> > > > > > > ranges
> > >> > > > > > > > for errors, we will probably continue doing it this way.
> > >> > > > > > > >
> > >> > > > > > > > 404. There were some concerns again about the
> asynchronous
> > >> > > > semantics
> > >> > > > > > > > of the admin requests. Jun and Jay to agree separately
> how
> > >> we
> > >> > > want
> > >> > > > > > > > to handle it.
> > >> > > > > > > >
> > >> > > > > > > > Please add / correct me if I missed something.
> > >> > > > > > > >
> > >> > > > > > > > Thanks,
> > >> > > > > > > > Andrii Biletskyi
> > >> > > > > > > >
> > >> > > > > > > > On Tue, Apr 7, 2015 at 4:11 PM, Andrii Biletskyi <
> > >> > > > > > > > andrii.bilets...@stealth.ly> wrote:
> > >> > > > > > > >
> > >> > > > > > > > > Hi all,
> > >> > > > > > > > >
> > >> > > > > > > > > I wasn't able to send email to our thread (it says we
> > >> > exceeded
> > >> > > > > > message
> > >> > > > > > > > > size limit :)).
> > >> > > > > > > > > So I'm starting the new one.
> > >> > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > > > Jun,
> > >> > > > > > > > >
> > >> > > > > > > > > Thanks again for the review. Answering your comments:
> > >> > > > > > > > >
> > >> > > > > > > > > 201. I'm not sure I understand how can we evolve
> Cluster
> > >> in
> > >> > > > > backward
> > >> > > > > > > > > compatible way. In my understanding topic configs are
> > not
> > >> > > > returned
> > >> > > > > > > > > currently -
> > >> > > > > > > > > in TMR_V0. Thus we need to add new property in
> Cluster -
> > >> smth
> > >> > > > like
> > >> > > > > > > > > private final Map<String, List<ConfigEntry>>
> > topicConfigs;
> > >> > > > > > > > > Which affects Cluster constructor, which is used in
> > >> > > > > > > MetadataResponse.java
> > >> > > > > > > > > - not sure whether we can change Cluster this way so
> > it's
> > >> > > > backward
> > >> > > > > > > > > compatible,
> > >> > > > > > > > > I suppose - no.
> > >> > > > > > > > > Let me know if I'm missing something...
> > >> > > > > > > > >
> > >> > > > > > > > > 300. Hm, so you propose to give up ReassignPartition
> as
> > a
> > >> > > > separate
> > >> > > > > > > > command?
> > >> > > > > > > > > That's interesting, let's discuss it today in detail.
> > >> > > > > > > > > Two small points here: 1) afaik currently
> > >> replica-assignment
> > >> > > > > argument
> > >> > > > > > > in
> > >> > > > > > > > > alter-topic
> > >> > > > > > > > > (from TopicCommand) doesn't reassign partitions, it
> lets
> > >> > users
> > >> > > > > > specify
> > >> > > > > > > > > replica
> > >> > > > > > > > > assignment for newly added partition
> > >> (AddPartitionsListener)
> > >> > 2)
> > >> > > > > > > > > ReassignPartitions
> > >> > > > > > > > > command involves a little bit more than just changing
> > >> replica
> > >> > > > > > > assignment
> > >> > > > > > > > > in zk.
> > >> > > > > > > > > People are struggling with partition reassignment so I
> > >> think
> > >> > > it's
> > >> > > > > > good
> > >> > > > > > > to
> > >> > > > > > > > > have explicit
> > >> > > > > > > > > request for it so we can handle it independently, also
> > as
> > >> > > > mentioned
> > >> > > > > > > > > earlier we'll
> > >> > > > > > > > > probably add in future some better status check
> > procedure
> > >> for
> > >> > > > this
> > >> > > > > > > > > long-running
> > >> > > > > > > > > request.
> > >> > > > > > > > >
> > >> > > > > > > > > 301. Good point. We also agreed to use clientId as an
> > >> > > identifier
> > >> > > > > for
> > >> > > > > > > the
> > >> > > > > > > > > requester -
> > >> > > > > > > > > whether it's a producer client or admin. I think we
> can
> > go
> > >> > with
> > >> > > > > -1/-1
> > >> > > > > > > > > approach.
> > >> > > > > > > > >
> > >> > > > > > > > > 302. Again, as said above replica-assignment in
> > >> alter-topic
> > >> > > > doesn't
> > >> > > > > > > > change
> > >> > > > > > > > > replica assignment of the existing partitions. But we
> > can
> > >> > think
> > >> > > > > about
> > >> > > > > > > it
> > >> > > > > > > > > in general -
> > >> > > > > > > > > how can we change topic replication factor? The easy
> way
> > >> - we
> > >> > > > don't
> > >> > > > > > > need
> > >> > > > > > > > > it,
> > >> > > > > > > > > we can use reassign partitions. Not sure whether we
> want
> > >> to
> > >> > add
> > >> > > > > > special
> > >> > > > > > > > > logic
> > >> > > > > > > > > to treat this case...
> > >> > > > > > > > >
> > >> > > > > > > > > 303.1. Okay, sure, I'll generalize topicExists().
> > >> > > > > > > > > 303.2. I think, yes, we need separate verify methods
> as
> > a
> > >> > > status
> > >> > > > > > check
> > >> > > > > > > > > procedure,
> > >> > > > > > > > > because respective requests are long running, and CLI
> > user
> > >> > > > > > potentially
> > >> > > > > > > > > will
> > >> > > > > > > > > asynchronously call reassign-partitions, do other
> stuff
> > >> (e.g.
> > >> > > > > create
> > >> > > > > > > > > topics) periodically
> > >> > > > > > > > > checking status of the partition reassignment. Anyway
> > >> we'll
> > >> > > have
> > >> > > > to
> > >> > > > > > > > > implement this logic
> > >> > > > > > > > > because it's a criterion of the completed Future of
> the
> > >> > > reassign
> > >> > > > > > > > > partitions async
> > >> > > > > > > > > call, we'll to have make those methods just public.
> > >> > > > > > > > > 303.3. If preferredReplica returns Future<Map<String,
> > >> > Errors>>
> > >> > > > than
> > >> > > > > > > what
> > >> > > > > > > > > is an error
> > >> > > > > > > > > in terms of preferred replica leader election? As I
> > >> > understand
> > >> > > we
> > >> > > > > can
> > >> > > > > > > > only
> > >> > > > > > > > > check
> > >> > > > > > > > > whether it has succeeded (leader == AR.head)  or not
> > >> _yet_.
> > >> > > > > > > > >
> > >> > > > > > > > > 304.1. Sure, let's add timeout to reassign/preferred
> > >> replica.
> > >> > > > > > > > > 304.2. This can be finalized after we discuss 300.
> > >> > > > > > > > >
> > >> > > > > > > > > 305. Misprints - thanks, fixed.
> > >> > > > > > > > >
> > >> > > > > > > > > Thanks,
> > >> > > > > > > > > Andrii Biletskyi
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>

Reply via email to