Regarding the JSON version of Broker:

I don't see why ZkUtils.getBrokerInfo() restricts the JSON versions it can
read. It throws an exception if the version is not 1 or 2. It seems to me
that this will cause compatibility problems whenever the version needs to
change, and will make the upgrade path difficult.

One option is to make rack part of version 2 as well and keep the version
number at 2 for this update. This keeps old clients compatible. During a
rolling upgrade, it also avoids problems if the controller/broker is still
on the old version.

However, ZkUtils.getBrokerInfo() will be updated to return the Broker with
rack so the rack information will be available once the server/client is
upgraded to the latest version.
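
To illustrate the idea, here is a minimal sketch of a reader that does not
reject unknown versions and treats rack as optional. It is written in Python
for brevity and is not the actual ZkUtils code; the function name
parse_broker_info, the BrokerInfo class, and the "host", "port" and "rack"
field names are assumptions made for this example only.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class BrokerInfo:
    broker_id: int
    host: str
    port: int
    rack: Optional[str] = None  # None when the registration has no rack field


def parse_broker_info(broker_id: int, raw: str) -> BrokerInfo:
    """Parse a broker registration string without rejecting newer versions.

    Hypothetical helper: it only requires that "version" is a positive
    integer, ignores fields it does not know, and reads "rack" if present.
    """
    data = json.loads(raw)
    version = data.get("version")
    if not isinstance(version, int) or version < 1:
        raise ValueError(f"Unsupported broker JSON version: {version!r}")
    return BrokerInfo(
        broker_id=broker_id,
        host=data["host"],
        port=data["port"],
        rack=data.get("rack"),
    )
```

With a reader like this, a registration written without rack and one
written with rack can both be read, and a future version bump would not by
itself break older readers.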



On Wed, Jan 6, 2016 at 6:28 PM, Allen Wang <allenxw...@gmail.com> wrote:

> Updated KIP according to Jun's comment and included changes to TMR.
>
> On Tue, Jan 5, 2016 at 5:59 PM, Jun Rao <j...@confluent.io> wrote:
>
>> Hi, Allen,
>>
>> A couple of minor comments on the KIP.
>>
>> 1. The version of the broker JSON string says 2. It should be 3.
>>
>> 2. The new version of UpdateMetadataRequest should be 2, instead of 1.
>> Could you include the full wire protocol of version 2 of
>> UpdateMetadataRequest and highlight the changed part?
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Jan 5, 2016 at 3:11 PM, Allen Wang <allenxw...@gmail.com> wrote:
>>
>> > Jun and I had a chance to discuss it in a meeting and it is agreed to
>> > change the TMR in a different patch.
>> >
>> > I can change the KIP to include rack in TMR. The essential change is to
>> add
>> > rack into class BrokerEndPoint and make TMR version aware.
>> >
>> >
>> >
>> > On Tue, Jan 5, 2016 at 10:21 AM, Aditya Auradkar <
>> > aaurad...@linkedin.com.invalid> wrote:
>> >
>> > > Jun/Allen -
>> > >
>> > > Did we ever actually agree on whether we should evolve the TMR to
>> include
>> > > rack info or not?
>> > > I don't feel strongly about it, but if it's the right thing to do we
>> > > should probably do it in this KIP (can be a separate patch).. it
>> isn't a
>> > > large change.
>> > >
>> > > Aditya
>> > >
>> > > On Sat, Dec 26, 2015 at 3:01 PM, Allen Wang <allenxw...@gmail.com>
>> > wrote:
>> > >
>> > > > Added the rolling upgrade instruction in the KIP, similar to those
>> in
>> > > 0.9.0
>> > > > release notes.
>> > > >
>> > > > On Wed, Dec 16, 2015 at 11:32 AM, Allen Wang <allenxw...@gmail.com>
>> > > wrote:
>> > > >
>> > > > > Hi Jun,
>> > > > >
>> > > > > The reason that TopicMetadataResponse is not included in the KIP
>> is
>> > > that
>> > > > > it currently is not version aware. So we need to introduce
>> version
>> > to
>> > > it
>> > > > > in order to ensure backward compatibility. It seems to me a big
>> > > > change.
>> > > > > Do we want to couple it with this KIP? Do we need to further
>> discuss
>> > > what
>> > > > > information to include in the new version besides rack? For
>> example,
>> > > > should
>> > > > > we include broker security protocol in TopicMetadataResponse?
>> > > > >
>> > > > > The other option is to make it a separate KIP to make
>> > > > > TopicMetadataResponse version aware and decide what to include,
>> and
>> > > make
>> > > > > this KIP focus on the rack aware algorithm, admin tools  and
>> related
>> > > > > changes to inter-broker protocol .
>> > > > >
>> > > > > Thanks,
>> > > > > Allen
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Mon, Dec 14, 2015 at 8:30 AM, Jun Rao <j...@confluent.io>
>> wrote:
>> > > > >
>> > > > >> Allen,
>> > > > >>
>> > > > >> Thanks for the proposal. A few comments.
>> > > > >>
>> > > > >> 1. Since this KIP changes the inter broker communication protocol
>> > > > >> (UpdateMetadataRequest), we will need to document the upgrade
>> path
>> > > > >> (similar
>> > > > >> to what's described in
>> > > > >> http://kafka.apache.org/090/documentation.html#upgrade).
>> > > > >>
>> > > > >> 2. It might be useful to include the rack info of the broker in
>> > > > >> TopicMetadataResponse. This can be useful for administrative
>> tasks,
>> > as
>> > > > >> well
>> > > > >> as read affinity in the future.
>> > > > >>
>> > > > >> Jun
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >> On Thu, Dec 10, 2015 at 9:38 AM, Allen Wang <
>> allenxw...@gmail.com>
>> > > > wrote:
>> > > > >>
>> > > > >> > If there are no more comments I would like to call for a vote.
>> > > > >> >
>> > > > >> >
>> > > > >> > On Sun, Nov 15, 2015 at 10:08 PM, Allen Wang <
>> > allenxw...@gmail.com>
>> > > > >> wrote:
>> > > > >> >
>> > > > >> > > KIP is updated with more details and how to handle the
>> situation
>> > > > where
>> > > > >> > > rack information is incomplete.
>> > > > >> > >
>> > > > >> > > In the situation where rack information is incomplete, but we
>> > want
>> > > > to
>> > > > >> > > continue with the assignment, I have suggested to ignore all
>> > rack
>> > > > >> > > information and fallback to original algorithm. The reason is
>> > > > >> explained
>> > > > >> > > below:
>> > > > >> > >
>> > > > >> > > The other options are to assume that the broker without the
>> rack
>> > > > >> belong
>> > > > >> > to
>> > > > >> > > its own unique rack, or they belong to one "default" rack.
>> > Either
>> > > > way
>> > > > >> we
>> > > > >> > > choose, it is highly likely to result in uneven number of
>> > brokers
>> > > in
>> > > > >> > racks,
>> > > > >> > > and it is quite possible that the "made up" racks will have
>> much
>> > > > fewer
>> > > > >> > > number of brokers. As I explained in the KIP, uneven number
>> of
>> > > > >> brokers in
>> > > > >> > > racks will lead to uneven distribution of replicas among
>> brokers
>> > > > (even
>> > > > >> > > though the leader distribution is still even). The brokers in
>> > the
>> > > > rack
>> > > > >> > that
>> > > > >> > > has fewer number of brokers will get more replicas per broker
>> > than
>> > > > >> > brokers
>> > > > >> > > in other racks.
>> > > > >> > >
>> > > > >> > > Given this fact and the replica assignment produced will be
>> > > > incorrect
>> > > > >> > > anyway from rack aware point of view, ignoring all rack
>> > > information
>> > > > >> and
>> > > > >> > > fallback to the original algorithm is not a bad choice since
>> it
>> > > will
>> > > > >> at
>> > > > >> > > least have a better guarantee of replica distribution.
>> > > > >> > >
>> > > > >> > > Also for command line tools it gives user a choice if for any
>> > > reason
>> > > > >> they
>> > > > >> > > want to ignore rack information and fallback to the original
>> > > > >> algorithm.
>> > > > >> > >
>> > > > >> > >
>> > > > >> > > On Tue, Nov 10, 2015 at 9:04 AM, Allen Wang <
>> > allenxw...@gmail.com
>> > > >
>> > > > >> > wrote:
>> > > > >> > >
>> > > > >> > >> I have been busy with some time-pressing issues for the last few
>> > days. I
>> > > > >> will
>> > > > >> > >> think about how the incomplete rack information will affect
>> the
>> > > > >> balance
>> > > > >> > and
>> > > > >> > >> update the KIP by early next week.
>> > > > >> > >>
>> > > > >> > >> Thanks,
>> > > > >> > >> Allen
>> > > > >> > >>
>> > > > >> > >>
>> > > > >> > >> On Tue, Nov 3, 2015 at 9:03 AM, Neha Narkhede <
>> > n...@confluent.io
>> > > >
>> > > > >> > wrote:
>> > > > >> > >>
>> > > > >> > >>> Few suggestions on improving the KIP
>> > > > >> > >>>
>> > > > >> > >>> *If some brokers have rack, and some do not, the algorithm
>> > will
>> > > > >> throw
>> > > > >> > an
>> > > > >> > >>> > exception. This is to prevent incorrect assignment
>> caused by
>> > > > user
>> > > > >> > >>> error.*
>> > > > >> > >>>
>> > > > >> > >>>
>> > > > >> > >>> In the KIP, can you clearly state the user-facing behavior
>> > when
>> > > > some
>> > > > >> > >>> brokers have rack information and some don't. Which actions
>> > and
>> > > > >> > requests
>> > > > >> > >>> will error out and how?
>> > > > >> > >>>
>> > > > >> > >>> *Even distribution of partition leadership among brokers*
>> > > > >> > >>>
>> > > > >> > >>>
>> > > > >> > >>> There is some information about arranging the sorted broker
>> > list
>> > > > >> > >>> interlaced
>> > > > >> > >>> with rack ids. Can you describe the changes to the current
>> > > > algorithm
>> > > > >> > in a
>> > > > >> > >>> little more detail? How does this interlacing work if only
>> a
>> > > > subset
>> > > > >> of
>> > > > >> > >>> brokers have the rack id configured? Does this still work
>> if
>> > > > uneven
>> > > > >> #
>> > > > >> > of
>> > > > >> > >>> brokers are assigned to each rack? It might work, I'm
>> looking
>> > > for
>> > > > >> more
>> > > > >> > >>> details on the changes, since it will affect the behavior
>> seen
>> > > by
>> > > > >> the
>> > > > >> > >>> user
>> > > > >> > >>> - imbalance on either the leaders or data or both.
>> > > > >> > >>>
>> > > > >> > >>> On Mon, Nov 2, 2015 at 6:39 PM, Aditya Auradkar <
>> > > > >> > aaurad...@linkedin.com>
>> > > > >> > >>> wrote:
>> > > > >> > >>>
>> > > > >> > >>> > I think this sounds reasonable. Anyone else have
>> comments?
>> > > > >> > >>> >
>> > > > >> > >>> > Aditya
>> > > > >> > >>> >
>> > > > >> > >>> > On Tue, Oct 27, 2015 at 5:23 PM, Allen Wang <
>> > > > allenxw...@gmail.com
>> > > > >> >
>> > > > >> > >>> wrote:
>> > > > >> > >>> >
>> > > > >> > >>> > > During the discussion in the hangout, it was mentioned
>> > that
>> > > it
>> > > > >> > would
>> > > > >> > >>> be
>> > > > >> > >>> > > desirable that consumers know the rack information of
>> the
>> > > > >> brokers
>> > > > >> > so
>> > > > >> > >>> that
>> > > > >> > >>> > > they can consume from the broker in the same rack to
>> > reduce
>> > > > >> > latency.
>> > > > >> > >>> As I
>> > > > >> > >>> > > understand this will only be beneficial if consumer can
>> > > > consume
>> > > > >> > from
>> > > > >> > >>> any
>> > > > >> > >>> > > broker in ISR, which is not possible now.
>> > > > >> > >>> > >
>> > > > >> > >>> > > I suggest we skip the change to TMR. Once the change is
>> > made
>> > > > to
>> > > > >> > >>> consumer
>> > > > >> > >>> > to
>> > > > >> > >>> > > be able to consume from any broker in ISR, the rack
>> > > > information
>> > > > >> can
>> > > > >> > >>> be
>> > > > >> > >>> > > added to TMR.
>> > > > >> > >>> > >
>> > > > >> > >>> > > Another thing I want to confirm is  command line
>> > behavior. I
>> > > > >> think
>> > > > >> > >>> the
>> > > > >> > >>> > > desirable default behavior is to fail fast on command
>> line
>> > > for
>> > > > >> > >>> incomplete
>> > > > >> > >>> > > rack mapping. The error message can include further
>> > > > instruction
>> > > > >> > that
>> > > > >> > >>> > tells
>> > > > >> > >>> > > the user to add an extra argument (like
>> > > > >> "--allow-partial-rackinfo")
>> > > > >> > >>> to
>> > > > >> > >>> > > suppress the error and do an imperfect rack aware
>> > > assignment.
>> > > > If
>> > > > >> > the
>> > > > >> > >>> > > default behavior is to allow incomplete mapping, the
>> error
>> > > can
>> > > > >> > still
>> > > > >> > >>> be
>> > > > >> > >>> > > easily missed.
>> > > > >> > >>> > >
>> > > > >> > >>> > > The affected command line tools are TopicCommand and
>> > > > >> > >>> > > ReassignPartitionsCommand.
>> > > > >> > >>> > >
>> > > > >> > >>> > > Thanks,
>> > > > >> > >>> > > Allen
>> > > > >> > >>> > >
>> > > > >> > >>> > >
>> > > > >> > >>> > >
>> > > > >> > >>> > >
>> > > > >> > >>> > >
>> > > > >> > >>> > > On Mon, Oct 26, 2015 at 12:55 PM, Aditya Auradkar <
>> > > > >> > >>> > aaurad...@linkedin.com>
>> > > > >> > >>> > > wrote:
>> > > > >> > >>> > >
>> > > > >> > >>> > > > Hi Allen,
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > For TopicMetadataResponse to understand version, you
>> can
>> > > > bump
>> > > > >> up
>> > > > >> > >>> the
>> > > > >> > >>> > > > request version itself. Based on the version of the
>> > > request,
>> > > > >> the
>> > > > >> > >>> > response
>> > > > >> > >>> > > > can be appropriately serialized. It shouldn't be a
>> huge
>> > > > >> change.
>> > > > >> > For
>> > > > >> > >>> > > > example: We went through something similar for
>> > > > ProduceRequest
>> > > > >> > >>> recently
>> > > > >> > >>> > (
>> > > > >> > >>> > > > https://reviews.apache.org/r/33378/)
>> > > > >> > >>> > > > I guess the reason protocol information is not
>> included
>> > in
>> > > > the
>> > > > >> > TMR
>> > > > >> > >>> is
>> > > > >> > >>> > > > because the topic itself is independent of any
>> > particular
>> > > > >> > protocol
>> > > > >> > >>> (SSL
>> > > > >> > >>> > > vs
>> > > > >> > >>> > > > Plaintext). Having said that, I'm not sure we even
>> need
>> > > rack
>> > > > >> > >>> > information
>> > > > >> > >>> > > in
>> > > > >> > >>> > > > TMR. What usecase were you thinking of initially?
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > For 1 - I'd be fine with adding an option to the
>> command
>> > > > line
>> > > > >> > tools
>> > > > >> > >>> > that
>> > > > >> > >>> > > > check rack assignment. For e.g.
>> "--strict-assignment" or
>> > > > >> > something
>> > > > >> > >>> > > similar.
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > Aditya
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > On Thu, Oct 22, 2015 at 6:44 PM, Allen Wang <
>> > > > >> > allenxw...@gmail.com>
>> > > > >> > >>> > > wrote:
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > > For 2 and 3, I have updated the KIP. Please take a
>> > look.
>> > > > One
>> > > > >> > >>> thing I
>> > > > >> > >>> > > have
>> > > > >> > >>> > > > > changed is removing the proposal to add rack to
>> > > > >> > >>> > TopicMetadataResponse.
>> > > > >> > >>> > > > The
>> > > > >> > >>> > > > > reason is that unlike UpdateMetadataRequest,
>> > > > >> > >>> TopicMetadataResponse
>> > > > >> > >>> > does
>> > > > >> > >>> > > > not
>> > > > >> > >>> > > > > understand version. I don't see a way to include
>> rack
>> > > > >> without
>> > > > >> > >>> > breaking
>> > > > >> > >>> > > > old
>> > > > >> > >>> > > > > version of clients. That's probably why secure
>> > protocol
>> > > is
>> > > > >> not
>> > > > >> > >>> > included
>> > > > >> > >>> > > > in
>> > > > >> > >>> > > > > the TopicMetadataResponse either. I think it will
>> be a
>> > > > much
>> > > > >> > >>> bigger
>> > > > >> > >>> > > change
>> > > > >> > >>> > > > > to include rack in TopicMetadataResponse.
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > For 1, my concern is that doing rack aware
>> assignment
>> > > > >> without
>> > > > >> > >>> > complete
>> > > > >> > >>> > > > > broker to rack mapping will result in assignment
>> that
>> > is
>> > > > not
>> > > > >> > rack
>> > > > >> > >>> > aware
>> > > > >> > >>> > > > and
>> > > > >> > >>> > > > > fail to provide fault tolerance in the event of
>> rack
>> > > > outage.
>> > > > >> > This
>> > > > >> > >>> > kind
>> > > > >> > >>> > > of
>> > > > >> > >>> > > > > problem will be difficult to surface. And the cost
>> of
>> > > this
>> > > > >> > >>> problem is
>> > > > >> > >>> > > > high:
>> > > > >> > >>> > > > > you have to do partition reassignment if you are
>> lucky
>> > > to
>> > > > >> spot
>> > > > >> > >>> the
>> > > > >> > >>> > > > problem
>> > > > >> > >>> > > > > early on or face the consequence of data loss
>> during
>> > > real
>> > > > >> rack
>> > > > >> > >>> > outage.
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > I do see the concern of fail-fast as it might also
>> > cause
>> > > > >> data
>> > > > >> > >>> loss if
>> > > > >> > >>> > > > > producer is not able to produce the message due to
>> topic
>> > > > >> creation
>> > > > >> > >>> > failure.
>> > > > >> > >>> > > > Is
>> > > > >> > >>> > > > > it feasible to treat dynamic topic creation and
>> > command
>> > > > >> tools
>> > > > >> > >>> > > > differently?
>> > > > >> > >>> > > > > We allow dynamic topic creation with incomplete
>> > > > broker-rack
>> > > > >> > >>> mapping
>> > > > >> > >>> > and
>> > > > >> > >>> > > > > fail fast in command line. Another option is to let
>> > user
>> > > > >> > >>> determine
>> > > > >> > >>> > the
>> > > > >> > >>> > > > > behavior for command line. For example, by default
>> > fail
>> > > > >> fast in
>> > > > >> > >>> > command
>> > > > >> > >>> > > > > line but allow incomplete broker-rack mapping if
>> > another
>> > > > >> switch
>> > > > >> > >>> is
>> > > > >> > >>> > > > > provided.
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > On Tue, Oct 20, 2015 at 10:05 AM, Aditya Auradkar <
>> > > > >> > >>> > > > > aaurad...@linkedin.com.invalid> wrote:
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > > Hey Allen,
>> > > > >> > >>> > > > > >
>> > > > >> > >>> > > > > > 1. If we choose fail fast topic creation, we will
>> > have
>> > > > >> topic
>> > > > >> > >>> > creation
>> > > > >> > >>> > > > > > failures while upgrading the cluster. I really
>> doubt
>> > > we
>> > > > >> want
>> > > > >> > >>> this
>> > > > >> > >>> > > > > behavior.
>> > > > >> > >>> > > > > > Ideally, this should be invisible to clients of a
>> > > > cluster.
>> > > > >> > >>> > Currently,
>> > > > >> > >>> > > > > each
>> > > > >> > >>> > > > > > broker is effectively its own rack. So we
>> probably
>> > can
>> > > > use
>> > > > >> > the
>> > > > >> > >>> rack
>> > > > >> > >>> > > > > > information whenever possible but not make it a
>> hard
>> > > > >> > >>> requirement.
>> > > > >> > >>> > To
>> > > > >> > >>> > > > > extend
>> > > > >> > >>> > > > > > Gwen's example, one badly configured broker
>> should
>> > not
>> > > > >> > degrade
>> > > > >> > >>> > topic
>> > > > >> > >>> > > > > > creation for the entire cluster.
>> > > > >> > >>> > > > > >
>> > > > >> > >>> > > > > > 2. Upgrade scenario - Can you add a section on
>> the
>> > > > upgrade
>> > > > >> > >>> piece to
>> > > > >> > >>> > > > > confirm
>> > > > >> > >>> > > > > > that old clients will not see errors? I believe
>> > > > >> > >>> > > > > ZookeeperConsumerConnector
>> > > > >> > >>> > > > > > reads the Broker objects from ZK. I wanted to
>> > confirm
>> > > > that
>> > > > >> > this
>> > > > >> > >>> > will
>> > > > >> > >>> > > > not
>> > > > >> > >>> > > > > > cause any problems.
>> > > > >> > >>> > > > > >
>> > > > >> > >>> > > > > > 3. Could you elaborate your proposed changes to
>> the
>> > > > >> > >>> > > > UpdateMetadataRequest
>> > > > >> > >>> > > > > > in the "Public Interfaces" section? Personally, I
>> > find
>> > > > >> this
>> > > > >> > >>> format
>> > > > >> > >>> > > easy
>> > > > >> > >>> > > > > to
>> > > > >> > >>> > > > > > read in terms of wire protocol changes:
>> > > > >> > >>> > > > > >
>> > > > >> > >>> > > > > >
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > >
>> > > > >> > >>> > >
>> > > > >> > >>> >
>> > > > >> > >>>
>> > > > >> >
>> > > > >>
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-CreateTopicRequest
>> > > > >> > >>> > > > > >
>> > > > >> > >>> > > > > > Aditya
>> > > > >> > >>> > > > > >
>> > > > >> > >>> > > > > > On Fri, Oct 16, 2015 at 3:45 PM, Allen Wang <
>> > > > >> > >>> allenxw...@gmail.com>
>> > > > >> > >>> > > > > wrote:
>> > > > >> > >>> > > > > >
>> > > > >> > >>> > > > > > > KIP is updated to include rack as an optional
>> > property
>> > > > for
>> > > > >> > >>> broker.
>> > > > >> > >>> > > > Please
>> > > > >> > >>> > > > > > take
>> > > > >> > >>> > > > > > > a look and let me know if more details are
>> needed.
>> > > > >> > >>> > > > > > >
>> > > > >> > >>> > > > > > > For the case where some brokers have rack and
>> some
>> > > do
>> > > > >> not,
>> > > > >> > >>> the
>> > > > >> > >>> > > > current
>> > > > >> > >>> > > > > > KIP
>> > > > >> > >>> > > > > > > uses the fail-fast behavior. If there are
>> > concerns,
>> > > we
>> > > > >> can
>> > > > >> > >>> > further
>> > > > >> > >>> > > > > > discuss
>> > > > >> > >>> > > > > > > this in the email thread or next hangout.
>> > > > >> > >>> > > > > > >
>> > > > >> > >>> > > > > > >
>> > > > >> > >>> > > > > > >
>> > > > >> > >>> > > > > > > On Thu, Oct 15, 2015 at 10:42 AM, Allen Wang <
>> > > > >> > >>> > allenxw...@gmail.com
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > > > wrote:
>> > > > >> > >>> > > > > > >
>> > > > >> > >>> > > > > > > > That's a good question. I can think of three
>> > > actions
>> > > > >> if
>> > > > >> > the
>> > > > >> > >>> > rack
>> > > > >> > >>> > > > > > > > information is incomplete:
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > > > 1. Treat the node without rack as if it is on
>> > its
>> > > > >> unique
>> > > > >> > >>> rack
>> > > > >> > >>> > > > > > > > 2. Disregard all rack information and
>> fallback
>> > to
>> > > > >> current
>> > > > >> > >>> > > algorithm
>> > > > >> > >>> > > > > > > > 3. Fail-fast
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > > > Now I think about it, one and three make more
>> > > sense.
>> > > > >> The
>> > > > >> > >>> reason
>> > > > >> > >>> > > for
>> > > > >> > >>> > > > > > > > fail-fast is that user mistake for not
>> providing
>> > > the
>> > > > >> rack
>> > > > >> > >>> may
>> > > > >> > >>> > > never
>> > > > >> > >>> > > > > be
>> > > > >> > >>> > > > > > > > found if we tolerate that and the assignment
>> may
>> > > not
>> > > > >> be
>> > > > >> > >>> rack
>> > > > >> > >>> > > aware
>> > > > >> > >>> > > > as
>> > > > >> > >>> > > > > > the
>> > > > >> > >>> > > > > > > > user has expected and this creates debug
>> > problems
>> > > > when
>> > > > >> > >>> things
>> > > > >> > >>> > > fail.
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > > > What do you think? If not fail-fast, is there
>> > > any way
>> > > > >> we
>> > > > >> > can
>> > > > >> > >>> > make
>> > > > >> > >>> > > > the
>> > > > >> > >>> > > > > > user
>> > > > >> > >>> > > > > > > > error stand out?
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > > > On Thu, Oct 15, 2015 at 10:17 AM, Gwen
>> Shapira <
>> > > > >> > >>> > > g...@confluent.io>
>> > > > >> > >>> > > > > > > wrote:
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > > >> Thanks! Just to clarify, when some brokers
>> have
>> > > > rack
>> > > > >> > >>> > assignment
>> > > > >> > >>> > > > and
>> > > > >> > >>> > > > > > some
>> > > > >> > >>> > > > > > > >> don't, do we act like none of them have it?
>> or
>> > > like
>> > > > >> > those
>> > > > >> > >>> > > without
>> > > > >> > >>> > > > > > > >> assignment are in their own rack?
>> > > > >> > >>> > > > > > > >>
>> > > > >> > >>> > > > > > > >> The first scenario is good when first
>> setting
>> > up
>> > > > >> > >>> > rack-awareness,
>> > > > >> > >>> > > > but
>> > > > >> > >>> > > > > > the
>> > > > >> > >>> > > > > > > >> second makes more sense for on-going
>> > maintenance
>> > > (I
>> > > > >> can
>> > > > >> > >>> > totally
>> > > > >> > >>> > > > see
>> > > > >> > >>> > > > > > > >> someone
>> > > > >> > >>> > > > > > > >> adding a node and forgetting to set the rack
>> > > > >> property,
>> > > > >> > we
>> > > > >> > >>> > don't
>> > > > >> > >>> > > > want
>> > > > >> > >>> > > > > > > this
>> > > > >> > >>> > > > > > > >> to change behavior for anything except the
>> new
>> > > > node).
>> > > > >> > >>> > > > > > > >>
>> > > > >> > >>> > > > > > > >> What do you think?
>> > > > >> > >>> > > > > > > >>
>> > > > >> > >>> > > > > > > >> Gwen
>> > > > >> > >>> > > > > > > >>
>> > > > >> > >>> > > > > > > >> On Thu, Oct 15, 2015 at 10:13 AM, Allen
>> Wang <
>> > > > >> > >>> > > > allenxw...@gmail.com>
>> > > > >> > >>> > > > > > > >> wrote:
>> > > > >> > >>> > > > > > > >>
>> > > > >> > >>> > > > > > > >> > For scenario 1:
>> > > > >> > >>> > > > > > > >> >
>> > > > >> > >>> > > > > > > >> > - Add the rack information to broker
>> property
>> > > > file
>> > > > >> or
>> > > > >> > >>> > > > dynamically
>> > > > >> > >>> > > > > > set
>> > > > >> > >>> > > > > > > >> it in
>> > > > >> > >>> > > > > > > >> > the wrapper code to bootstrap Kafka
>> server.
>> > You
>> > > > >> would
>> > > > >> > do
>> > > > >> > >>> > that
>> > > > >> > >>> > > > for
>> > > > >> > >>> > > > > > all
>> > > > >> > >>> > > > > > > >> > brokers and restart the brokers one by
>> one.
>> > > > >> > >>> > > > > > > >> >
>> > > > >> > >>> > > > > > > >> > In this scenario, the complete broker to
>> rack
>> > > > >> mapping
>> > > > >> > >>> may
>> > > > >> > >>> > not
>> > > > >> > >>> > > be
>> > > > >> > >>> > > > > > > >> available
>> > > > >> > >>> > > > > > > >> > until every broker is restarted. During
>> that
>> > > time
>> > > > >> we
>> > > > >> > >>> fall
>> > > > >> > >>> > back
>> > > > >> > >>> > > > to
>> > > > >> > >>> > > > > > > >> default
>> > > > >> > >>> > > > > > > >> > replica assignment algorithm.
>> > > > >> > >>> > > > > > > >> >
>> > > > >> > >>> > > > > > > >> > For scenario 2:
>> > > > >> > >>> > > > > > > >> >
>> > > > >> > >>> > > > > > > >> > - Add the rack information to broker
>> property
>> > > > file
>> > > > >> or
>> > > > >> > >>> > > > dynamically
>> > > > >> > >>> > > > > > set
>> > > > >> > >>> > > > > > > >> it in
>> > > > >> > >>> > > > > > > >> > the wrapper code and start the broker.
>> > > > >> > >>> > > > > > > >> >
>> > > > >> > >>> > > > > > > >> >
>> > > > >> > >>> > > > > > > >> > On Wed, Oct 14, 2015 at 2:36 PM, Gwen
>> > Shapira <
>> > > > >> > >>> > > > g...@confluent.io>
>> > > > >> > >>> > > > > > > >> wrote:
>> > > > >> > >>> > > > > > > >> >
>> > > > >> > >>> > > > > > > >> > > Can you clarify the workflow for the
>> > > following
>> > > > >> > >>> scenarios:
>> > > > >> > >>> > > > > > > >> > >
>> > > > >> > >>> > > > > > > >> > > 1. I currently have 6 brokers and want
>> to
>> > add
>> > > > >> rack
>> > > > >> > >>> > > information
>> > > > >> > >>> > > > > for
>> > > > >> > >>> > > > > > > >> each
>> > > > >> > >>> > > > > > > >> > > 2. I'm adding a new broker and I want to
>> > > > specify
>> > > > >> > which
>> > > > >> > >>> > rack
>> > > > >> > >>> > > it
>> > > > >> > >>> > > > > > > >> belongs on
>> > > > >> > >>> > > > > > > >> > > while adding it.
>> > > > >> > >>> > > > > > > >> > >
>> > > > >> > >>> > > > > > > >> > > Thanks!
>> > > > >> > >>> > > > > > > >> > >
>> > > > >> > >>> > > > > > > >> > > On Tue, Oct 13, 2015 at 2:21 PM, Allen
>> > Wang <
>> > > > >> > >>> > > > > allenxw...@gmail.com
>> > > > >> > >>> > > > > > >
>> > > > >> > >>> > > > > > > >> > wrote:
>> > > > >> > >>> > > > > > > >> > >
>> > > > >> > >>> > > > > > > >> > > > We discussed the KIP in the hangout
>> > today.
>> > > > The
>> > > > >> > >>> > > > recommendation
>> > > > >> > >>> > > > > is
>> > > > >> > >>> > > > > > > to
>> > > > >> > >>> > > > > > > >> > make
>> > > > >> > >>> > > > > > > >> > > > rack as a broker property in
>> ZooKeeper.
>> > For
>> > > > >> users
>> > > > >> > >>> with
>> > > > >> > >>> > > > > existing
>> > > > >> > >>> > > > > > > rack
>> > > > >> > >>> > > > > > > >> > > > information stored somewhere, they
>> would
>> > > need
>> > > > >> to
>> > > > >> > >>> > retrieve
>> > > > >> > >>> > > > the
>> > > > >> > >>> > > > > > > >> > information
>> > > > >> > >>> > > > > > > >> > > > at broker start up and dynamically set
>> > the
>> > > > rack
>> > > > >> > >>> > property,
>> > > > >> > >>> > > > > which
>> > > > >> > >>> > > > > > > can
>> > > > >> > >>> > > > > > > >> be
>> > > > >> > >>> > > > > > > >> > > > implemented as a wrapper to bootstrap
>> > > broker.
>> > > > >> > There
>> > > > >> > >>> will
>> > > > >> > >>> > > be
>> > > > >> > >>> > > > no
>> > > > >> > >>> > > > > > > >> > interface
>> > > > >> > >>> > > > > > > >> > > or
>> > > > >> > >>> > > > > > > >> > > > pluggable implementation to retrieve
>> the
>> > > rack
>> > > > >> > >>> > information.
>> > > > >> > >>> > > > > > > >> > > >
>> > > > >> > >>> > > > > > > >> > > > The assumption is that you always
>> need to
>> > > > >> restart
>> > > > >> > >>> the
>> > > > >> > >>> > > broker
>> > > > >> > >>> > > > > to
>> > > > >> > >>> > > > > > > >> make a
>> > > > >> > >>> > > > > > > >> > > > change to the rack.
>> > > > >> > >>> > > > > > > >> > > >
>> > > > >> > >>> > > > > > > >> > > > Once the rack becomes a broker
>> property,
>> > it
>> > > > >> will
>> > > > >> > be
>> > > > >> > >>> > > possible
>> > > > >> > >>> > > > > to
>> > > > >> > >>> > > > > > > make
>> > > > >> > >>> > > > > > > >> > rack
>> > > > >> > >>> > > > > > > >> > > > part of the meta data to help the
>> > consumer
>> > > > >> choose
>> > > > >> > >>> which
>> > > > >> > >>> > in
>> > > > >> > >>> > > > > sync
>> > > > >> > >>> > > > > > > >> replica
>> > > > >> > >>> > > > > > > >> > > to
>> > > > >> > >>> > > > > > > >> > > > consume from as part of the future
>> > consumer
>> > > > >> > >>> enhancement.
>> > > > >> > >>> > > > > > > >> > > >
>> > > > >> > >>> > > > > > > >> > > > I will update the KIP.
>> > > > >> > >>> > > > > > > >> > > >
>> > > > >> > >>> > > > > > > >> > > > Thanks,
>> > > > >> > >>> > > > > > > >> > > > Allen
>> > > > >> > >>> > > > > > > >> > > >
>> > > > >> > >>> > > > > > > >> > > >
>> > > > >> > >>> > > > > > > >> > > > On Thu, Oct 8, 2015 at 9:23 AM, Allen
>> > Wang
>> > > <
>> > > > >> > >>> > > > > > allenxw...@gmail.com>
>> > > > >> > >>> > > > > > > >> > wrote:
>> > > > >> > >>> > > > > > > >> > > >
>> > > > >> > >>> > > > > > > >> > > > > I attended Tuesday's KIP hangout but
>> > this
>> > > > KIP
>> > > > >> > was
>> > > > >> > >>> not
>> > > > >> > >>> > > > > > discussed
>> > > > >> > >>> > > > > > > >> due
>> > > > >> > >>> > > > > > > >> > to
>> > > > >> > >>> > > > > > > >> > > > > time constraint.
>> > > > >> > >>> > > > > > > >> > > > >
>> > > > >> > >>> > > > > > > >> > > > > However, after hearing discussion of
>> > > > KIP-35,
>> > > > >> I
>> > > > >> > >>> have
>> > > > >> > >>> > the
>> > > > >> > >>> > > > > > feeling
>> > > > >> > >>> > > > > > > >> that
>> > > > >> > >>> > > > > > > >> > > > > incompatibility (caused by new
>> broker
>> > > > >> property)
>> > > > >> > >>> > between
>> > > > >> > >>> > > > > > brokers
>> > > > >> > >>> > > > > > > >> with
>> > > > >> > >>> > > > > > > >> > > > > different versions  will be solved
>> > there.
>> > > > In
>> > > > >> > >>> addition,
>> > > > >> > >>> > > > > having
>> > > > >> > >>> > > > > > > >> rack
>> > > > >> > >>> > > > > > > >> > in
>> > > > >> > >>> > > > > > > >> > > > > broker property as meta data may
>> also
>> > > help
>> > > > >> > >>> consumers
>> > > > >> > >>> > in
>> > > > >> > >>> > > > the
>> > > > >> > >>> > > > > > > >> future.
>> > > > >> > >>> > > > > > > >> > So
>> > > > >> > >>> > > > > > > >> > > I
>> > > > >> > >>> > > > > > > >> > > > am
>> > > > >> > >>> > > > > > > >> > > > > open to adding rack property to
>> > broker.
>> > > > >> > >>> > > > > > > >> > > > >
>> > > > >> > >>> > > > > > > >> > > > > Hopefully we can discuss this in the
>> > next
>> > > > KIP
>> > > > >> > >>> hangout.
>> > > > >> > >>> > > > > > > >> > > > >
>> > > > >> > >>> > > > > > > >> > > > > On Wed, Sep 30, 2015 at 2:46 PM,
>> Allen
>> > > > Wang <
>> > > > >> > >>> > > > > > > allenxw...@gmail.com
>> > > > >> > >>> > > > > > > >> >
>> > > > >> > >>> > > > > > > >> > > > wrote:
>> > > > >> > >>> > > > > > > >> > > > >
>> > > > >> > >>> > > > > > > >> > > > >> Can you send me the information on
>> the
>> > > > next
>> > > > >> KIP
>> > > > >> > >>> > > hangout?
>> > > > >> > >>> > > > > > > >> > > > >>
>> > > > >> > >>> > > > > > > >> > > > >> Currently the broker-rack mapping is not cached. In KafkaApis,
>> > > > >> > >>> > > > > > > >> > > > >> RackLocator.getRackInfo() is called each time the mapping is needed
>> > > > >> > >>> > > > > > > >> > > > >> for auto topic creation. This will ensure latest mapping is used at
>> > > > >> > >>> > > > > > > >> > > > >> any time.
>> > > > >> > >>> > > > > > > >> > > > >>
>> > > > >> > >>> > > > > > > >> > > > >> The ability to get the complete mapping makes it simple to reuse the
>> > > > >> > >>> > > > > > > >> > > > >> same interface in command line tools.
>> > > > >> > >>> > > > > > > >> > > > >>
>> > > > >> > >>> > > > > > > >> > > > >>
>> > > > >> > >>> > > > > > > >> > > > >> On Wed, Sep 30, 2015 at 11:01 AM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:
>> > > > >> > >>> > > > > > > >> > > > >>
>> > > > >> > >>> > > > > > > >> > > > >>> Perhaps we discuss this during the next KIP hangout?
>> > > > >> > >>> > > > > > > >> > > > >>>
>> > > > >> > >>> > > > > > > >> > > > >>> I do see that a pluggable rack locator can be useful but I do see a
>> > > > >> > >>> > > > > > > >> > > > >>> few concerns:
>> > > > >> > >>> > > > > > > >> > > > >>>
>> > > > >> > >>> > > > > > > >> > > > >>> - The RackLocator (as described in the document), implies that it
>> > > > >> > >>> > > > > > > >> > > > >>> can discover rack information for any node in the cluster. How does
>> > > > >> > >>> > > > > > > >> > > > >>> it deal with rack location changes? For example, if I moved broker
>> > > > >> > >>> > > > > > > >> > > > >>> id (1) from rack X to Y, I only have to start that broker with a
>> > > > >> > >>> > > > > > > >> > > > >>> newer rack config. If RackLocator discovers broker -> rack
>> > > > >> > >>> > > > > > > >> > > > >>> information at start up time, any change to a broker will require
>> > > > >> > >>> > > > > > > >> > > > >>> bouncing the entire cluster since createTopic requests can be sent
>> > > > >> > >>> > > > > > > >> > > > >>> to any node in the cluster.
>> > > > >> > >>> > > > > > > >> > > > >>> For this reason it may be simpler to have each node be aware of its
>> > > > >> > >>> > > > > > > >> > > > >>> own rack and persist it in ZK during start up time.
>> > > > >> > >>> > > > > > > >> > > > >>>
>> > > > >> > >>> > > > > > > >> > > > >>> - A pluggable RackLocator relies on an external service being
>> > > > >> > >>> > > > > > > >> > > > >>> available to serve rack information.
>> > > > >> > >>> > > > > > > >> > > > >>>
>> > > > >> > >>> > > > > > > >> > > > >>> Out of curiosity, I looked up how a couple of other systems deal
>> > > > >> > >>> > > > > > > >> > > > >>> with zone/rack awareness.
>> > > > >> > >>> > > > > > > >> > > > >>> For Cassandra some interesting modes are:
>> > > > >> > >>> > > > > > > >> > > > >>> (Property File configuration)
>> > > > >> > >>> > > > > > > >> > > > >>> http://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureSnitchPFSnitch_t.html
>> > > > >> > >>> > > > > > > >> > > > >>> (Dynamic inference)
>> > > > >> > >>> > > > > > > >> > > > >>> http://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureSnitchRackInf_c.html
>> > > > >> > >>> > > > > > > >> > > > >>>
>> > > > >> > >>> > > > > > > >> > > > >>> Voldemort does a static node -> zone assignment based on
>> > > > >> > >>> > > > > > > >> > > > >>> configuration.
>> > > > >> > >>> > > > > > > >> > > > >>>
>> > > > >> > >>> > > > > > > >> > > > >>> Aditya
>> > > > >> > >>> > > > > > > >> > > > >>>
>> > > > >> > >>> > > > > > > >> > > > >>> On Wed, Sep 30, 2015 at 10:05 AM, Allen Wang <allenxw...@gmail.com> wrote:
>> > > > >> > >>> > > > > > > >> > > > >>>
>> > > > >> > >>> > > > > > > >> > > > >>> > I would like to see if we can do both:
>> > > > >> > >>> > > > > > > >> > > > >>> >
>> > > > >> > >>> > > > > > > >> > > > >>> > - Make RackLocator pluggable to facilitate migration with existing
>> > > > >> > >>> > > > > > > >> > > > >>> > broker-rack mapping
>> > > > >> > >>> > > > > > > >> > > > >>> >
>> > > > >> > >>> > > > > > > >> > > > >>> > - Make rack an optional property for broker. If rack is available
>> > > > >> > >>> > > > > > > >> > > > >>> > from broker, treat it as source of truth. For users with existing
>> > > > >> > >>> > > > > > > >> > > > >>> > broker-rack mapping somewhere else, they can use the pluggable way
>> > > > >> > >>> > > > > > > >> > > > >>> > or they can transfer the mapping to the broker rack property.
>> > > > >> > >>> > > > > > > >> > > > >>> >
>> > > > >> > >>> > > > > > > >> > > > >>> > One thing I am not sure about is what happens at rolling upgrade
>> > > > >> > >>> > > > > > > >> > > > >>> > when we have rack as a broker property. For brokers with an older
>> > > > >> > >>> > > > > > > >> > > > >>> > version of Kafka, will it cause problems for them? If so, is there
>> > > > >> > >>> > > > > > > >> > > > >>> > any workaround? I also think it would be better not to have rack in
>> > > > >> > >>> > > > > > > >> > > > >>> > the controller wire protocol, but I am not sure if it is achievable.
>> > > > >> > >>> > > > > > > >> > > > >>> >
>> > > > >> > >>> > > > > > > >> > > > >>> > Thanks,
>> > > > >> > >>> > > > > > > >> > > > >>> > Allen
>> > > > >> > >>> > > > > > > >> > > > >>> >
>> > > > >> > >>> > > > > > > >> > > > >>> >
>> > > > >> > >>> > > > > > > >> > > > >>> >
>> > > > >> > >>> > > > > > > >> > > > >>> >
>> > > > >> > >>> > > > > > > >> > > > >>> >
>> > > > >> > >>> > > > > > > >> > > > >>> > On Mon, Sep 28, 2015 at 4:55 PM, Todd Palino <tpal...@gmail.com> wrote:
>> > > > >> > >>> > > > > > > >> > > > >>> >
>> > > > >> > >>> > > > > > > >> > > > >>> > > I tend to like the idea of a pluggable locator. For example, we
>> > > > >> > >>> > > > > > > >> > > > >>> > > already have an interface for discovering information about the
>> > > > >> > >>> > > > > > > >> > > > >>> > > physical location of servers. I don't relish the idea of having to
>> > > > >> > >>> > > > > > > >> > > > >>> > > maintain data in multiple places.
>> > > > >> > >>> > > > > > > >> > > > >>> > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > -Todd
>> > > > >> > >>> > > > > > > >> > > > >>> > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > On Mon, Sep 28, 2015 at 4:48 PM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:
>> > > > >> > >>> > > > > > > >> > > > >>> > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > Thanks for starting this KIP Allen.
>> > > > >> > >>> > > > > > > >> > > > >>> > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > I agree with Gwen that having a RackLocator class that is pluggable
>> > > > >> > >>> > > > > > > >> > > > >>> > > > seems to be too complex. The KIP refers to potentially non-ZK
>> > > > >> > >>> > > > > > > >> > > > >>> > > > storage for the rack info which I don't think is necessary.
>> > > > >> > >>> > > > > > > >> > > > >>> > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > Perhaps we can persist this info in zk under /brokers/ids/<broker_id>
>> > > > >> > >>> > > > > > > >> > > > >>> > > > similar to other broker properties and add a config in KafkaConfig
>> > > > >> > >>> > > > > > > >> > > > >>> > > > called "rack".
>> > > > >> > >>> > > > > > > >> > > > >>> > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > {"jmx_port":-1,"endpoints":[...],"host":"xxx","port":yyy, "rack": "abc"}
>> > > > >> > >>> > > > > > > >> > > > >>> > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > Aditya
>> > > > >> > >>> > > > > > > >> > > > >>> > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > On Mon, Sep 28, 2015 at 2:30 PM, Gwen Shapira <g...@confluent.io> wrote:
>> > > > >> > >>> > > > > > > >> > > > >>> > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > Hi,
>> > > > >> > >>> > > > > > > >> > > > >>> > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > First, thanks for putting out a KIP for this. This is super
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > important for production deployments of Kafka.
>> > > > >> > >>> > > > > > > >> > > > >>> > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > Few questions:
>> > > > >> > >>> > > > > > > >> > > > >>> > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > 1) Are we sure we want "as many racks as possible"? I'd want to
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > balance between safety (more racks) and network utilization
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > (traffic within a rack uses the high-bandwidth TOR switch). One
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > replica on a different rack and the rest on same rack (if possible)
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > sounds better to me.
>> > > > >> > >>> > > > > > > >> > > > >>> > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > 2) Rack-locator class seems overly complex compared to adding a
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > rack.number property to the broker properties file. Why do we want
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > that?
>> > > > >> > >>> > > > > > > >> > > > >>> > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > Gwen
>> > > > >> > >>> > > > > > > >> > > > >>> > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > On Mon, Sep 28, 2015 at 12:15 PM, Allen Wang <allenxw...@gmail.com> wrote:
>> > > > >> > >>> > > > > > > >> > > > >>> > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > > Hello Kafka Developers,
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > > I just created KIP-36 for rack aware replica assignment.
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > > The goal is to utilize the isolation provided by the racks in data
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > > center and distribute replicas to racks to provide fault tolerance.
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > > Comments are welcome.
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > >
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > > Thanks,
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > > Allen
>> > > > >> > >>> > > > > > > >> > > > >>> > > > > >
>> > > > >> > >>>
>> > > > >> > >>> --
>> > > > >> > >>> Thanks,
>> > > > >> > >>> Neha
>
>
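
For reference, the rolling-upgrade concern in the quoted thread comes down to whether a reader of the broker registration JSON tolerates a "rack" field that may or may not be present. The following is only a minimal sketch of that idea, not Kafka's ZkUtils code: the names BrokerInfo and parse_broker_registration are hypothetical, the port 9092 stands in for the "yyy" placeholder, and the empty endpoints list stands in for the elided "[...]".

```python
import json
from typing import NamedTuple, Optional


class BrokerInfo(NamedTuple):
    """Hypothetical holder for the fields this sketch cares about."""
    host: str
    port: int
    rack: Optional[str]  # None when the registering broker did not set a rack


def parse_broker_registration(raw: str) -> BrokerInfo:
    """Parse a broker registration string, treating "rack" as optional.

    Unknown extra fields are ignored, so a registration written by a newer
    broker does not break this reader.
    """
    data = json.loads(raw)
    return BrokerInfo(host=data["host"], port=data["port"], rack=data.get("rack"))


# Registration without a rack (assumed shape for a broker that predates the field).
old = parse_broker_registration(
    '{"jmx_port":-1,"endpoints":[],"host":"xxx","port":9092}'
)

# Registration with the proposed rack property.
new = parse_broker_registration(
    '{"jmx_port":-1,"endpoints":[],"host":"xxx","port":9092,"rack":"abc"}'
)

print(old.rack, new.rack)
```

A reader written this way accepts both shapes, which is the property that would let brokers with and without the rack field coexist during a rolling upgrade.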
