I'll be happy to give the initial design a go, but will probably only get
to it after Strata.

So either wait a bit (there are enough KIPs to review ;) or someone else
can get started.

Gwen

On Thu, Feb 12, 2015 at 6:55 PM, Joel Koshy <jjkosh...@gmail.com> wrote:

> +1 on investigating it further as a separate feature that will improve
> ops significantly (especially since an expert on the operations side
> has described use cases from actual experience).
>
> On Thu, Feb 12, 2015 at 05:47:50PM -0800, Gwen Shapira wrote:
> > I REALLY like the idea of supporting a separate network for inter-broker
> > communication (and probably Zookeeper too).
> > I think it's actually a pretty typical configuration in clusters, so I'm
> > surprised we didn't think of it before :)
> > Servers arrive with multiple cards specifically for "admin nic" vs.
> > "clients nic" vs. "storage nic".
> >
> > That said, I'd like to handle it in a separate patch. First, because
> > KAFKA-1809 is big enough already, and second, because this really
> > deserves its own requirement-gathering and design.
> >
> > Does that make sense?
> >
> > Gwen
> >
> >
> >
> > On Thu, Feb 12, 2015 at 12:34 PM, Todd Palino <tpal...@gmail.com> wrote:
> >
> > > The idea is more about isolating the intra-cluster traffic from the
> > > normal clients as much as possible. There are a couple of situations
> > > we've seen where this would be useful that I can think of immediately:
> > >
> > > 1) Normal operation - just having the intra-cluster traffic on a
> > > separate network interface would allow it to not get overwhelmed by
> > > something like a bootstrapping client that is saturating the network
> > > interface. We see this fairly often, where replication falls behind
> > > because of heavy traffic from one application. We can always adjust
> > > the network threads, but segregating the traffic is the first step.
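With the multi-endpoint support being worked on in KAFKA-1809, this kind of segregation might look roughly like the sketch below. The property names and the REPLICATION label are illustrative assumptions, not the final configuration:

```properties
# Illustrative sketch only -- these are not the final KAFKA-1809 names.
# One endpoint per NIC: clients on one interface, replication on another.
listeners=PLAINTEXT://client-nic.example.com:9092,REPLICATION://admin-nic.example.com:9093
# Brokers would use the second endpoint to talk to each other, so a
# bootstrapping client saturating the client NIC cannot starve replication.
inter.broker.listener=REPLICATION
```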
> > >
> > > 2) Isolation in case of an error - We have had situations, more than
> > > once, where we needed to rebuild a cluster after a catastrophic
> > > problem and the clients were causing that process to take too long, or
> > > were causing additional failures. This has mostly come into play with
> > > file descriptor limits in the past, but it's certainly not the only
> > > situation. Constantly reconnecting clients kept causing the brokers to
> > > fall over while we were trying to recover a down cluster. The only
> > > solution was to firewall off all the clients temporarily. This is a
> > > great deal more complicated if the brokers and the clients are all
> > > operating over the same port.
> > >
> > > Now, that said, quotas can be a partial solution to this. I don't want
> > > to jump the gun on that discussion (because it's going to come up
> > > separately and in more detail), but it is possible to structure quotas
> > > in a way that will allow intra-cluster replication to continue to
> > > function in the case of high load. That would partially address case
> > > 1, but it does nothing for case 2. Additionally, I think it is also
> > > desirable to segregate the traffic even with quotas, so that
> > > regardless of the client load, the cluster itself is able to stay
> > > healthy.
> > >
> > > -Todd
> > >
> > >
> > > On Thu, Feb 12, 2015 at 11:38 AM, Jun Rao <j...@confluent.io> wrote:
> > >
> > > > Todd,
> > > >
> > > > Could you elaborate on the benefit of having a separate endpoint for
> > > > intra-cluster communication? Is it mainly for giving intra-cluster
> > > > requests a higher priority? At this moment, having a separate
> > > > endpoint just means that the socket connection for the intra-cluster
> > > > communication is handled by a separate acceptor thread. The
> > > > processing of the requests from the network and the handling of the
> > > > requests are each still shared by a single thread pool. So, if
> > > > either thread pool is exhausted, the intra-cluster requests will
> > > > still be delayed. We can potentially change this model, but this
> > > > requires more work.
> > > >
> > > > An alternative is to just rely on quotas. Intra-cluster requests
> > > > will be exempt from any kind of throttling.
> > > >
> > > > Gwen,
> > > >
> > > > I agree that defaulting wire.protocol.version to the current version
> > > > is probably better. It just means that we need to document the
> > > > migration path for previous versions.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Wed, Feb 11, 2015 at 6:33 PM, Todd Palino <tpal...@gmail.com>
> wrote:
> > > >
> > > > > Thanks, Gwen. This looks good to me as far as the wire protocol
> > > > > versioning goes. I agree with you on defaulting to the new wire
> > > > > protocol version for new installs. I think it will also need to be
> > > > > very clear in the documentation (to the general installer of
> > > > > Kafka, not just developers) when the wire protocol version changes
> > > > > moving forward, and what the risk/benefit of changing to the new
> > > > > version is.
> > > > >
> > > > > Since a rolling upgrade of the intra-cluster protocol is
> > > > > supported, will a rolling downgrade work as well? Should a flaw
> > > > > (bug, security, or otherwise) be discovered after an upgrade, is
> > > > > it possible to change the wire.protocol.version back to 0.8.2 and
> > > > > do a rolling bounce?
> > > > >
> > > > > On the host/port/protocol specification, specifically the ZK
> > > > > config format, is it possible to have an un-advertised endpoint? I
> > > > > would see this as potentially useful if you wanted to have an
> > > > > endpoint that you are reserving for intra-cluster communication,
> > > > > and you would prefer not to have it advertised at all. Perhaps it
> > > > > is blocked by a firewall rule or other authentication method. This
> > > > > could also allow you to duplicate a security protocol type but
> > > > > segregate it on a different port or interface (if it is
> > > > > unadvertised, there is no ambiguity for clients as to which
> > > > > endpoint should be selected). I believe I asked about that
> > > > > previously, but I didn't track what the final outcome was, or even
> > > > > whether it was discussed further.
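An unadvertised endpoint along those lines could plausibly be expressed by letting the advertised set be a subset of the bound listeners. A hypothetical sketch; the property names and format here are assumptions, not the outcome of the KIP discussion:

```properties
# Hypothetical sketch -- property names are assumptions, not the agreed
# KIP-2 format. Bind three endpoints, but advertise only the first two:
listeners=PLAINTEXT://broker1:9092,SSL://broker1:9093,REPLICATION://broker1:9094
advertised.listeners=PLAINTEXT://broker1:9092,SSL://broker1:9093
# The third endpoint never appears in client metadata, so brokers can
# reserve it for intra-cluster traffic (or block it with a firewall rule).
```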
> > > > >
> > > > >
> > > > > -Todd
> > > > >
> > > > >
> > > > > On Wed, Feb 11, 2015 at 4:38 PM, Gwen Shapira <
> gshap...@cloudera.com>
> > > > > wrote:
> > > > >
> > > > > > Added Jun's notes to the KIP (Thanks for explaining so clearly,
> Jun.
> > > I
> > > > > was
> > > > > > clearly struggling with this...) and removed the reference to
> > > > > > use.new.wire.protocol.
> > > > > >
> > > > > > On Wed, Feb 11, 2015 at 4:19 PM, Joel Koshy <jjkosh...@gmail.com
> >
> > > > wrote:
> > > > > >
> > > > > > > The description that Jun gave for (2) was the detail I was
> > > > > > > looking for - Gwen, can you update the KIP with that for
> > > > > > > completeness/clarity?
> > > > > > >
> > > > > > > I'm +1 as well overall. However, I think it would be good if
> > > > > > > we also get an ack from someone who is more experienced on the
> > > > > > > operations side (say, Todd) to review especially the upgrade
> > > > > > > plan.
> > > > > > >
> > > > > > > On Wed, Feb 11, 2015 at 09:40:50AM -0800, Jun Rao wrote:
> > > > > > > > +1 for proposed changes in 1 and 2.
> > > > > > > >
> > > > > > > > 1. The impact is that if someone uses SimpleConsumer and
> > > > > > > > references Broker explicitly, the application needs a code
> > > > > > > > change to compile with 0.8.3. Since SimpleConsumer is not
> > > > > > > > widely used, breaking the API in SimpleConsumer while
> > > > > > > > maintaining overall code cleanliness seems to be the better
> > > > > > > > tradeoff.
> > > > > > > >
> > > > > > > > 2. For clarification, the issue is the following. In 0.8.3,
> > > > > > > > we will be evolving the wire protocol of
> > > > > > > > UpdateMetadataRequest (to send info about endpoints for
> > > > > > > > different security protocols). Since this is used in
> > > > > > > > intra-cluster communication, we need to do the upgrade in
> > > > > > > > two steps. The idea is that in 0.8.3, we will default
> > > > > > > > wire.protocol.version to 0.8.2. When upgrading to 0.8.3, in
> > > > > > > > step 1, we do a rolling upgrade to 0.8.3. After step 1, all
> > > > > > > > brokers will be capable of processing the new protocol in
> > > > > > > > 0.8.3, but without actually using it. In step 2, we
> > > > > > > > configure wire.protocol.version to 0.8.3 in each broker and
> > > > > > > > do another rolling restart. After step 2, all brokers will
> > > > > > > > start using the new protocol in 0.8.3. Let's say that in the
> > > > > > > > next release, 0.9, we change the intra-cluster wire protocol
> > > > > > > > again. We will do the same thing: default
> > > > > > > > wire.protocol.version to 0.8.3 in 0.9 so that people can
> > > > > > > > upgrade from 0.8.3 to 0.9 in two steps. People who want to
> > > > > > > > upgrade from 0.8.2 to 0.9 directly will have to configure
> > > > > > > > wire.protocol.version to 0.8.2 first and then do the
> > > > > > > > two-step upgrade to 0.9.
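Concretely, the two-step upgrade described above would look something like this in each broker's server.properties (using the wire.protocol.version name from this thread):

```properties
# Step 1: upgrade every broker binary to 0.8.3 with a rolling restart,
# while still speaking the old intra-cluster protocol (the 0.8.3 default):
wire.protocol.version=0.8.2

# Step 2: once all brokers run 0.8.3, flip the version and do a second
# rolling restart; brokers then start using the new protocol:
# wire.protocol.version=0.8.3
```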
> > > > > > > >
> > > > > > > > Gwen,
> > > > > > > >
> > > > > > > > In KIP2, there is still a reference to use.new.protocol.
> > > > > > > > This needs to be removed. Also, would it be better to use
> > > > > > > > intra.cluster.wire.protocol.version, since this only applies
> > > > > > > > to the wire protocol among brokers?
> > > > > > > >
> > > > > > > > Others,
> > > > > > > >
> > > > > > > > The patch in KAFKA-1809 is almost ready. It would be good to
> > > > > > > > wrap up the discussion on KIP2 soon. So, if you haven't
> > > > > > > > looked at this KIP, please take a look and send your
> > > > > > > > comments.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Jan 26, 2015 at 8:02 PM, Gwen Shapira <
> > > > gshap...@cloudera.com
> > > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Kafka Devs,
> > > > > > > > >
> > > > > > > > > While reviewing the patch for KAFKA-1809, we came across
> two
> > > > > > questions
> > > > > > > > > that we are interested in hearing the community out on.
> > > > > > > > >
> > > > > > > > > 1. This patch changes the Broker class and adds a new
> > > > > > > > > class, BrokerEndPoint, that behaves like the previous
> > > > > > > > > Broker.
> > > > > > > > >
> > > > > > > > > While technically kafka.cluster.Broker is not part of the
> > > > > > > > > public API, it is returned by javaapi and used with the
> > > > > > > > > SimpleConsumer.
> > > > > > > > >
> > > > > > > > > Getting replicas from PartitionMetadata will now return
> > > > > > > > > BrokerEndPoint instead of Broker. All method calls remain
> > > > > > > > > the same, but since we return a new type, we break the
> > > > > > > > > API.
> > > > > > > > >
> > > > > > > > > Note that this breakage does not prevent upgrades -
> > > > > > > > > existing SimpleConsumers will continue working (because we
> > > > > > > > > are wire-compatible). The only thing that won't work is
> > > > > > > > > building SimpleConsumers with a dependency on Kafka
> > > > > > > > > versions higher than 0.8.2. Arguably, we don't want anyone
> > > > > > > > > to do that anyway :)
> > > > > > > > >
> > > > > > > > > So: do we state that the highest release on which
> > > > > > > > > SimpleConsumers can depend is 0.8.2? Or shall we keep
> > > > > > > > > Broker as-is and create an UberBroker which will contain
> > > > > > > > > multiple brokers as its endpoints?
> > > > > > > > >
> > > > > > > > > 2.
> > > > > > > > > The KIP suggests a "use.new.wire.protocol" configuration
> > > > > > > > > to decide which protocol the brokers will use to talk to
> > > > > > > > > each other. The problem is that after the next upgrade,
> > > > > > > > > the wire protocol is no longer new, so we'll have to reset
> > > > > > > > > it to false for the following upgrade, then change it to
> > > > > > > > > true again... and upgrading across more than a single
> > > > > > > > > version will be impossible.
> > > > > > > > > Bad idea :)
> > > > > > > > >
> > > > > > > > > As an alternative, we can have a property for each
> > > > > > > > > version and set one of them to true. Or (simpler, I
> > > > > > > > > think) have a "wire.protocol.version" property and accept
> > > > > > > > > version numbers (0.8.2, 0.8.3, 0.9) as values.
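The shape of the two alternatives, sketched below; the boolean property name is hypothetical, invented here only for contrast:

```properties
# Option A: one boolean per version -- grows with every release and makes
# multi-version jumps awkward (hypothetical name):
# use.wire.protocol.0.8.3=true

# Option B: a single version-valued property, as proposed:
wire.protocol.version=0.8.3
```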
> > > > > > > > >
> > > > > > > > > Please share your thoughts :)
> > > > > > > > >
> > > > > > > > > Gwen
> > > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>
>