Short answer is yes: both async (acks = 0 or 1) and sync replication (acks > 1) will be supported.
-Jay

On Wed, Apr 25, 2012 at 11:22 AM, Jun Rao <[email protected]> wrote:
> Felix,
>
> Initially, we thought we could keep the option of not sending acks from
> the broker to the producer. However, this seems hard since in the new
> wire protocol, we need to send at least the error code to the producer
> (e.g., a request is sent to the wrong broker or wrong partition).
>
> So, what we allow in the current design is the following. The producer
> can specify the # of acks in the request. By default (acks = -1), the
> broker will wait for the message to be written to all replicas that are
> still synced up with the leader before acking the producer. Otherwise
> (acks >= 0), the broker will ack the producer after the message is
> written to acks replicas. Currently, acks = 0 is treated the same as
> acks = 1.
>
> Thanks,
>
> Jun
>
> On Wed, Apr 25, 2012 at 10:39 AM, Felix GV <[email protected]> wrote:
>
>> Just curious, but if I remember correctly from the time I read KAFKA-50
>> and the related JIRA issues, you guys plan to implement sync AND async
>> replication, right?
>>
>> --
>> Felix
>>
>> On Tue, Apr 24, 2012 at 4:42 PM, Jay Kreps <[email protected]> wrote:
>>
>> > Right now we do sloppy failover. That is, when a broker goes down,
>> > traffic is redirected to the remaining machines, but any unconsumed
>> > messages are stuck on that server until it comes back; if it is
>> > permanently gone, the messages are lost. This is acceptable for us in
>> > the near term since our pipeline is pretty real-time, so the window
>> > between production and consumption is pretty small. The complete
>> > solution is the intra-cluster replication in KAFKA-50, which is
>> > coming along fairly nicely now that we are working on it.
>> >
>> > -Jay
>> >
>> > On Tue, Apr 24, 2012 at 12:21 PM, Oliver Krohne
>> > <[email protected]> wrote:
>> > > Hi,
>> > >
>> > > indeed, I thought it could be used as a failover approach.
>> > >
>> > > We use RAID for local redundancy, but it does not protect us in
>> > > case of a machine failure, so I am looking for a way to achieve a
>> > > master/slave setup until KAFKA-50 has been implemented.
>> > >
>> > > I think we can solve it for now by having multiple brokers so that
>> > > the application can continue sending messages if one broker goes
>> > > down. My main concern is to not introduce a new single point of
>> > > failure which can stop the application. However, as some consumers
>> > > are not developed by us and it is not clear how they store the
>> > > offset in zookeeper, we need to find out how we can manage the
>> > > consumers in case a broker never returns after a failure. It would
>> > > be acceptable to lose a couple of messages if a broker dies and the
>> > > consumers have not consumed all messages at the point of failure.
>> > >
>> > > Thanks,
>> > > Oliver
>> > >
>> > > On 23.04.2012, at 19:58, Jay Kreps wrote:
>> > >
>> > >> I think the confusion comes from the fact that we are using
>> > >> mirroring to handle geographic distribution, not failover. If I
>> > >> understand correctly, what Oliver is asking for is something to
>> > >> give fault tolerance, not something for distribution. I don't
>> > >> think that is really what the mirroring does out of the box,
>> > >> though technically I suppose you could just reset the offsets and
>> > >> point the consumer at the new cluster and have it start from
>> > >> "now".
>> > >>
>> > >> I think it would be helpful to document our use case in the
>> > >> mirroring docs since this is not the first time someone has asked
>> > >> about this.
>> > >>
>> > >> -Jay
>> > >>
>> > >> On Mon, Apr 23, 2012 at 10:38 AM, Joel Koshy <[email protected]> wrote:
>> > >>
>> > >>> Hi Oliver,
>> > >>>
>> > >>>> I was reading the mirroring guide and I wonder if it is required
>> > >>>> that the mirror runs its own zookeeper?
>> > >>>>
>> > >>>> We have a zookeeper cluster running which is used by different
>> > >>>> applications, so can we use that zookeeper cluster for the kafka
>> > >>>> source and kafka mirror?
>> > >>>
>> > >>> You could have a single zookeeper cluster and use different
>> > >>> namespaces for the source/target mirror. However, I don't think it
>> > >>> is recommended to use a remote zookeeper (if you have a cross-DC
>> > >>> setup) since that would potentially mean very high ZK latencies on
>> > >>> one of your clusters.
>> > >>>
>> > >>>> What is the procedure, if the kafka source server fails, to
>> > >>>> switch the applications to use the mirrored instance?
>> > >>>
>> > >>> I don't quite follow this question - can you clarify? The mirror
>> > >>> cluster is pretty much a separate instance. There is no built-in
>> > >>> automatic fail-over if your source cluster goes down.
>> > >>>
>> > >>>> Are there any backup best practices if we do not use mirroring?
>> > >>>
>> > >>> You can use RAID arrays for (local) data redundancy. You may also
>> > >>> be interested in the (intra-DC) replication feature (KAFKA-50)
>> > >>> that is currently being developed. I believe some folks on this
>> > >>> list have also used plain rsyncs as an alternative to mirroring.
>> > >>>
>> > >>> Thanks,
>> > >>>
>> > >>> Joel
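
To make the ack levels Jun describes above concrete, here is a minimal producer sketch. It uses the 0.8-era Java producer API; the property name "request.required.acks", the broker hosts, and the topic are illustrative and may not match exactly what finally ships with KAFKA-50.

```java
// Illustrative sketch only: property names follow the 0.8-era Java producer
// and may differ in the released KAFKA-50 code. Hosts and topic are made up.
import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class AckLevelExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // hypothetical brokers
        props.put("serializer.class", "kafka.serializer.StringEncoder");

        // Ack levels as described in the thread:
        //  -1 : wait until the message is written to all in-sync replicas (default)
        //   1 : wait for the message to be written to one replica (the leader)
        //   0 : no wait; per Jun's note, currently treated the same as 1
        props.put("request.required.acks", "-1");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>("my-topic", "hello"));
        producer.close();
    }
}
```

The trade-off is the usual one: acks = -1 gives the strongest durability but the producer pays the latency of the slowest in-sync replica, while acks = 0 or 1 behaves like today's async path.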

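Joel's suggestion of sharing one ZooKeeper ensemble between the source and mirror clusters can be expressed with chroot-style namespaces in the connection string. The sketch below is hypothetical: the hosts and paths are made up, and the property is shown under its 0.8-style name "zookeeper.connect" (older releases call it "zk.connect"); it assumes the connection string accepts a chroot suffix.

```java
// Hypothetical sketch: one shared ZooKeeper ensemble, with the source and
// mirror Kafka clusters isolated under their own chroot paths so that their
// broker/consumer metadata never collide. All names here are illustrative.
import java.util.Properties;

public class SharedZookeeperNamespaces {
    public static void main(String[] args) {
        // Brokers and consumers of the source cluster register under /kafka-source.
        Properties sourceCluster = new Properties();
        sourceCluster.put("zookeeper.connect",
                "zk1:2181,zk2:2181,zk3:2181/kafka-source");

        // The mirror cluster uses the same ensemble but a separate namespace.
        Properties mirrorCluster = new Properties();
        mirrorCluster.put("zookeeper.connect",
                "zk1:2181,zk2:2181,zk3:2181/kafka-mirror");

        System.out.println("source: " + sourceCluster.getProperty("zookeeper.connect"));
        System.out.println("mirror: " + mirrorCluster.getProperty("zookeeper.connect"));
    }
}
```

As Joel notes, this only makes sense when both clusters are close to the ZooKeeper ensemble; in a cross-DC mirroring setup, a remote ZooKeeper would add significant latency to one side.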