Re: KAFKA-50 replication support and the Disruptor

Erik van Oosten Mon, 31 Oct 2011 09:22:44 -0700

No problems.

You could first program all tasks serially and then convert to using the 
disruptor later. With cleanly separated tasks this should be very easy to do. 
So easy in fact that it might not be worth the wait ;)
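
A minimal sketch of what "cleanly separated tasks, run serially" could look like in plain Java. Everything here is an assumption made for illustration: the `MessageEvent` type, its fields, and the handler bodies are hypothetical placeholders, not Kafka or Disruptor code. The point is only that each task is its own handler, so the serial wiring in `handle` can later be swapped for a different one without touching the handlers.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.zip.CRC32;

final class SerialPipeline {

    /** Hypothetical event type; the fields are illustrative only. */
    static final class MessageEvent {
        final byte[] payload;
        long offset = -1;
        long checksum;
        final List<String> trace = new ArrayList<>();

        MessageEvent(byte[] payload) { this.payload = payload; }
    }

    private long nextOffset = 0;

    // Each task is a separate handler that only touches the event it is given.
    final Consumer<MessageEvent> preprocessor = e -> {
        CRC32 crc = new CRC32();
        crc.update(e.payload);
        e.checksum = crc.getValue();
        e.offset = nextOffset;
        nextOffset += e.payload.length;
        e.trace.add("preprocessed");
    };
    // Placeholders: a real journaller would write to the log,
    // a real replicator would send to the replica and wait for the ack.
    final Consumer<MessageEvent> journaller = e -> e.trace.add("journalled");
    final Consumer<MessageEvent> replicator = e -> e.trace.add("replicated");
    final Consumer<MessageEvent> notifier = e -> e.trace.add("notified");

    /** Serial wiring: every handler runs in order on the calling thread. */
    void handle(MessageEvent e) {
        preprocessor.accept(e);
        journaller.accept(e);
        replicator.accept(e);
        notifier.accept(e);
    }

    public static void main(String[] args) {
        SerialPipeline pipeline = new SerialPipeline();
        MessageEvent first = new MessageEvent("hello".getBytes(StandardCharsets.UTF_8));
        MessageEvent second = new MessageEvent("world!".getBytes(StandardCharsets.UTF_8));
        pipeline.handle(first);
        pipeline.handle(second);
        System.out.println(first.offset + " " + first.trace);
        System.out.println(second.offset + " " + second.trace);
    }
}
```

Only `handle` knows the order of the tasks, so moving to a concurrent wiring later should mean replacing that one method.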


BTW I respect the non-trivialness of multi-broker replication. That reminds me, 
is there an update of the proposal design documents?

Kind regards,
     Erik.

--
Erik van Oosten
http://day-to-day-stuff.blogspot.com

On 31 okt. 2011, at 17:06, Jun Rao wrote:

> Erik,
> 
> Thanks for the pointer. This could be a useful optimization. However, we
> probably don't want to optimize too early until we understand the use
> cases. Also, the replication logic itself is already non-trivial. Perhaps
> we can look into this after the first version of replication is done?
> 
> Thanks,
> 
> Jun
> 
> On Mon, Oct 31, 2011 at 1:23 AM, Erik van Oosten <e.vanoos...@grons.nl> wrote:
> 
>>> This is interesting. But wouldn't the cost of the I/O here (writing to log,
>>> requests to slave nodes) completely dominate the cost of locks?
>> 
>> 
>> That is not the point (mostly). While you're waiting for a lock, you can't
>> issue another IO request. Avoiding locking is worthwhile even if CPU is not
>> the bottleneck. The advantage is that you'll get lower latency and, just as
>> important, less jitter.
>> 
>> As you know, given the right hardware, sequential writes to disk are
>> already very fast. If you jump through some hoops (e.g. avoid TCP, use user
>> space IP stack) the same applies to the network. In a carefully coded async
>> system, I am not convinced up front it would dominate locking overhead.
>> 
>> In fact, what the LMAX guys found out is that the synchronization overhead
>> of e.g. an ArrayBlockingQueue (the fastest queue they could find)
>> completely dwarfs any other CPU processing you might want to do. In their
>> setup they process 6M messages per second on (by now) old hardware. That
>> includes journalling, replicating to a standby node, doing some financial
>> transaction stuff and then sending a reply in lock step with the standby
>> node.
>> 
>> Kind regards,
>>    Erik.
>> 
>> --
>> Erik van Oosten
>> http://day-to-day-stuff.blogspot.com
>> 
>> On 30 okt. 2011, at 22:02, Jay Kreps wrote:
>> 
>>> This is interesting. But wouldn't the cost of the I/O here (writing to log,
>>> requests to slave nodes) completely dominate the cost of locks?
>>> 
>>> -Jay
>>> 
>>> On Sun, Oct 30, 2011 at 1:01 PM, Erik van Oosten <e.vanoos...@grons.nl
>>> wrote:
>>> 
>>>> Hello,
>>>> 
>>>> The upcoming replication support (which we eagerly anticipate at my work)
>>>> is a feature for which LMAX' disruptor is an ideal solution (
>>>> http://code.google.com/p/disruptor/, Apache licensed). A colleague has in
>>>> fact just started on a new replicating message broker based on it
>>>> (https://github.com/cdegroot/underground).
>>>> 
>>>> The disruptor itself is a super-performing in-jvm consumer/producer
>>>> system. A consumer normally works in its own thread. The disruptor gets
>>>> most of its speed because it is designed such that each consumer can
>>>> continue working without releasing the CPU to the OS or other threads. In
>>>> addition it is optimized for modern CPU architectures, for example by
>>>> respecting the way the CPU cache works, and by avoiding all locking, CAS
>>>> operations and even by keeping volatile read/writes to a minimum.
>>>> Consumers may depend on the work of other consumers. The disruptor will
>>>> only offer new messages (in bulk if possible) once they have been
>>>> processed by the preceding consumers.
>>>> 
>>>> For Kafka-50 we can (for example) think of the following tasks:
>>>> -a- get incoming new messages (the producer)
>>>> -b- pre-processor (calculate checksum and offset)
>>>> -c- write to journal
>>>> -d- write to replica broker, wait for confirmation
>>>> -e- notify consumers (no changes here)
>>>> 
>>>> With the disruptor the main flow would be coded as:
>>>> 
>>>> disruptor
>>>> .handleEventsWith(preprocessor)
>>>> .then(journaller, replicator)
>>>> .then(notifier)
>>>> 
>>>> Journaling and replicating a message are thus executed in parallel.
>>>> 
>>>> When this approach is considered, feel free to ask me about the disruptor.
>>>> Hopefully I will also find some time to write some code as well.
>>>> 
>>>> Kind regards,
>>>>  Erik.
>>>> 
>>>> 
>>>> --
>>>> Erik van Oosten
>>>> http://www.day-to-day-stuff.blogspot.com/
>>>> 
>>>> 
>>>> 
>> 
>>
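
The dependency graph from the original proposal (preprocessor first, then journaller and replicator in parallel, then notifier) can be approximated with the JDK alone. The sketch below uses `CompletableFuture`, which is a different technique from the Disruptor: it relies on ordinary executor queues and so does not have the lock-free ring-buffer properties discussed in this thread. The class name, the `process` method, and the string trace are hypothetical and exist only to show the ordering.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class DiamondPipeline {

    /** Runs the four hypothetical tasks for one message and returns the order they ran in. */
    static List<String> process(String message, ExecutorService pool) {
        List<String> trace = new CopyOnWriteArrayList<>();

        // Step b: the preprocessor runs first.
        CompletableFuture<Void> preprocessed =
                CompletableFuture.runAsync(() -> trace.add("preprocess"), pool);

        // Steps c and d: both depend only on the preprocessor, so they may run in parallel.
        CompletableFuture<Void> journalled =
                preprocessed.thenRunAsync(() -> trace.add("journal"), pool);
        CompletableFuture<Void> replicated =
                preprocessed.thenRunAsync(() -> trace.add("replicate"), pool);

        // Step e: the notifier runs only after both c and d have completed.
        CompletableFuture.allOf(journalled, replicated)
                .thenRun(() -> trace.add("notify"))
                .join();
        return trace;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // "journal" and "replicate" may appear in either order.
            System.out.println(process("hello", pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

The trace always starts with the preprocessor and ends with the notifier; only the relative order of the two middle tasks can vary between runs.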
