Internally it works as you describe: there is only one CountDownLatch per
batch sent, and each of the futures is just a thin wrapper around it.
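
To make that concrete, here is a rough sketch of the pattern (this is not
the actual producer source; the class and method names are made up for
illustration): one shared latch is completed per batch, and every
per-record future just delegates to it.

import java.util.concurrent.CountDownLatch;

// Illustrative only: one latch shared by all records in a batch.
class BatchResult {
    private final CountDownLatch latch = new CountDownLatch(1);

    // The I/O thread calls this once when the whole batch is acknowledged.
    void complete() {
        latch.countDown();
    }

    // Each per-record future's get() ends up waiting on this same latch.
    void await() throws InterruptedException {
        latch.await();
    }
}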

It is true that if you accumulate thousands of futures in a list, that is a
fair number of objects to retain, and there will be some work involved in
checking them all. If you are sure they are all going to the same partition,
you can actually just wait on the last future, since sends are ordered within
a partition: once the final send completes, the prior sends should have
completed as well.
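
For example, something like the following (a minimal sketch; the broker
address, topic name, partition, and record count are placeholders, and it
assumes all records really do go to the same partition):

import java.util.Properties;
import java.util.concurrent.Future;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class SinglePartitionSend {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        Future<RecordMetadata> last = null;
        for (int i = 0; i < 100000; i++) {
            // Every record targets partition 0 of "my-topic", so the sends
            // are ordered with respect to one another.
            last = producer.send(
                new ProducerRecord<>("my-topic", 0, "key", "value-" + i));
        }

        // Sends within a partition complete in order, so once the last
        // future is done, all earlier sends in this run are done too.
        if (last != null) {
            last.get();
        }
        producer.close();
    }
}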

Either way, if you see a case where the new producer isn't as fast as the
old producer, let us know.

-Jay



On Thu, Nov 20, 2014 at 4:24 PM, Jason Rosenberg <j...@squareup.com> wrote:

> I've been looking at the new producer api with anticipation, but have not
> fired it up yet.
>
> One question I have is that it looks like there's no longer a 'batch' send
> mode (I get that this is all now handled internally, e.g. you send
> individual messages, which then get collated, batched up, and sent out).
>
> What I'm wondering is whether there's added overhead in the producer (and
> the client code) from having to manage all the Future objects returned for
> all the individual messages sent. If I'm sending 100K messages/second,
> that seems like a lot of async Future objects that have to be tracked and
> waited for. Doesn't this cause some overhead?
>
> If I send a bunch of messages, store all the Futures in a list, and then
> wait for all of them, it seems like a lot of thread contention. On the
> other hand, if I send a batch of messages that are likely all to be sent
> as a single batch over the wire (because they are all going to the same
> partition), wouldn't there be some benefit in only having to wait on a
> single Future object for the batch?
>
> Jason
>
