Re: New Producer - ONLY sync mode?

2015-02-09 Thread Steve Morin
Jay, Thanks, I'll look at that more closely. On Sat, Feb 7, 2015 at 1:23 PM, Jay Kreps jay.kr...@gmail.com wrote: Steve, In terms of mimicking the sync behavior, I think that is what .get() does, no? We are always returning the offset and error information. The example I gave didn't make use

Re: New Producer - ONLY sync mode?

2015-02-08 Thread Jay Kreps
Hey Otis, Yeah, Gwen is correct. The future from the send will be satisfied when the response is received so it will be exactly the same as the performance of the sync producer previously. -Jay On Mon, Feb 2, 2015 at 1:34 PM, Gwen Shapira gshap...@cloudera.com wrote: If I understood the code

Re: New Producer - ONLY sync mode?

2015-02-08 Thread Jay Kreps
I implemented the flush() call I hypothesized earlier in this thread as a patch on KAFKA-1865. So now producer.flush() will block until all buffered requests complete. The post condition is that all previous send futures are satisfied and have error/offset information. This is a little easier
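
The post condition described above (after flush() returns, every earlier send future is complete) can be modeled with plain Java futures. This is a minimal sketch under stated assumptions, not the KAFKA-1865 patch: FlushSketch, its send method, and the made-up offsets are hypothetical stand-ins that use only java.util.concurrent, with no Kafka client involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a producer; it only models the flush() contract.
public class FlushSketch {
    private final List<CompletableFuture<Long>> pending = new ArrayList<>();
    private long nextOffset = 0;

    // Stand-in for an asynchronous send: the returned future is completed
    // on another thread with a made-up offset.
    public synchronized CompletableFuture<Long> send(String message) {
        final long offset = nextOffset++;
        CompletableFuture<Long> future = CompletableFuture.supplyAsync(() -> offset);
        pending.add(future);
        return future;
    }

    // Blocks until every future handed out by earlier send() calls is complete.
    public synchronized void flush() {
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        pending.clear();
    }

    public static void main(String[] args) {
        FlushSketch producer = new FlushSketch();
        List<CompletableFuture<Long>> futures = new ArrayList<>();
        for (String m : new String[] {"a", "b", "c"}) {
            futures.add(producer.send(m));
        }
        producer.flush();
        // After flush(), reading each result no longer has to wait.
        for (CompletableFuture<Long> f : futures) {
            System.out.println("done=" + f.isDone() + " offset=" + f.join());
        }
    }
}
```

In this model a caller can send a whole batch without blocking, call flush() once, and then inspect each future for its offset or error.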

Re: New Producer - ONLY sync mode?

2015-02-08 Thread Jay Kreps
Steve, In terms of mimicking the sync behavior, I think that is what .get() does, no? We are always returning the offset and error information. The example I gave didn't make use of it, but you definitely can make use of it if you want to. -Jay On Wed, Feb 4, 2015 at 9:58 AM, Steve Morin

Re: New Producer - ONLY sync mode?

2015-02-04 Thread Joe Stein
Now that 0.8.2.0 is in the wild I look forward to working with more and seeing what folks start to do with this function https://dist.apache.org/repos/dist/release/kafka/0.8.2.0/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html#send(org.apache.kafka.clients.producer.ProducerRecord,

Re: New Producer - ONLY sync mode?

2015-02-03 Thread Jay Kreps
Hey guys, I guess the question is whether it really matters how many underlying network requests occur? It is very hard for an application to depend on this even in the old producer since it depends on the partition placement (a send to two partitions may go to either one machine or two and so

Re: New Producer - ONLY sync mode?

2015-02-02 Thread Otis Gospodnetic
Hi, Thanks for the info. Here's the use case. We have something upstream sending data, say a log shipper called X. It sends it to some remote component Y. Y is the Kafka Producer and it puts data into Kafka. But Y needs to send a reply to X and tell it whether it successfully put all its

Re: New Producer - ONLY sync mode?

2015-02-02 Thread Gwen Shapira
If I understood the code and Jay correctly - if you wait for the future it will be a similar delay to that of the old sync producer. Put another way, if you test it out and see longer delays than the sync producer had, we need to find out why and fix it. Gwen On Mon, Feb 2, 2015 at 1:27 PM,

Re: New Producer - ONLY sync mode?

2015-02-02 Thread Gwen Shapira
Can Y have a callback that will handle the notification to X? In this case, perhaps Y can be async and X can buffer the data until the callback triggers and says all good (or resend if the callback indicates an error). On Mon, Feb 2, 2015 at 12:56 PM, Otis Gospodnetic otis.gospodne...@gmail.com

Re: New Producer - ONLY sync mode?

2015-02-02 Thread Jay Kreps
Yeah, as Gwen says, there is no sync/async mode anymore. There is a new configuration which does a lot of what async did in terms of allowing batching: batch.size - This is the target amount of data per partition the producer will attempt to batch together. linger.ms - This is the time the producer
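
The two batching settings named above are ordinary producer configuration keys. A minimal sketch of setting them, assuming a java.util.Properties object that would later be passed to the producer: the class name, the broker address, and the numeric values are illustrative assumptions, not recommendations from the thread.

```java
import java.util.Properties;

public class BatchingConfigSketch {
    // Builds producer settings that allow batching; all values are illustrative.
    public static Properties batchingProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker address
        props.put("batch.size", "16384"); // target number of bytes to batch per partition
        props.put("linger.ms", "5");      // how long to wait for more records before sending
        return props;
    }

    public static void main(String[] args) {
        Properties props = batchingProps();
        System.out.println("batch.size=" + props.getProperty("batch.size"));
        System.out.println("linger.ms=" + props.getProperty("linger.ms"));
    }
}
```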

Re: New Producer - ONLY sync mode?

2015-02-02 Thread Pradeep Gollakota
This is a great question Otis. Like Gwen said, you can accomplish Sync mode by setting the batch size to 1. But this does highlight a shortcoming of the new producer API. I really like the design of the new API and it has really great properties and I'm enjoying working with it. However, once API

Re: New Producer - ONLY sync mode?

2015-02-02 Thread Gwen Shapira
I've been thinking about that too, since both Flume and Sqoop rely on the send(List) call of the old API. I'd like to see this API come back, but I'm debating how we'd handle errors. IIRC, the old API would fail an entire batch on a single error, which can lead to duplicates. Having N callbacks lets

Re: New Producer - ONLY sync mode?

2015-02-02 Thread Pradeep Gollakota
I looked at the newly added batch API to Kinesis for inspiration. The response on the batch put is a list of message-ids and their status (offset if success else a failure code). Ideally, I think the server should fail the entire batch or succeed the entire batch (i.e. no duplicates), but this is

Re: New Producer - ONLY sync mode?

2015-02-02 Thread Otis Gospodnetic
Hi, Nope, unfortunately it can't do that. X is a remote app, doesn't listen to anything external, calls Y via HTTPS. So X has to decide what to do with its data based on Y's synchronous response. It has to block until Y responds. And it wouldn't be pretty, I think, because nobody wants to run

New Producer - ONLY sync mode?

2015-02-02 Thread Otis Gospodnetic
Hi, Is the plan for New Producer to have ONLY async mode? I'm asking because of this info from the Wiki: - The producer will always attempt to batch data and will always immediately return a SendResponse which acts as a Future to allow the client to await the completion of the

Re: New Producer - ONLY sync mode?

2015-02-02 Thread Gwen Shapira
If you want to emulate the old sync producer behavior, you need to set the batch size to 1 (in producer config) and wait on the future you get from send (i.e. future.get()). I can't think of good reasons to do so, though. Gwen On Mon, Feb 2, 2015 at 11:08 AM, Otis Gospodnetic
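
Waiting on the future is what turns an asynchronous send into a synchronous one. With the real client this is producer.send(record).get(), where send returns a Future of RecordMetadata. The sketch below models only that pattern with plain Java so it runs without a broker: SyncSendSketch, its send stand-in, and the made-up offsets are hypothetical and are not the Kafka API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

public class SyncSendSketch {
    private static final AtomicLong NEXT_OFFSET = new AtomicLong(0);

    // Hypothetical stand-in for an asynchronous send: the future is completed
    // on another thread with a made-up offset.
    static Future<Long> send(String message) {
        return CompletableFuture.supplyAsync(NEXT_OFFSET::getAndIncrement);
    }

    // Sync behavior: block on the future, then return the offset or surface the error.
    public static long sendSync(String message) {
        try {
            return send(message).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("interrupted while waiting for send", e);
        } catch (ExecutionException e) {
            throw new RuntimeException("send failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        long first = sendSync("hello");
        long second = sendSync("world");
        System.out.println("offsets: " + first + ", " + second);
    }
}
```

Because each call blocks until its own future completes, the caller learns the outcome of one message before sending the next, which is the trade-off discussed in this thread.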