Re: Proposed Changes To New Producer Public API

2014-02-03 Thread Jay Kreps
Also, I thought a bit more about the org.apache thing. This is clearly an aesthetic thing, but I do feel aesthetics are important. The arguments for the kafka.* package is basically that putting your domain name in the namespace is a silly cargo cult thing that Java people do. I have never seen a

Re: Proposed Changes To New Producer Public API

2014-02-03 Thread Jun Rao
Fine with most of the changes. Just a few questions/comments. 1. Would it be better to change Callback to ProducerCallback to distinguish it from controller callback and potential future other types of callbacks (e.g. consumer)? 2. If no key is present AND no partition id is present, partitions

Re: Proposed Changes To New Producer Public API

2014-02-03 Thread Neha Narkhede
(1) if the replication factor is 1 and there is a broker failure, we can still route the message, (2) if a bunch of producers are started at the same time, this prevents them from picking up the same partition in a synchronized way. You raised a good point, Jun. Regarding #1, shouldn't round

Re: Proposed Changes To New Producer Public API

2014-02-03 Thread Jay Kreps
1. Yes, I think the name could be improved. However that name doesn't really express what it does. What about RecordSentCallback? 2. As Neha said the nodes are initialized to a random position. Round robin is preferable to random (lower variance, cheaper to compute, etc). Your point about skipping

Re: Proposed Changes To New Producer Public API

2014-02-03 Thread Joel Koshy
For (3) we could also do the following: - On any retryable producer error, force a metadata refresh (in handleProducerResponse). - In handleMetadataResponse, the producer can (internally) close out connections that are no longer valid. (i.e., connections to {old set of leader brokers} - {new

Re: Proposed Changes To New Producer Public API

2014-02-02 Thread Jay Kreps
Hey Joel, I actually went with FutureRecordMetadata instead of FutureRecordPosition but yes, as you say that object represents the topic, partition, and offset and would be the place we would put any future info we need to return to the user. The problem with implementing cancel is that the

Re: Proposed Changes To New Producer Public API

2014-02-02 Thread Jay Kreps
Hey Neha, Basically partitionsForTopic will invoke Metadata.fetch(topic).partitionsFor. So it will block on the first request for a given topic while metadata is loaded. Each subsequent request will return whatever metadata is present, the logic for refreshing metadata will remain exactly as it

Re: Proposed Changes To New Producer Public API

2014-02-02 Thread Neha Narkhede
It *is* expected that the user call this method on every single call and because we no longer control the partitioner interface there will be no way to control this. Make sense. This will ensure that new partitions are detected as defined by topic.metadata.refresh.interval.ms. There are a quite

Re: Proposed Changes To New Producer Public API

2014-02-02 Thread Guozhang Wang
About serialization, I am wondering if most people would try to have a customized partioning mainly due to logic motivations, such that they want some messages to go to the same partition; if that is true, they do not really care about the actually partition id but all they would do is specify the

Re: Proposed Changes To New Producer Public API

2014-02-02 Thread Jay Kreps
Okay I posted a patch against trunk that carries out the refactoring described above: https://issues.apache.org/jira/browse/KAFKA-1227 Updated javadoc is here: http://empathybox.com/kafka-javadoc This touches a fair number of files as I also improved documentation and standardized terminology in

Re: Proposed Changes To New Producer Public API

2014-02-01 Thread Joel Koshy
. In order to allow correctly choosing a partition the Producer interface will include a new method: ListPartitionInfo partitionsForTopic(String topic); PartitionInfo will be changed to include the actual Node objects not just the Node ids. Why are the node id's alone insufficient?

Re: Proposed Changes To New Producer Public API

2014-02-01 Thread Neha Narkhede
1. Change send to use java.util.concurrent.Future in send(): FutureRecordPosition send(ProducerRecord record, Callback callback) The cancel method will always return false and not do anything. Callback will then be changed to interface Callback { public void onCompletion(RecordPosition

Proposed Changes To New Producer Public API

2014-01-31 Thread Jay Kreps
Hey folks, Thanks for all the excellent suggestions on the producer API, I think this really made things better. We'll do a similar thing for the consumer as we get a proposal together. I wanted to summarize everything I heard and the proposed changes I plan to do versus ignore. I'd like to get