[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-04-13 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492628#comment-14492628
 ] 

Guozhang Wang commented on KAFKA-1910:
--

1. Sounds good.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-1910.patch, KAFKA-1910.patch, KAFKA-1910.patch, 
 KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-04-13 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492569#comment-14492569
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Hey Jay,

1. I agree with you, the function name is quite misleading. I am thinking about 
changing them to createXYZRequest (there are two such functions in total, 
namely initiateConsumerMetadataRequest and initiateCoordinatorRequest).

2. Right now we have two handlers for async requests, offset commit and 
heartbeat (join-group does not use handlers but parse the response directly 
since it is a blocking call), and for these two requests the handling logic 
seems quite independent to the request setup environment, which is why I 
separate them as objects. Do you have specific concerns about what tied to 
what happens during the request setup?

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-1910.patch, KAFKA-1910.patch, KAFKA-1910.patch, 
 KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-04-13 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492598#comment-14492598
 ] 

Jay Kreps commented on KAFKA-1910:
--

Hey [~guozhang],
1. Cool. Yeah I think create is actually also a bit bad though because of the 
call to ensureCoordinatorReady which is a big chunk of network I/O work 
potentially (create sounds like a pure function). Maybe 
readyCoordinatorRequest? That is a bit more neutral.
2. Yeah if you feel like they are separate that makes sense. I guess I just 
liked being able to read the request and response together to reason about the 
complete request, but this is a pretty minor gripe. I agree that made the 
methods very large.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-1910.patch, KAFKA-1910.patch, KAFKA-1910.patch, 
 KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-04-12 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14491698#comment-14491698
 ] 

Jay Kreps commented on KAFKA-1910:
--

Hey [~guozhang], on the whole this made things a lot better, but a couple 
things seemed to get worse from my opinion. 

1. Coordinator has a lot of methods named initiateXyzRequest that no longer 
actually initiate the request after this refactoring. One of my pet peeves is 
methods that say they do something they don't do. The request initiation has 
been moved into sendAndReceive. These initiateXyz methods are a bit awkward 
because they aren't really createXyzRequest either as they do a bunch of 
stateful prep. I think this would make more sense if initiateXyzRequest did the 
send and sendAndReceive became something like completeRequest and just 
completed an already initiated request (or whatever).

2. You moved the completion handlers into their own objects. I started that way 
but one thing I found was that in practice since each handler should have one 
method that invokes that request separating these into other objects really 
breaks up the control flow (often the completion handler is doing a bunch of 
things very tightly tied to what happens during the request setup. I actually 
think the prior approach where the handler was defined as an annonymous 
callback in the method that does the setup was clearer.

These are both somewhat matters of individual taste, and like I said almost 
everything got better, but I thought I would pass on the feedback.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-1910.patch, KAFKA-1910.patch, KAFKA-1910.patch, 
 KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-04-07 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483724#comment-14483724
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Created reviewboard https://reviews.apache.org/r/32931/diff/
 against branch origin/trunk

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-1910.patch, KAFKA-1910.patch, KAFKA-1910.patch, 
 KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-04-06 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482084#comment-14482084
 ] 

Guozhang Wang commented on KAFKA-1910:
--

[~jjkoshy]:

1. No the error code change was not necessary, I originally just add it for 
OffsetCommitTest, hence it only checks NoError code for the case where no 
offsets is fetchable.
2. This change has been reverted (by accident while doing rebase) in 
KAFKA-1634, I can submit a follow-up patch to delete the exception class and 
the error code from mapping.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-1910.patch, KAFKA-1910.patch, 
 KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-31 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1433#comment-1433
 ] 

Joel Koshy commented on KAFKA-1910:
---

BTW, just to be clear, what I would like to discuss is:
* Is the error code change necessary?
* If it is, then should we bump up the OffsetFetchRequest version (without 
actually changing the request content)? i.e., since the response is different 
across different broker versions.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-1910.patch, KAFKA-1910.patch, 
 KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-30 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14387652#comment-14387652
 ] 

Joel Koshy commented on KAFKA-1910:
---

[~guozhang] this patch modified some of the behavior of OffsetFetchRequest 
which [~toddpalino] caught while using a python client against the latest 
broker (on trunk)

Previously, we would return NoError with an invalid offset if there was no 
committed offset. Now it returns the NoOffsetsCommittedCode. Was there a 
discussion on this? Either way, we should also update the protocol 
documentation if this change sticks.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-1910.patch, KAFKA-1910.patch, 
 KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-19 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14370281#comment-14370281
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Pushed two fixes to the new consumer to trunk.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-1910.patch, KAFKA-1910.patch, 
 KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-19 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14370061#comment-14370061
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Created reviewboard https://reviews.apache.org/r/32258/diff/
 against branch origin/trunk

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-1910.patch, KAFKA-1910.patch, 
 KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-12 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357145#comment-14357145
 ] 

Jun Rao commented on KAFKA-1910:


[~guozhang], I still see the following transient unit test failure after your 
last commit. It happens infrequently. However, I was able to hit it once after 
running it 10 times.

kafka.api.ConsumerTest  testConsumptionWithBrokerFailures FAILED
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This 
server does not host this topic-partition.

commit 1eb5f53aa4f5c5c0f67ce62d86e8d9b4e5bbc500
Author: Guozhang Wang wangg...@gmail.com
Date:   Tue Mar 10 17:31:17 2015 -0700

KAFKA-1910; missed follow-up changes

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-12 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357348#comment-14357348
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Found the root cause: in handling ListOffsetReponse in the new consumer, we 
assume the possible error are NOT_LEADER_FOR_PARTITION and 
LEADER_NOT_AVAILABLE, and if there are other error codes we just use

{code}
Errors.forCode(errorCode).maybeThrow()
{code}

Which by the way does not print any stack trace in unit tests, hence very 
difficult to debug.

The proposed patch fixed this issue, in addition, I think this is another 
example that calls out for KAFKA-1985.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-12 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359506#comment-14359506
 ] 

Jun Rao commented on KAFKA-1910:


In BounceBrokerScheduler, is there a particular reason that we need to add a 
sleep before restarting the dead broker?

  killRandomBroker()
  Thread.sleep(500)
  restartDeadBrokers()


 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-11 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357529#comment-14357529
 ] 

Guozhang Wang commented on KAFKA-1910:
--

[~junrao] Could you take a look at the patch so that I can check-in the fix if 
it LGTY?

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-11 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357366#comment-14357366
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Got some problems with RB, uploading the patch here for a quick review:

{code}
diff --git 
a/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
 
b/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
index e972efb..436f9b2 100644
--- 
a/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
+++ 
b/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
@@ -129,7 +129,7 @@ public final class Coordinator {
 
 // process the response
 JoinGroupResponse response = new 
JoinGroupResponse(resp.responseBody());
-// TODO: needs to handle disconnects and errors
+// TODO: needs to handle disconnects and errors, should not just throw 
exceptions
 Errors.forCode(response.errorCode()).maybeThrow();
 this.consumerId = response.consumerId();
 
diff --git 
a/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
 
b/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
index 27c78b8..8b71fba 100644
--- 
a/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
+++ 
b/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
@@ -231,11 +231,12 @@ public class FetcherK, V {
 log.debug(Fetched offset {} for partition {}, 
offset, topicPartition);
 return offset;
 } else if (errorCode == 
Errors.NOT_LEADER_FOR_PARTITION.code()
-|| errorCode == Errors.LEADER_NOT_AVAILABLE.code()) {
+|| errorCode == 
Errors.UNKNOWN_TOPIC_OR_PARTITION.code()) {
 log.warn(Attempt to fetch offsets for partition {} 
failed due to obsolete leadership information, retrying.,
 topicPartition);
 awaitMetadataUpdate();
 } else {
+// TODO: we should not just throw exceptions but 
should handle and log it.
 Errors.forCode(errorCode).maybeThrow();
 }
 }
diff --git 
a/clients/src/main/java/org/apache/kafka/common/requests/ListOffsetResponse.java
 
b/clients/src/main/java/org/apache/kafka/common/requests/ListOffsetResponse.java
index af704f3..f706086 100644
--- 
a/clients/src/main/java/org/apache/kafka/common/requests/ListOffsetResponse.java
+++ 
b/clients/src/main/java/org/apache/kafka/common/requests/ListOffsetResponse.java
@@ -45,7 +45,9 @@ public class ListOffsetResponse extends 
AbstractRequestResponse {
 /**
  * Possible error code:
  *
- * TODO
+ *  UNKNOWN_TOPIC_OR_PARTITION (3)
+ *  NOT_LEADER_FOR_PARTITION (6)
+ *  UNKNOWN (-1)
  */
 
 private static final String OFFSETS_KEY_NAME = offsets;
diff --git a/core/src/test/scala/integration/kafka/api/ConsumerTest.scala 
b/core/src/test/scala/integration/kafka/api/ConsumerTest.scala
index fed37e3..8eae1ab 100644
--- a/core/src/test/scala/integration/kafka/api/ConsumerTest.scala
+++ b/core/src/test/scala/integration/kafka/api/ConsumerTest.scala
@@ -260,8 +260,10 @@ class ConsumerTest extends IntegrationTestHarness with 
Logging {
 var iter: Int = 0
 
 override def doWork(): Unit = {
-  killRandomBroker()
+  info(Killed broker %d.format(killRandomBroker()))
+  Thread.sleep(500)
   restartDeadBrokers()
+  info(Restarted all brokers)
 
   iter += 1
   if (iter == numIters)
{code}

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-11 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357725#comment-14357725
 ] 

Jun Rao commented on KAFKA-1910:


Are the changes in ConsumerTest needed? The extra logging and sleep seem to be 
just for debugging. Other than that. +1. Ran the tests locally 20 times and 
they all pass.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-11 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357794#comment-14357794
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Thanks Jun, incorporated the comments and commit to trunk.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355640#comment-14355640
 ] 

Jun Rao commented on KAFKA-1910:


The most recent jenkins run has the following two unit test failures. Are they 
due to this commit?

kafka.api.ProducerFailureHandlingTest  testBrokerFailure FAILED
java.lang.AssertionError: Should have fetched 128000 unique messages 
expected:128000 but was:127915
at org.junit.Assert.fail(Assert.java:92)
at org.junit.Assert.failNotEquals(Assert.java:689)
at org.junit.Assert.assertEquals(Assert.java:127)
at org.junit.Assert.assertEquals(Assert.java:514)
at 
kafka.api.ProducerFailureHandlingTest.testBrokerFailure(ProducerFailureHandlingTest.scala:301)

kafka.api.ConsumerTest  testConsumptionWithBrokerFailures FAILED
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This 
server does not host this topic-partition.


 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-10 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355739#comment-14355739
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Jun, I will take a look at it now.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-10 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355916#comment-14355916
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Just check-in the fix, which resolves the root cause of these failures.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-10 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355969#comment-14355969
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Some changes are reverted by accident in the last-pass checking, re-committing 
them.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1910.patch, KAFKA-1910_2015-03-05_14:55:33.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-03-02 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14344174#comment-14344174
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Created reviewboard https://reviews.apache.org/r/31650/diff/
 against branch origin/trunk

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1910.patch


 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-02-27 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14341201#comment-14341201
 ] 

Guozhang Wang commented on KAFKA-1910:
--

Jay,

I found my fix to KAFKA-1948 was not complete while trying to upload the RB. 
Will try to update the newest one asap.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang

 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-02-25 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336985#comment-14336985
 ] 

Guozhang Wang commented on KAFKA-1910:
--

The uploaded patch contains multiple fixes to the related JIRAs as well as 
refactoring the new consumer itself. I will summarize them here instead of in 
the RB:

1. Fix ConsumerTest.testXXXwithBrokerFailure: in RestartDeadBroker we need to 
call startup() on the old brokers instead of creating new ones as the last 
approach will case the metadata to be mess up and cause the test to hang 
(KAFKA-1948). Also make sure the test topic is created with correct 
replication factor to avoid hanging when the only replica broker was shutdown.

2. Fix ConsumerTest's __consumer_offsets topic: when we call partitionFor() the 
__consumer_offsets topic may be created with replication as 
min(offsetTopicRaplicationFactor, aliveBrokers.size), see KAFKA-1864 for 
details (KAFKA-1975). 

3. Add the IllegalGeneration logic in the coordinator as it is important for 
consumers rebalancing after rediscovering the coordinator, in the current stub 
it always return OK and hence consumers migrating to the new coordinator will 
not trigger rebalance (KAFKA-1964).

4. Create the Coodinator and the FetchManager modules as KafkaConsumer 
internals. Coordinator is responsible for assign partitions (join groups), 
commit offsets and fetch offsets from coordinator, and FetchManager is 
responsible for handling fetch request / responses.

4.1 After the refactoring it is easier to detect and fix a bug where response 
callbacks being triggered multiple times, causing the coordinator NPE 
(KAFKA-1969).

4.2 Avoid always trying to fetch offsets from coordinator whenever the consumer 
decides to update fetch positions, introduce a few new variables / APIs in 
SubscriptionState accordingly.

4.3 Move serializer / de-serializer configs / constructors to AbstractConfig.

4.4 Add missing error handling in commit offset / heartbeat responses. In 
general I think we should make notes about possible error codes in each of the 
response type to help coding error handling logic, has filed KAFKA-1985 for 
that.

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang

 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-02-25 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337621#comment-14337621
 ] 

Jay Kreps commented on KAFKA-1910:
--

Where is the RB?

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang

 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1910) Refactor KafkaConsumer

2015-02-01 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300251#comment-14300251
 ] 

Jay Kreps commented on KAFKA-1910:
--

I guess the real question is how to slice it up. Big classes aren't great but 
bad abstractions are worse.

One thought I had had was that maybe we could somehow factor out a kind of 
ConsumerCoordinator class that would manage the interaction with the 
coordinator--heartbeat, groups, etc. But the hard part is figuring out the 
boundaries of that abstraction...

 Refactor KafkaConsumer
 --

 Key: KAFKA-1910
 URL: https://issues.apache.org/jira/browse/KAFKA-1910
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Guozhang Wang

 KafkaConsumer now contains all the logic on the consumer side, making it a 
 very huge class file, better re-factoring it to have multiple layers on top 
 of KafkaClient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)