Re: Producer becomes slow over time

2015-09-23 Thread Prabhjot Bharaj
Hi,

I would like to dig deep into this issue. I've changed log4j.properties to
log at the ALL level everywhere, but I am getting lost in the logs.

Any pointers would be welcome

Please let me know if you need any further information regarding this.

Thanks,
Prabhjot

On Wed, Sep 23, 2015 at 6:46 PM, Prabhjot Bharaj 
wrote:

> Hello Folks,
>
> I've noticed that 2 producer machines that I had configured earlier have
> become very slow over time. They are now giving 17-19 MB/s.
>
> But a producer that I set up today is giving 70 MB/s as the write
> throughput.
>
> If I compare the contents of the bin, libs and config directories, nothing
> is different in the files on any of the producer machines.
>
> The producer is running on the Kafka machines themselves.
>
> Requesting your expertise.
>
> Regards,
> Prabhjot
>
>
>


-- 
-
"There are only 10 types of people in the world: Those who understand
binary, and those who don't"


Re: [VOTE] KIP-31 - Move to relative offsets in compressed message sets.

2015-09-23 Thread Guozhang Wang
+1

On Wed, Sep 23, 2015 at 9:32 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> +1
>
> On Wed, Sep 23, 2015 at 8:03 PM, Neha Narkhede  wrote:
>
> > +1
> >
> > On Wed, Sep 23, 2015 at 6:21 PM, Todd Palino  wrote:
> >
> > > +1000
> > >
> > > !
> > >
> > > -Todd
> > >
> > > On Wednesday, September 23, 2015, Jiangjie Qin wrote:
> > >
> > > > Hi,
> > > >
> > > > Thanks a lot for the reviews and feedback on KIP-31. It looks like all
> > > > the concerns about the KIP have been addressed. I would like to start
> > > > the voting process.
> > > >
> > > > The short summary for the KIP:
> > > > We are going to use the relative offset in the message format to
> avoid
> > > > server side recompression.
> > > >
> > > > In case you haven't got a chance to check, here is the KIP link.
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-31+-+Move+to+relative+offsets+in+compressed+message+sets
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > >
> >
> >
> >
> > --
> > Thanks,
> > Neha
> >
>



-- 
-- Guozhang


Re: [VOTE] KIP-31 - Move to relative offsets in compressed message sets.

2015-09-23 Thread Jay Kreps
I'm +1 though that is dependent on having a graceful migration path for
people.

-Jay

On Wed, Sep 23, 2015 at 5:53 PM, Jiangjie Qin 
wrote:

> Hi,
>
> Thanks a lot for the reviews and feedback on KIP-31. It looks like all the
> concerns about the KIP have been addressed. I would like to start the
> voting process.
>
> The short summary for the KIP:
> We are going to use the relative offset in the message format to avoid
> server side recompression.
>
> In case you haven't got a chance to check, here is the KIP link.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-31+-+Move+to+relative+offsets+in+compressed+message+sets
>
> Thanks,
>
> Jiangjie (Becket) Qin
>


Re: [VOTE] KIP-31 - Move to relative offsets in compressed message sets.

2015-09-23 Thread Aditya Auradkar
+1

On Wed, Sep 23, 2015 at 8:03 PM, Neha Narkhede  wrote:

> +1
>
> On Wed, Sep 23, 2015 at 6:21 PM, Todd Palino  wrote:
>
> > +1000
> >
> > !
> >
> > -Todd
> >
> > On Wednesday, September 23, 2015, Jiangjie Qin wrote:
> >
> > > Hi,
> > >
> > > Thanks a lot for the reviews and feedback on KIP-31. It looks like all
> > > the concerns about the KIP have been addressed. I would like to start
> > > the voting process.
> > >
> > > The short summary for the KIP:
> > > We are going to use the relative offset in the message format to avoid
> > > server side recompression.
> > >
> > > In case you haven't got a chance to check, here is the KIP link.
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-31+-+Move+to+relative+offsets+in+compressed+message+sets
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> >
>
>
>
> --
> Thanks,
> Neha
>


[jira] [Commented] (KAFKA-2569) Kafka should write its metrics to a Kafka topic

2015-09-23 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905760#comment-14905760
 ] 

Otis Gospodnetic commented on KAFKA-2569:
-

bq. How do people typically monitor things with JMX metrics (like Kafka and 
Zookeeper)?

They use tools like SPM for Kafka / ZK / etc., or home-grown stuff.  See 
http://sematext.com/spm/integrations/kafka-monitoring.html for example.


> Kafka should write its metrics to a Kafka topic
> ---
>
> Key: KAFKA-2569
> URL: https://issues.apache.org/jira/browse/KAFKA-2569
> Project: Kafka
>  Issue Type: New Feature
>Reporter: James Cheng
>
> Kafka is often used to hold and transport monitoring data.
> In order to monitor Kafka itself, Kafka currently exposes many metrics via 
> JMX, which requires using a tool to pull the JMX metrics and then write them 
> to the monitoring system.
> It would be convenient if Kafka could simply send its metrics to a Kafka 
> topic. This would make most sense if the Kafka topic was in a different Kafka 
> cluster, but could still be useful even if it was sent to a topic in the same 
> Kafka cluster.
> Of course, if sent to the same cluster, it would not be accessible if the 
> cluster itself was down.
> This would allow monitoring of Kafka itself without requiring people to set 
> up their own JMX-to-monitoring-system pipelines.
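
For illustration only, a minimal sketch of the kind of bridge being discussed, 
assuming a hypothetical "kafka-metrics" topic on a separate monitoring cluster 
and a process that can reach the broker's MBeans (here, the local platform 
MBean server; a remote JMX connector would be needed otherwise). Attribute 
names vary per bean, so this only tries the common "Value" attribute:

{code}
import java.lang.management.ManagementFactory;
import java.util.Properties;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class JmxToTopicBridge {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // assumption: the metrics topic lives on a separate monitoring cluster
        props.put("bootstrap.servers", "monitoring-cluster:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (ObjectName name : server.queryNames(new ObjectName("kafka.server:*"), null)) {
                try {
                    // many (not all) Kafka gauges expose a "Value" attribute
                    Object value = server.getAttribute(name, "Value");
                    producer.send(new ProducerRecord<>("kafka-metrics", name.toString(), String.valueOf(value)));
                } catch (Exception ignored) {
                    // bean without a readable "Value" attribute; skip it
                }
            }
            producer.flush();
        }
    }
}
{code}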



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-31 - Move to relative offsets in compressed message sets.

2015-09-23 Thread Neha Narkhede
+1

On Wed, Sep 23, 2015 at 6:21 PM, Todd Palino  wrote:

> +1000
>
> !
>
> -Todd
>
> On Wednesday, September 23, 2015, Jiangjie Qin 
> wrote:
>
> > Hi,
> >
> > Thanks a lot for the reviews and feedback on KIP-31. It looks like all the
> > concerns about the KIP have been addressed. I would like to start the
> > voting process.
> >
> > The short summary for the KIP:
> > We are going to use the relative offset in the message format to avoid
> > server side recompression.
> >
> > In case you haven't got a chance to check, here is the KIP link.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-31+-+Move+to+relative+offsets+in+compressed+message+sets
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
>



-- 
Thanks,
Neha


Re: [VOTE] KIP-31 - Move to relative offsets in compressed message sets.

2015-09-23 Thread Todd Palino
+1000

!

-Todd

On Wednesday, September 23, 2015, Jiangjie Qin 
wrote:

> Hi,
>
> Thanks a lot for the reviews and feedback on KIP-31. It looks like all the
> concerns about the KIP have been addressed. I would like to start the
> voting process.
>
> The short summary for the KIP:
> We are going to use the relative offset in the message format to avoid
> server side recompression.
>
> In case you haven't got a chance to check, here is the KIP link.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-31+-+Move+to+relative+offsets+in+compressed+message+sets
>
> Thanks,
>
> Jiangjie (Becket) Qin
>


[VOTE] KIP-31 - Move to relative offsets in compressed message sets.

2015-09-23 Thread Jiangjie Qin
Hi,

Thanks a lot for the reviews and feedback on KIP-31. It looks like all the
concerns about the KIP have been addressed. I would like to start the voting
process.

The short summary for the KIP:
We are going to use the relative offset in the message format to avoid
server side recompression.

In case you haven't got a chance to check, here is the KIP link.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-31+-+Move+to+relative+offsets+in+compressed+message+sets

Thanks,

Jiangjie (Becket) Qin
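
As a quick illustration of the idea (a sketch of one possible layout, not the
KIP's exact wire format): the wrapper of a compressed message set keeps the
absolute offset of its last inner message, the inner messages keep small
relative offsets, and a consumer recovers absolute offsets with simple
arithmetic, so the broker never has to decompress and recompress just to
assign offsets.

public class RelativeOffsetExample {
    public static void main(String[] args) {
        long wrapperOffset = 1042L;          // absolute offset of the last inner message
        long[] relativeOffsets = {0, 1, 2};  // offsets stored inside the compressed set

        long base = wrapperOffset - (relativeOffsets.length - 1);
        for (long rel : relativeOffsets) {
            // prints 1040, 1041, 1042
            System.out.println("inner message absolute offset = " + (base + rel));
        }
    }
}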


[jira] [Updated] (KAFKA-2390) OffsetOutOfRangeException should contain the Offset and Partition info.

2015-09-23 Thread Dong Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-2390:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> OffsetOutOfRangeException should contain the Offset and Partition info.
> ---
>
> Key: KAFKA-2390
> URL: https://issues.apache.org/jira/browse/KAFKA-2390
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jiangjie Qin
>Assignee: Dong Lin
>
> Currently, when seek() finishes, the offset that was seeked to is not 
> verified, and an OffsetOutOfRangeException might be thrown later. To let 
> users take action when the OffsetOutOfRangeException is thrown, we need to 
> provide more information in the exception.
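
A hypothetical shape for such an enriched exception (illustration only, not 
the actual patch): carry the partition and the rejected offset so callers can 
react without re-deriving that state themselves.

{code}
public class OffsetOutOfRangeException extends RuntimeException {
    private final String topic;
    private final int partition;
    private final long offset;

    public OffsetOutOfRangeException(String topic, int partition, long offset) {
        super("Offset " + offset + " is out of range for partition " + topic + "-" + partition);
        this.topic = topic;
        this.partition = partition;
        this.offset = offset;
    }

    public String topic() { return topic; }
    public int partition() { return partition; }
    public long offset() { return offset; }
}
{code}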



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: Kafka-trunk #631

2015-09-23 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-2390; OffsetOutOfRangeException should contain the Offset and 
Partition info.

--
[...truncated 1641 lines...]
kafka.admin.DeleteTopicTest > testResumeDeleteTopicOnControllerFailover PASSED

kafka.api.ConsumerTest > testSeek PASSED

kafka.api.ProducerSendTest > testCloseWithZeroTimeoutFromCallerThread PASSED

kafka.consumer.ZookeeperConsumerConnectorTest > testCompression PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testHeartbeatWrongCoordinator PASSED

kafka.admin.DeleteConsumerGroupTest > 
testGroupTopicWideDeleteInZKForGroupConsumingMultipleTopics PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testHeartbeatIllegalGeneration PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testGenerationIdIncrementsOnRebalance PASSED

kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > testHeartbeatUnknownGroup 
PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testHeartbeatDuringRebalanceCausesRebalanceInProgress PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testJoinGroupSessionTimeoutTooLarge PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testJoinGroupSessionTimeoutTooSmall PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testJoinGroupWrongCoordinator PASSED

kafka.integration.SslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testJoinGroupUnknownConsumerExistingGroup PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testJoinGroupUnknownPartitionAssignmentStrategy PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testJoinGroupInconsistentPartitionAssignmentStrategy PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testJoinGroupUnknownConsumerNewGroup PASSED

kafka.api.QuotasTest > testProducerConsumerOverrideUnthrottled PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > testValidJoinGroup PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > 
testHeartbeatUnknownConsumerExistingGroup PASSED

kafka.admin.AdminTest > testPartitionReassignmentWithLeaderInNewReplicas PASSED

kafka.admin.DeleteTopicTest > testResumeDeleteTopicWithRecoveredFollower PASSED

kafka.coordinator.ConsumerCoordinatorResponseTest > testValidHeartbeat PASSED

kafka.api.ConsumerTest > testUnsubscribeTopic PASSED

kafka.admin.AddPartitionsTest > testManualAssignmentOfReplicas PASSED

kafka.integration.MinIsrConfigTest > testDefaultKafkaConfig PASSED

kafka.admin.DeleteConsumerGroupTest > testGroupWideDeleteInZK PASSED

kafka.common.ZkNodeChangeNotificationListenerTest > testProcessNotification 
PASSED

kafka.admin.DeleteTopicTest > testDeleteTopicAlreadyMarkedAsDeleted PASSED

kafka.admin.AdminTest > testShutdownBroker PASSED

kafka.admin.AdminTest > testTopicCreationWithCollision PASSED

kafka.api.ProducerSendTest > testCloseWithZeroTimeoutFromSenderThread PASSED

kafka.metrics.KafkaTimerTest > testKafkaTimer PASSED

kafka.admin.AdminTest > testTopicCreationInZK PASSED

kafka.api.ConsumerTest > testListTopics PASSED

kafka.controller.ControllerFailoverTest > testMetadataUpdate PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIp PASSED

kafka.network.SocketServerTest > simpleRequest PASSED

kafka.network.SocketServerTest > testSessionPrincipal PASSED

kafka.network.SocketServerTest > testSocketsCloseOnShutdown PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIPOverrides PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[0] PASSED

kafka.network.SocketServerTest > testSSLSocketServer PASSED

kafka.network.SocketServerTest > tooBigRequestIsRejected PASSED

kafka.admin.DeleteTopicTest > testPartitionReassignmentDuringDeleteTopic PASSED

kafka.integration.UncleanLeaderElectionTest > testUncleanLeaderElectionEnabled 
PASSED

kafka.api.QuotasTest > testThrottledProducerConsumer PASSED

kafka.admin.AddPartitionsTest > testReplicaPlacement PASSED

kafka.utils.ReplicationUtilsTest > testUpdateLeaderAndIsr PASSED

kafka.api.ConsumerTest > testExpandingTopicSubscriptions PASSED

kafka.utils.ReplicationUtilsTest > testGetLeaderIsrAndEpochForPartition PASSED

kafka.integration.FetcherTest > testFetcher PASSED

kafka.zk.ZKEphemeralTest > testEphemeralNodeCleanup PASSED

kafka.admin.DeleteTopicTest > testDeleteNonExistingTopic PASSED

kafka.api.ConsumerTest > testPatternUnsubscription PASSED

kafka.admin.DeleteTopicTest > testRecreateTopicAfterDeletion PASSED

kafka.api.ConsumerTest > testGroupConsumption PASSED

kafka.admin.DeleteTopicTest > testAddPartitionDuringDeleteTopic PASSED

kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures PASSED

kafka.utils.ByteBoundedBlockingQueueTest > testByteBoundedBlockingQueue PASSED

kafka.api.ConsumerTest > testPartitionsFor PASSED

kafka.admin.DeleteTopicTes

[jira] [Commented] (KAFKA-2390) OffsetOutOfRangeException should contain the Offset and Partition info.

2015-09-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905540#comment-14905540
 ] 

ASF GitHub Bot commented on KAFKA-2390:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/118


> OffsetOutOfRangeException should contain the Offset and Partition info.
> ---
>
> Key: KAFKA-2390
> URL: https://issues.apache.org/jira/browse/KAFKA-2390
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jiangjie Qin
>Assignee: Dong Lin
>
> Currently, when seek() finishes, the offset that was seeked to is not 
> verified, and an OffsetOutOfRangeException might be thrown later. To let 
> users take action when the OffsetOutOfRangeException is thrown, we need to 
> provide more information in the exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2390; OffsetOutOfRangeException should c...

2015-09-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/118


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2078) Getting Selector [WARN] Error in I/O with host java.io.EOFException

2015-09-23 Thread Tom Waterhouse (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905492#comment-14905492
 ] 

Tom Waterhouse commented on KAFKA-2078:
---

I am running a three-node Kafka cluster on Linux with Java 8, with a single 
producer, and I am seeing the issue as well:

2015-09-23 23:09:34.852 [kafka-producer-network-thread | network driver] WARN  
o.a.kafka.common.network.Selector - Error in I/O with /192.168.1.78
java.io.EOFException: null
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) 
~[analytics-spark-assembly-4484568f974d01a150ffd465ce0903649a528b8b-SNAPSHOT.jar:4484568f974d01a150ffd465ce0903649a528b8b-SNAPSHOT]
at org.apache.kafka.common.network.Selector.poll(Selector.java:248) 
~[analytics-spark-assembly-4484568f974d01a150ffd465ce0903649a528b8b-SNAPSHOT.jar:4484568f974d01a150ffd465ce0903649a528b8b-SNAPSHOT]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) 
[analytics-spark-assembly-4484568f974d01a150ffd465ce0903649a528b8b-SNAPSHOT.jar:4484568f974d01a150ffd465ce0903649a528b8b-SNAPSHOT]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) 
[analytics-spark-assembly-4484568f974d01a150ffd465ce0903649a528b8b-SNAPSHOT.jar:4484568f974d01a150ffd465ce0903649a528b8b-SNAPSHOT]
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) 
[analytics-spark-assembly-4484568f974d01a150ffd465ce0903649a528b8b-SNAPSHOT.jar:4484568f974d01a150ffd465ce0903649a528b8b-SNAPSHOT]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]


> Getting Selector [WARN] Error in I/O with host java.io.EOFException
> ---
>
> Key: KAFKA-2078
> URL: https://issues.apache.org/jira/browse/KAFKA-2078
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
> Environment: OS Version: 2.6.39-400.209.1.el5uek and Hardware: 8 x 
> Intel(R) Xeon(R) CPU X5660  @ 2.80GHz/44GB
>Reporter: Aravind
>Assignee: Jun Rao
>
> When trying to produce 1000 (10 MB) messages, I get the error below somewhere 
> between the 997th and 1000th message. There is no pattern, but I am able to 
> reproduce it.
> [PDT] 2015-03-31 13:53:50 Selector [WARN] Error in I/O with "our host" 
> java.io.EOFException at 
> org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62)
>  at org.apache.kafka.common.network.Selector.poll(Selector.java:248) at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) at 
> java.lang.Thread.run(Thread.java:724)
> I sometimes get this error at the 997th or 999th message. There is no 
> pattern, but I am able to reproduce it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2573) Mirror maker system test hangs and eventually fails

2015-09-23 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905391#comment-14905391
 ] 

Ashish K Singh commented on KAFKA-2573:
---

I think the test is failing for a reason it is not intended to test, or is 
it? I am not saying that we should not be catching such issues; my question is 
whether we should fail the mirror-maker test due to an issue with consumer 
shutdown. Maybe we should. I am fine either way, but it is something we should 
agree upon.

If we choose to let the test fail, I can change the PR to just make it fail 
fast instead of waiting for 10 minutes.

> Mirror maker system test hangs and eventually fails
> ---
>
> Key: KAFKA-2573
> URL: https://issues.apache.org/jira/browse/KAFKA-2573
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Due to changes made in KAFKA-2015, handling of {{--consumer.config}} has 
> changed; more details are specified in KAFKA-2467. This leads to the 
> following exception.
> {code}
> Exception in thread "main" java.lang.NoSuchMethodError: 
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
>   at kafka.utils.Pool.keys(Pool.scala:77)
>   at 
> kafka.consumer.FetchRequestAndResponseStatsRegistry$.removeConsumerFetchRequestAndResponseStats(FetchRequestAndResponseStats.scala:69)
>   at 
> kafka.metrics.KafkaMetricsGroup$.removeAllConsumerMetrics(KafkaMetricsGroup.scala:189)
>   at 
> kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:200)
>   at kafka.consumer.OldConsumer.stop(BaseConsumer.scala:75)
>   at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:98)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:57)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:41)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2409) Have KafkaConsumer.committed() return null when there is no committed offset

2015-09-23 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905364#comment-14905364
 ] 

Onur Karaman commented on KAFKA-2409:
-

Yeah, I think KAFKA-2403's change to have OffsetAndMetadata for committed() and 
long for position() makes more sense than this ticket's originally proposed Long 
for committed() and long for position().

> Have KafkaConsumer.committed() return null when there is no committed offset
> 
>
> Key: KAFKA-2409
> URL: https://issues.apache.org/jira/browse/KAFKA-2409
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Assignee: Sreepathi Prasanna
>Priority: Minor
>  Labels: newbie
> Fix For: 0.9.0.0
>
>
> Currently checking whether an offset has been committed requires catching 
> NoOffsetForPartitionException. Since this is likely a fairly common case, it 
> is more convenient for users just to return null.
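
A sketch of the two calling patterns being contrasted (consumer setup omitted; 
the "current" method follows the behaviour described above, the "proposed" 
method the null-returning behaviour this ticket asks for):

{code}
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.NoOffsetForPartitionException;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class CommittedCheckSketch {
    // current: absence of a committed offset is signalled by an exception
    static boolean hasCommittedCurrent(KafkaConsumer<String, String> consumer, TopicPartition tp) {
        try {
            consumer.committed(tp);
            return true;
        } catch (NoOffsetForPartitionException e) {
            return false;
        }
    }

    // proposed: absence is simply a null return value
    static boolean hasCommittedProposed(KafkaConsumer<String, String> consumer, TopicPartition tp) {
        OffsetAndMetadata committed = consumer.committed(tp);
        return committed != null;
    }
}
{code}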



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2409) Have KafkaConsumer.committed() return null when there is no committed offset

2015-09-23 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905304#comment-14905304
 ] 

Jason Gustafson commented on KAFKA-2409:


[~onurkaraman] With KAFKA-2403 checked in, do you have any objections to this 
change?

[~sree2k] With a lot of the noise around the commit API more or less resolved, 
this should be fairly straightforward. If you don't have time to work on it, 
feel free to assign it to me.

> Have KafkaConsumer.committed() return null when there is no committed offset
> 
>
> Key: KAFKA-2409
> URL: https://issues.apache.org/jira/browse/KAFKA-2409
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Assignee: Sreepathi Prasanna
>Priority: Minor
>  Labels: newbie
> Fix For: 0.9.0.0
>
>
> Currently checking whether an offset has been committed requires catching 
> NoOffsetForPartitionException. Since this is likely a fairly common case, it 
> is more convenient for users just to return null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2573) Mirror maker system test hangs and eventually fails

2015-09-23 Thread Geoff Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905284#comment-14905284
 ] 

Geoff Anderson commented on KAFKA-2573:
---

After speaking a bit with Ashish, I think I understand the intent behind the 
changes better now: it seems these changes are intended to get the test to pass 
despite the remote console consumer process hanging during shutdown (correct me 
if I misunderstood, [~singhashish]).

My argument is that failure of the test in this case is exactly what we want: 
how else would we have known that an issue with conflicting Java versions had 
caused the process to blow up during shutdown?

That said, it is a very easy mistake to compile locally with Java 8 (I'm 100% 
sure this will cause trouble for others as well), so we should at least 
document this in the system test README.


> Mirror maker system test hangs and eventually fails
> ---
>
> Key: KAFKA-2573
> URL: https://issues.apache.org/jira/browse/KAFKA-2573
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Due to changes made in KAFKA-2015, handling of {{--consumer.config}} has 
> changed; more details are specified in KAFKA-2467. This leads to the 
> following exception.
> {code}
> Exception in thread "main" java.lang.NoSuchMethodError: 
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
>   at kafka.utils.Pool.keys(Pool.scala:77)
>   at 
> kafka.consumer.FetchRequestAndResponseStatsRegistry$.removeConsumerFetchRequestAndResponseStats(FetchRequestAndResponseStats.scala:69)
>   at 
> kafka.metrics.KafkaMetricsGroup$.removeAllConsumerMetrics(KafkaMetricsGroup.scala:189)
>   at 
> kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:200)
>   at kafka.consumer.OldConsumer.stop(BaseConsumer.scala:75)
>   at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:98)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:57)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:41)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2403) Expose offset commit metadata in new consumer

2015-09-23 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-2403:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Expose offset commit metadata in new consumer
> -
>
> Key: KAFKA-2403
> URL: https://issues.apache.org/jira/browse/KAFKA-2403
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Minor
> Fix For: 0.9.0.0
>
>
> The offset commit protocol supports the ability to add user metadata to 
> commits, but this is not yet exposed in the new consumer API. The 
> straightforward way to add it is to create a container for the offset and 
> metadata and adjust the KafkaConsumer API accordingly.
> {code}
> OffsetMetadata {
>   long offset;
>   String metadata;
> }
> KafkaConsumer {
>   commit(Map)
>   OffsetMetadata committed(TopicPartition)
> }
> {code}
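
A usage sketch along those lines (names per the Java client's OffsetAndMetadata 
container; the topic name and offset are made up, and consumer construction is 
omitted):

{code}
import java.util.Collections;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class CommitMetadataSketch {
    static void commitWithMetadata(KafkaConsumer<String, String> consumer) {
        TopicPartition tp = new TopicPartition("my-topic", 0);
        // attach free-form metadata to the committed offset
        consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(42L, "checkpoint-info")));

        OffsetAndMetadata committed = consumer.committed(tp);
        System.out.println(committed.offset() + " / " + committed.metadata());
    }
}
{code}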



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2403) Expose offset commit metadata in new consumer

2015-09-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905259#comment-14905259
 ] 

ASF GitHub Bot commented on KAFKA-2403:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/119


> Expose offset commit metadata in new consumer
> -
>
> Key: KAFKA-2403
> URL: https://issues.apache.org/jira/browse/KAFKA-2403
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Minor
> Fix For: 0.9.0.0
>
>
> The offset commit protocol supports the ability to add user metadata to 
> commits, but this is not yet exposed in the new consumer API. The 
> straightforward way to add it is to create a container for the offset and 
> metadata and adjust the KafkaConsumer API accordingly.
> {code}
> OffsetMetadata {
>   long offset;
>   String metadata;
> }
> KafkaConsumer {
>   commit(Map)
>   OffsetMetadata committed(TopicPartition)
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2403; Add support for commit metadata in...

2015-09-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/119


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2573) Mirror maker system test hangs and eventually fails

2015-09-23 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905234#comment-14905234
 ] 

Ashish K Singh commented on KAFKA-2573:
---

[~granders] +1 on your suggested change, I will incorporate that. By "hang" I 
meant it will be stuck before timing out and failing. What are your thoughts on 
the PR to address the Java incompatibility issue?

> Mirror maker system test hangs and eventually fails
> ---
>
> Key: KAFKA-2573
> URL: https://issues.apache.org/jira/browse/KAFKA-2573
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Due to changes made in KAFKA-2015, handling of {{--consumer.config}} has 
> changed; more details are specified in KAFKA-2467. This leads to the 
> following exception.
> {code}
> Exception in thread "main" java.lang.NoSuchMethodError: 
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
>   at kafka.utils.Pool.keys(Pool.scala:77)
>   at 
> kafka.consumer.FetchRequestAndResponseStatsRegistry$.removeConsumerFetchRequestAndResponseStats(FetchRequestAndResponseStats.scala:69)
>   at 
> kafka.metrics.KafkaMetricsGroup$.removeAllConsumerMetrics(KafkaMetricsGroup.scala:189)
>   at 
> kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:200)
>   at kafka.consumer.OldConsumer.stop(BaseConsumer.scala:75)
>   at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:98)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:57)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:41)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2573) Mirror maker system test hangs and eventually fails

2015-09-23 Thread Geoff Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905211#comment-14905211
 ] 

Geoff Anderson commented on KAFKA-2573:
---

See what you think of my comment on https://github.com/apache/kafka/pull/234


> Mirror maker system test hangs and eventually fails
> ---
>
> Key: KAFKA-2573
> URL: https://issues.apache.org/jira/browse/KAFKA-2573
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Due to changes made in KAFKA-2015, handling of {{--consumer.config}} has 
> changed; more details are specified in KAFKA-2467. This leads to the 
> following exception.
> {code}
> Exception in thread "main" java.lang.NoSuchMethodError: 
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
>   at kafka.utils.Pool.keys(Pool.scala:77)
>   at 
> kafka.consumer.FetchRequestAndResponseStatsRegistry$.removeConsumerFetchRequestAndResponseStats(FetchRequestAndResponseStats.scala:69)
>   at 
> kafka.metrics.KafkaMetricsGroup$.removeAllConsumerMetrics(KafkaMetricsGroup.scala:189)
>   at 
> kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:200)
>   at kafka.consumer.OldConsumer.stop(BaseConsumer.scala:75)
>   at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:98)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:57)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:41)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2573) Mirror maker system test hangs and eventually fails

2015-09-23 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905194#comment-14905194
 ] 

Ashish K Singh commented on KAFKA-2573:
---

In my PR I removed the {{consumer.wait()}} that gets blocked, which eventually 
makes the test fail due to a timeout while waiting for the consumer thread to 
finish. Instead, I do my asserts on a separate thread by getting the consumed 
messages from the consumer thread, and then I stop the thread without waiting 
on it. This approach is used in console_consumer_test as well. Does that sound 
reasonable to you?

> Mirror maker system test hangs and eventually fails
> ---
>
> Key: KAFKA-2573
> URL: https://issues.apache.org/jira/browse/KAFKA-2573
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Due to changes made in KAFKA-2015, handling of {{--consumer.config}} has 
> changed; more details are specified in KAFKA-2467. This leads to the 
> following exception.
> {code}
> Exception in thread "main" java.lang.NoSuchMethodError: 
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
>   at kafka.utils.Pool.keys(Pool.scala:77)
>   at 
> kafka.consumer.FetchRequestAndResponseStatsRegistry$.removeConsumerFetchRequestAndResponseStats(FetchRequestAndResponseStats.scala:69)
>   at 
> kafka.metrics.KafkaMetricsGroup$.removeAllConsumerMetrics(KafkaMetricsGroup.scala:189)
>   at 
> kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:200)
>   at kafka.consumer.OldConsumer.stop(BaseConsumer.scala:75)
>   at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:98)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:57)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:41)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2570) New consumer should commit before every rebalance when auto-commit is enabled

2015-09-23 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-2570:
---
Fix Version/s: 0.9.0.0

> New consumer should commit before every rebalance when auto-commit is enabled
> -
>
> Key: KAFKA-2570
> URL: https://issues.apache.org/jira/browse/KAFKA-2570
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.9.0.0
>
>
> If not, then the consumer may see duplicates even on normal rebalances, since 
> we will always reset to the previous commit after rebalancing.
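
For reference, a sketch of the manual equivalent of what this asks auto-commit 
to do: commit synchronously in onPartitionsRevoked so the next owner of a 
partition resumes from the latest processed position instead of an older 
commit (the topic name is made up; consumer construction is omitted):

{code}
import java.util.Collection;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class CommitOnRebalanceSketch {
    static void subscribeWithCommitOnRevoke(final KafkaConsumer<String, String> consumer, String topic) {
        consumer.subscribe(Collections.singletonList(topic), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                consumer.commitSync();  // flush current positions before they are handed off
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                // nothing extra needed for this sketch
            }
        });
    }
}
{code}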



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2573) Mirror maker system test hangs and eventually fails

2015-09-23 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905028#comment-14905028
 ] 

Ismael Juma commented on KAFKA-2573:


How does your PR handle this problem? This is a user error, but if we can 
provide a better error message somewhat easily, then that's a good thing.

> Mirror maker system test hangs and eventually fails
> ---
>
> Key: KAFKA-2573
> URL: https://issues.apache.org/jira/browse/KAFKA-2573
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Due to changes made in KAFKA-2015, handling of {{--consumer.config}} has 
> changed; more details are specified in KAFKA-2467. This leads to the 
> following exception.
> {code}
> Exception in thread "main" java.lang.NoSuchMethodError: 
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
>   at kafka.utils.Pool.keys(Pool.scala:77)
>   at 
> kafka.consumer.FetchRequestAndResponseStatsRegistry$.removeConsumerFetchRequestAndResponseStats(FetchRequestAndResponseStats.scala:69)
>   at 
> kafka.metrics.KafkaMetricsGroup$.removeAllConsumerMetrics(KafkaMetricsGroup.scala:189)
>   at 
> kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:200)
>   at kafka.consumer.OldConsumer.stop(BaseConsumer.scala:75)
>   at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:98)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:57)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:41)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2573) Mirror maker system test hangs and eventually fails

2015-09-23 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905020#comment-14905020
 ] 

Ashish K Singh commented on KAFKA-2573:
---

[~ijuma] thanks for pointing it out. I was under the impression that the Kafka 
code is compiled on the vagrant hosts and never thought it could be the Java 
version causing the issue. Yes, it was due to using Java 8 at compile time and 
Java 7 at run time. I am not sure, but I think that is not something we should 
be worried about. Let me know your thoughts, and we can close the JIRA 
accordingly.
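
For context, a small illustration (editorial, not from the ticket) of why a 
JDK 8 compile plus a JDK 7 runtime produces exactly this NoSuchMethodError:

{code}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class KeySetViewDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        counts.put("a", 1);
        // JDK 8 narrowed the return type of keySet() to ConcurrentHashMap.KeySetView,
        // so javac on JDK 8 emits a call to
        //   keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
        // A JDK 7 runtime only has keySet()Ljava/util/Set; and throws
        // NoSuchMethodError at this call site. Note that -source/-target alone
        // do not prevent this; compiling against the JDK 7 class library does.
        Set<String> keys = counts.keySet();
        System.out.println(keys);
    }
}
{code}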

> Mirror maker system test hangs and eventually fails
> ---
>
> Key: KAFKA-2573
> URL: https://issues.apache.org/jira/browse/KAFKA-2573
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Due to changes made in KAFKA-2015, handling of {{--consumer.config}} has 
> changed; more details are specified in KAFKA-2467. This leads to the 
> following exception.
> {code}
> Exception in thread "main" java.lang.NoSuchMethodError: 
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
>   at kafka.utils.Pool.keys(Pool.scala:77)
>   at 
> kafka.consumer.FetchRequestAndResponseStatsRegistry$.removeConsumerFetchRequestAndResponseStats(FetchRequestAndResponseStats.scala:69)
>   at 
> kafka.metrics.KafkaMetricsGroup$.removeAllConsumerMetrics(KafkaMetricsGroup.scala:189)
>   at 
> kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:200)
>   at kafka.consumer.OldConsumer.stop(BaseConsumer.scala:75)
>   at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:98)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:57)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:41)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2577) one node of Kafka cluster cannot process produce request

2015-09-23 Thread Ray (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray updated KAFKA-2577:
---
Description: 
We have a 3-node Kafka cluster; suddenly one node could not accept produce 
requests. Here is the log:
[2015-09-21 04:56:32,413] WARN [KafkaApi-0] Produce request with correlation id 
9178992 from client  on partition [topic_name,3] failed due to Leader not local 
for partition [topic_name,3] on broker 0 (kafka.server.KafkaApis)

After restarting that node, it still did not work, and I saw a different log:
[2015-09-21 20:38:16,791] WARN [KafkaApi-0] Produce request with correlation id 
9661337 from client  on partition [topic_name,3] failed due to Topic topic_name 
either doesn't exist or is in the process of being deleted 
(kafka.server.KafkaApis)

It got fixed after rolling all the Kafka nodes.

  was:
We had 3 nodes for kafka cluster, suddenly one node cannot accept produce r 
request, here is the log:
[2015-09-21 04:56:32,413] WARN [KafkaApi-0] Produce request with correlation id 
9178992 from client  on partition [topic_name,3] failed due to Leader not local 
for partition [topic_name,3] on broker 0 (kafka.server.KafkaApis)

after restarting that node, it still cannot work and I saw different log:
[2015-09-21 20:38:16,791] WARN [KafkaApi-0] Produce request with correlation id 
9661337 from client  on partition [topic_name,3] failed due to Topic topic_name 
either doesn't exist or is in the process of being deleted 
(kafka.server.KafkaApis)


> one node of Kafka cluster cannot process produce request
> 
>
> Key: KAFKA-2577
> URL: https://issues.apache.org/jira/browse/KAFKA-2577
> Project: Kafka
>  Issue Type: Bug
>  Components: producer , replication
>Affects Versions: 0.8.1.1
>Reporter: Ray
>Assignee: Jun Rao
>
> We have a 3-node Kafka cluster; suddenly one node could not accept produce 
> requests. Here is the log:
> [2015-09-21 04:56:32,413] WARN [KafkaApi-0] Produce request with correlation 
> id 9178992 from client  on partition [topic_name,3] failed due to Leader not 
> local for partition [topic_name,3] on broker 0 (kafka.server.KafkaApis)
> After restarting that node, it still did not work, and I saw a different log:
> [2015-09-21 20:38:16,791] WARN [KafkaApi-0] Produce request with correlation 
> id 9661337 from client  on partition [topic_name,3] failed due to Topic 
> topic_name either doesn't exist or is in the process of being deleted 
> (kafka.server.KafkaApis)
> It got fixed after rolling all the Kafka nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2577) one node of Kafka cluster cannot process produce request

2015-09-23 Thread Ray (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905014#comment-14905014
 ] 

Ray commented on KAFKA-2577:


Is this a known issue in 0.8.1.1? Will it be fixed in the latest version?

> one node of Kafka cluster cannot process produce request
> 
>
> Key: KAFKA-2577
> URL: https://issues.apache.org/jira/browse/KAFKA-2577
> Project: Kafka
>  Issue Type: Bug
>  Components: producer , replication
>Affects Versions: 0.8.1.1
>Reporter: Ray
>Assignee: Jun Rao
>
> We have a 3-node Kafka cluster; suddenly one node could not accept produce 
> requests. Here is the log:
> [2015-09-21 04:56:32,413] WARN [KafkaApi-0] Produce request with correlation 
> id 9178992 from client  on partition [topic_name,3] failed due to Leader not 
> local for partition [topic_name,3] on broker 0 (kafka.server.KafkaApis)
> After restarting that node, it still did not work, and I saw a different log:
> [2015-09-21 20:38:16,791] WARN [KafkaApi-0] Produce request with correlation 
> id 9661337 from client  on partition [topic_name,3] failed due to Topic 
> topic_name either doesn't exist or is in the process of being deleted 
> (kafka.server.KafkaApis)
> It got fixed after rolling all the Kafka nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-2577) one node of Kafka cluster cannot process produce request

2015-09-23 Thread Ray (JIRA)
Ray created KAFKA-2577:
--

 Summary: one node of Kafka cluster cannot process produce request
 Key: KAFKA-2577
 URL: https://issues.apache.org/jira/browse/KAFKA-2577
 Project: Kafka
  Issue Type: Bug
  Components: producer , replication
Affects Versions: 0.8.1.1
Reporter: Ray
Assignee: Jun Rao


We have a 3-node Kafka cluster; suddenly one node could not accept produce 
requests. Here is the log:
[2015-09-21 04:56:32,413] WARN [KafkaApi-0] Produce request with correlation id 
9178992 from client  on partition [topic_name,3] failed due to Leader not local 
for partition [topic_name,3] on broker 0 (kafka.server.KafkaApis)

After restarting that node, it still did not work, and I saw a different log:
[2015-09-21 20:38:16,791] WARN [KafkaApi-0] Produce request with correlation id 
9661337 from client  on partition [topic_name,3] failed due to Topic topic_name 
either doesn't exist or is in the process of being deleted 
(kafka.server.KafkaApis)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (KAFKA-2576) ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic

2015-09-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-2576 started by Ismael Juma.
--
> ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic
> 
>
> Key: KAFKA-2576
> URL: https://issues.apache.org/jira/browse/KAFKA-2576
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ben Stopford
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.9.0.0
>
>
> Running the ConsumerPerformance using a multi partition topic causes it to 
> hang (or execute with no results).
> bin/kafka-topics.sh --create --zookeeper server:2181 --replication-factor 1 
> --partitions 50  --topic 50p
> bin/kafka-producer-perf-test.sh --broker-list server:9092 --topic 50p  
> --new-producer --messages 100 --message-size 1000
> #Works ok
> bin/kafka-consumer-perf-test.sh  --broker-list server:9092  --messages 
> 100  --new-consumer --topic 50p 
> #Hangs
> bin/kafka-consumer-perf-test.sh  --broker-list server:9093  --messages 
> 100  --new-consumer --topic 50p --consumer.config ssl.properties
> Running the same without SSL enabled works as expected.  
> Running the same using a single partition topic works as expected.  
> Tested locally and on EC2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2576) ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic

2015-09-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-2576:
---
Reviewer: Jun Rao
  Status: Patch Available  (was: In Progress)

> ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic
> 
>
> Key: KAFKA-2576
> URL: https://issues.apache.org/jira/browse/KAFKA-2576
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ben Stopford
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.9.0.0
>
>
> Running the ConsumerPerformance using a multi partition topic causes it to 
> hang (or execute with no results).
> bin/kafka-topics.sh --create --zookeeper server:2181 --replication-factor 1 
> --partitions 50  --topic 50p
> bin/kafka-producer-perf-test.sh --broker-list server:9092 --topic 50p  
> --new-producer --messages 100 --message-size 1000
> #Works ok
> bin/kafka-consumer-perf-test.sh  --broker-list server:9092  --messages 
> 100  --new-consumer --topic 50p 
> #Hangs
> bin/kafka-consumer-perf-test.sh  --broker-list server:9093  --messages 
> 100  --new-consumer --topic 50p --consumer.config ssl.properties
> Running the same without SSL enabled works as expected.  
> Running the same using a single partition topic works as expected.  
> Tested locally and on EC2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2576) ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic

2015-09-23 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904623#comment-14904623
 ] 

Ismael Juma commented on KAFKA-2576:


I should mention that Ben and I worked together on debugging this one.

> ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic
> 
>
> Key: KAFKA-2576
> URL: https://issues.apache.org/jira/browse/KAFKA-2576
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ben Stopford
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.9.0.0
>
>
> Running the ConsumerPerformance using a multi partition topic causes it to 
> hang (or execute with no results).
> bin/kafka-topics.sh --create --zookeeper server:2181 --replication-factor 1 
> --partitions 50  --topic 50p
> bin/kafka-producer-perf-test.sh --broker-list server:9092 --topic 50p  
> --new-producer --messages 100 --message-size 1000
> #Works ok
> bin/kafka-consumer-perf-test.sh  --broker-list server:9092  --messages 
> 100  --new-consumer --topic 50p 
> #Hangs
> bin/kafka-consumer-perf-test.sh  --broker-list server:9093  --messages 
> 100  --new-consumer --topic 50p --consumer.config ssl.properties
> Running the same without SSL enabled works as expected.  
> Running the same using a single partition topic works as expected.  
> Tested locally and on EC2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2576) ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic

2015-09-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904601#comment-14904601
 ] 

ASF GitHub Bot commented on KAFKA-2576:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/236

KAFKA-2576; ConsumerPerformance hangs when SSL enabled for Multi-Partition 
Topic

We now write to the channel with an empty buffer when
there are pending bytes remaining and all data has been
sent.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-2576-ssl-multi-partition-topic-hang

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/236.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #236


commit b0cf6a8387473906de0ca4c05c4bb88eb8342b0f
Author: Ismael Juma 
Date:   2015-09-23T14:18:53Z

Fix infinite loop in `FetchResponse` `*Send` classes

We now write to the channel with an empty buffer when
there are pending bytes remaining and all data has been
sent.
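
A rough sketch of the idea in that description, using hypothetical helper 
flags (payloadFullySent, transportHasPendingBytes) rather than the real 
Send/TransportLayer types: once all payload has been handed over but the 
transport still buffers bytes (as the SSL layer can), keep writing an empty 
buffer so the remaining bytes get flushed instead of the send spinning forever.

{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.GatheringByteChannel;

public class EmptyBufferFlushSketch {
    private static final ByteBuffer[] EMPTY = { ByteBuffer.allocate(0) };

    static void drive(GatheringByteChannel channel,
                      boolean payloadFullySent,
                      boolean transportHasPendingBytes) throws IOException {
        if (payloadFullySent && transportHasPendingBytes) {
            channel.write(EMPTY);  // no new data, just let the transport flush what is pending
        }
    }
}
{code}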




> ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic
> 
>
> Key: KAFKA-2576
> URL: https://issues.apache.org/jira/browse/KAFKA-2576
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ben Stopford
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.9.0.0
>
>
> Running the ConsumerPerformance using a multi partition topic causes it to 
> hang (or execute with no results).
> bin/kafka-topics.sh --create --zookeeper server:2181 --replication-factor 1 
> --partitions 50  --topic 50p
> bin/kafka-producer-perf-test.sh --broker-list server:9092 --topic 50p  
> --new-producer --messages 100 --message-size 1000
> #Works ok
> bin/kafka-consumer-perf-test.sh  --broker-list server:9092  --messages 
> 100  --new-consumer --topic 50p 
> #Hangs
> bin/kafka-consumer-perf-test.sh  --broker-list server:9093  --messages 
> 100  --new-consumer --topic 50p --consumer.config ssl.properties
> Running the same without SSL enabled works as expected.  
> Running the same using a single partition topic works as expected.  
> Tested locally and on EC2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2576; ConsumerPerformance hangs when SSL...

2015-09-23 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/236

KAFKA-2576; ConsumerPerformance hangs when SSL enabled for Multi-Partition 
Topic

We now write to the channel with an empty buffer when
there are pending bytes remaining and all data has been
sent.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-2576-ssl-multi-partition-topic-hang

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/236.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #236


commit b0cf6a8387473906de0ca4c05c4bb88eb8342b0f
Author: Ismael Juma 
Date:   2015-09-23T14:18:53Z

Fix infinite loop in `FetchResponse` `*Send` classes

We now write to the channel with an empty buffer when
there are pending bytes remaining and all data has been
sent.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2576) ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic

2015-09-23 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904562#comment-14904562
 ] 

Ismael Juma commented on KAFKA-2576:


I believe I have a fix, will create a PR in a moment.

> ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic
> 
>
> Key: KAFKA-2576
> URL: https://issues.apache.org/jira/browse/KAFKA-2576
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ben Stopford
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.9.0.0
>
>
> Running the ConsumerPerformance using a multi partition topic causes it to 
> hang (or execute with no results).
> bin/kafka-topics.sh --create --zookeeper server:2181 --replication-factor 1 
> --partitions 50  --topic 50p
> bin/kafka-producer-perf-test.sh --broker-list server:9092 --topic 50p  
> --new-producer --messages 100 --message-size 1000
> #Works ok
> bin/kafka-consumer-perf-test.sh  --broker-list server:9092  --messages 
> 100  --new-consumer --topic 50p 
> #Hangs
> bin/kafka-consumer-perf-test.sh  --broker-list server:9093  --messages 
> 100  --new-consumer --topic 50p --consumer.config ssl.properties
> Running the same without SSL enabled works as expected.  
> Running the same using a single partition topic works as expected.  
> Tested locally and on EC2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-2576) ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic

2015-09-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reassigned KAFKA-2576:
--

Assignee: Ismael Juma

> ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic
> 
>
> Key: KAFKA-2576
> URL: https://issues.apache.org/jira/browse/KAFKA-2576
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ben Stopford
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.9.0.0
>
>
> Running the ConsumerPerformance using a multi partition topic causes it to 
> hang (or execute with no results).
> bin/kafka-topics.sh --create --zookeeper server:2181 --replication-factor 1 
> --partitions 50  --topic 50p
> bin/kafka-producer-perf-test.sh --broker-list server:9092 --topic 50p  
> --new-producer --messages 100 --message-size 1000
> #Works ok
> bin/kafka-consumer-perf-test.sh  --broker-list server:9092  --messages 
> 100  --new-consumer --topic 50p 
> #Hangs
> bin/kafka-consumer-perf-test.sh  --broker-list server:9093  --messages 
> 100  --new-consumer --topic 50p --consumer.config ssl.properties
> Running the same without SSL enabled works as expected.  
> Running the same using a single partition topic works as expected.  
> Tested locally and on EC2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2575) inconsistent offset count in replication-offset-checkpoint during leader election leads to huge exceptions

2015-09-23 Thread Warren Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Warren Jin updated KAFKA-2575:
--
Description: 
We have 3 brokers and more than 100 topics in production; the default partition 
count is 24 for each topic, and the replication factor is 3.

We noticed the following errors in recent days.
2015-09-22 22:25:12,529 ERROR Error on broker 1 while processing LeaderAndIsr 
request correlationId 438501 received from controller 2 epoch 12 for partition 
[LOGIST.DELIVERY.SUBSCRIBE,7] (state.change.logger)
java.io.IOException: Expected 3918 entries but found only 3904
at kafka.server.OffsetCheckpoint.read(OffsetCheckpoint.scala:99)
at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:91)
at 
kafka.cluster.Partition$$anonfun$makeLeader$1$$anonfun$apply$mcZ$sp$4.apply(Partition.scala:171)
at 
kafka.cluster.Partition$$anonfun$makeLeader$1$$anonfun$apply$mcZ$sp$4.apply(Partition.scala:171)
at scala.collection.immutable.Set$Set3.foreach(Set.scala:115)
at 
kafka.cluster.Partition$$anonfun$makeLeader$1.apply$mcZ$sp(Partition.scala:171)
at 
kafka.cluster.Partition$$anonfun$makeLeader$1.apply(Partition.scala:163)
at 
kafka.cluster.Partition$$anonfun$makeLeader$1.apply(Partition.scala:163)
at kafka.utils.Utils$.inLock(Utils.scala:535)
at kafka.utils.Utils$.inWriteLock(Utils.scala:543)
at kafka.cluster.Partition.makeLeader(Partition.scala:163)
at 
kafka.server.ReplicaManager$$anonfun$makeLeaders$5.apply(ReplicaManager.scala:427)
at 
kafka.server.ReplicaManager$$anonfun$makeLeaders$5.apply(ReplicaManager.scala:426)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at kafka.server.ReplicaManager.makeLeaders(ReplicaManager.scala:426)
at 
kafka.server.ReplicaManager.becomeLeaderOrFollower(ReplicaManager.scala:378)
at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:120)
at kafka.server.KafkaApis.handle(KafkaApis.scala:63)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59)
at java.lang.Thread.run(Thread.java:745)

It occurs during the LOGIST.DELIVERY.SUBSCRIBE partition leader election, 
and then it repeatedly prints out the error message:
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 14943530 from client ReplicaFetcherThread-2-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,22] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,22] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 15022337 from client ReplicaFetcherThread-1-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,1] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,1] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 15078431 from client ReplicaFetcherThread-0-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,4] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,4] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 13477660 from client ReplicaFetcherThread-2-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,10] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,10] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 15022337 from client ReplicaFetcherThread-1-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,13] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,13] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 15078431 from client ReplicaFetcherThread-0-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,16] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,16] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 14988525 from client ReplicaFetcherThread-3-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,19] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,19] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 ERROR [KafkaApi-1] error when handling request Name: 
FetchRequest; Version: 0; CorrelationId: 15022337; ClientId: 
ReplicaFetcherThread-1-1; ReplicaId: 0; MaxW
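
Some context on the IOException above: replication-offset-checkpoint is a small plain-text file whose first line is a format version, whose second line is the number of entries that should follow, and whose remaining lines are "topic partition offset" triples. The broker reads it in Scala (kafka.server.OffsetCheckpoint); the Java sketch below is only a rough illustration of that layout and of the count check that produces the "Expected 3918 entries but found only 3904" message, not the actual implementation.

{code}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class CheckpointReadSketch {
    // Rough illustration of reading a replication-offset-checkpoint file:
    //   line 1: format version (0)
    //   line 2: expected number of entries
    //   rest:   "<topic> <partition> <offset>", one entry per line
    public static Map<String, Long> read(String path) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String versionLine = reader.readLine();
            String countLine = reader.readLine();
            if (versionLine == null || countLine == null)
                throw new IOException("Malformed checkpoint file: missing header");
            int version = Integer.parseInt(versionLine.trim());
            if (version != 0)
                throw new IOException("Unrecognized version: " + version);
            int expected = Integer.parseInt(countLine.trim());

            Map<String, Long> offsets = new HashMap<>();
            String line;
            while ((line = reader.readLine()) != null && !line.trim().isEmpty()) {
                String[] parts = line.trim().split("\\s+");
                offsets.put(parts[0] + "-" + parts[1], Long.parseLong(parts[2]));
            }

            // The consistency check behind the exception in the report: the
            // declared entry count no longer matches the number of
            // topic-partition lines actually present on disk.
            if (offsets.size() != expected)
                throw new IOException("Expected " + expected
                        + " entries but found only " + offsets.size());
            return offsets;
        }
    }
}
{code}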

[jira] [Updated] (KAFKA-2576) ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic

2015-09-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-2576:
---
 Priority: Blocker  (was: Major)
Fix Version/s: 0.9.0.0

Setting this as a blocker for 0.9.0.0. I'm investigating.

> ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic
> 
>
> Key: KAFKA-2576
> URL: https://issues.apache.org/jira/browse/KAFKA-2576
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ben Stopford
>Priority: Blocker
> Fix For: 0.9.0.0
>
>
> Running the ConsumerPerformance using a multi partition topic causes it to 
> hang (or execute with no results).
> bin/kafka-topics.sh --create --zookeeper server:2181 --replication-factor 1 
> --partitions 50  --topic 50p
> bin/kafka-producer-perf-test.sh --broker-list server:9092 --topic 50p  
> --new-producer --messages 100 --message-size 1000
> #Works ok
> bin/kafka-consumer-perf-test.sh  --broker-list server:9092  --messages 
> 100  --new-consumer --topic 50p 
> #Hangs
> bin/kafka-consumer-perf-test.sh  --broker-list server:9093  --messages 
> 100  --new-consumer --topic 50p --consumer.config ssl.properties
> Running the same without SSL enabled works as expected.  
> Running the same using a single partition topic works as expected.  
> Tested locally and on EC2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-2576) ConsumerPerformance hangs when SSL enabled for Multi-Partition Topic

2015-09-23 Thread Ben Stopford (JIRA)
Ben Stopford created KAFKA-2576:
---

 Summary: ConsumerPerformance hangs when SSL enabled for 
Multi-Partition Topic
 Key: KAFKA-2576
 URL: https://issues.apache.org/jira/browse/KAFKA-2576
 Project: Kafka
  Issue Type: Bug
Reporter: Ben Stopford


Running the ConsumerPerformance using a multi partition topic causes it to hang 
(or execute with no results).

bin/kafka-topics.sh --create --zookeeper server:2181 --replication-factor 1 
--partitions 50  --topic 50p


bin/kafka-producer-perf-test.sh --broker-list server:9092 --topic 50p  
--new-producer --messages 100 --message-size 1000

#Works ok
bin/kafka-consumer-perf-test.sh  --broker-list server:9092  --messages 100  
--new-consumer --topic 50p 

#Hangs
bin/kafka-consumer-perf-test.sh  --broker-list server:9093  --messages 100  
--new-consumer --topic 50p --consumer.config ssl.properties

Running the same without SSL enabled works as expected.  
Running the same using a single partition topic works as expected.  
Tested locally and on EC2








--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2566) Improve Jenkins set-up

2015-09-23 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904239#comment-14904239
 ] 

Ismael Juma commented on KAFKA-2566:


[~guozhang], I changed the commands for the trunk Jenkins jobs to produce jars 
and doc jars (this implies compilation and javadoc generation). We also run the 
tests last because they can be a bit flaky at the moment.

> Improve Jenkins set-up
> --
>
> Key: KAFKA-2566
> URL: https://issues.apache.org/jira/browse/KAFKA-2566
> Project: Kafka
>  Issue Type: Task
>Reporter: Ismael Juma
>
> There are currently two Jenkins jobs:
> https://builds.apache.org/job/Kafka-trunk
> https://builds.apache.org/job/kafka-trunk-git-pr
> They both run with Java 7 and execute the following gradle command:
> ./gradlew -PscalaVersion=2.10.1 test
> There are a few issues with this:
> * We don't test Java 8 even though that's the only stable release of the JDK 
> that still receives security fixes
> * We are testing with Scala 2.10.1 even though we should be testing with 
> Scala 2.10.5
> * We are not testing with Scala 2.11.x
> * We are not doing clean builds
> I suggest the following:
> 1. Change the `kafka-trunk-git-pr` job to use the `./gradlew clean test` 
> command.
> 2. Change the `Kafka-trunk` job to use the `./gradlew clean jarAll docsJarAll 
> testAll` command.
> 3. Introduce a kafka-trunk-jdk8 job with the command `./gradlew clean jarAll 
> docsJarAll testAll`
> This is a compromise that doesn't slow down the PR job (which is executed 
> much more often) while still testing trunk in all of our supported JDK and 
> Scala versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-2566) Improve Jenkins set-up

2015-09-23 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904239#comment-14904239
 ] 

Ismael Juma edited comment on KAFKA-2566 at 9/23/15 9:33 AM:
-

[~guozhang], I changed the commands for the trunk Jenkins jobs to produce jars 
and doc jars (this implies compilation and javadoc generation). We also run the 
tests last because they can be a bit flaky at the moment.

Do you have access to Jenkins now? Would it be OK to assign this to you?


was (Author: ijuma):
[~guozhang], I changed the commands for the trunk Jenkins jobs to produce jars 
and doc jars (this implies compilation and javadoc generation). We also run the 
tests last because they can be a bit flaky at the moment.

> Improve Jenkins set-up
> --
>
> Key: KAFKA-2566
> URL: https://issues.apache.org/jira/browse/KAFKA-2566
> Project: Kafka
>  Issue Type: Task
>Reporter: Ismael Juma
>
> There are currently two Jenkins jobs:
> https://builds.apache.org/job/Kafka-trunk
> https://builds.apache.org/job/kafka-trunk-git-pr
> They both run with Java 7 and execute the following gradle command:
> ./gradlew -PscalaVersion=2.10.1 test
> There are a few issues with this:
> * We don't test Java 8 even though that's the only stable release of the JDK 
> that still receives security fixes
> * We are testing with Scala 2.10.1 even though we should be testing with 
> Scala 2.10.5
> * We are not testing with Scala 2.11.x
> * We are not doing clean builds
> I suggest the following:
> 1. Change the `kafka-trunk-git-pr` job to use the `./gradlew clean test` 
> command.
> 2. Change the `Kafka-trunk` job to use the `./gradlew clean jarAll docsJarAll 
> testAll` command.
> 3. Introduce a kafka-trunk-jdk8 job with the command `./gradlew clean jarAll 
> docsJarAll testAll`
> This is a compromise that doesn't slow down the PR job (which is executed 
> much more often) while still testing trunk in all of our supported JDK and 
> Scala versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2566) Improve Jenkins set-up

2015-09-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-2566:
---
Description: 
There are currently two Jenkins jobs:

https://builds.apache.org/job/Kafka-trunk
https://builds.apache.org/job/kafka-trunk-git-pr

They both run with Java 7 and execute the following gradle command:

./gradlew -PscalaVersion=2.10.1 test

There are a few issues with this:
* We don't test Java 8 even though that's the only stable release of the JDK 
that still receives security fixes
* We are testing with Scala 2.10.1 even though we should be testing with Scala 
2.10.5
* We are not testing with Scala 2.11.x
* We are not doing clean builds

I suggest the following:

1. Change the `kafka-trunk-git-pr` job to use the `./gradlew clean test` 
command.
2. Change the `Kafka-trunk` job to use the `./gradlew clean jarAll docsJarAll 
testAll` command.
3. Introduce a kafka-trunk-jdk8 job with the command `./gradlew clean jarAll 
docsJarAll testAll`

This is a compromise that doesn't slow down the PR job (which is executed much 
more often) while still testing trunk in all of our supported JDK and Scala 
versions.

  was:
There are currently two Jenkins jobs:

https://builds.apache.org/job/Kafka-trunk
https://builds.apache.org/job/kafka-trunk-git-pr

They both run with Java 7 and execute the following gradle command:

./gradlew -PscalaVersion=2.10.1 test

There are a few issues with this:
* We don't test Java 8 even though that's the only stable release of the JDK 
that still receives security fixes
* We are testing with Scala 2.10.1 even though we should be testing with Scala 
2.10.5
* We are not testing with Scala 2.11.x
* We are not doing clean builds

I suggest the following:

1. Change the `kafka-trunk-git-pr` job to use the `./gradlew clean test` 
command.
2. Change the `Kafka-trunk` job to use the `./gradlew clean jarAll testAll 
docsJarAll` command.
3. Introduce a kafka-trunk-jdk8 job with the command `./gradlew clean jarAll 
testAll docsJarAll`

This is a compromise that doesn't slow down the PR job (which is executed much 
more often) while still testing trunk in all of our supported JDK and Scala 
versions.


> Improve Jenkins set-up
> --
>
> Key: KAFKA-2566
> URL: https://issues.apache.org/jira/browse/KAFKA-2566
> Project: Kafka
>  Issue Type: Task
>Reporter: Ismael Juma
>
> There are currently two Jenkins jobs:
> https://builds.apache.org/job/Kafka-trunk
> https://builds.apache.org/job/kafka-trunk-git-pr
> They both run with Java 7 and execute the following gradle command:
> ./gradlew -PscalaVersion=2.10.1 test
> There are a few issues with this:
> * We don't test Java 8 even though that's the only stable release of the JDK 
> that still receives security fixes
> * We are testing with Scala 2.10.1 even though we should be testing with 
> Scala 2.10.5
> * We are not testing with Scala 2.11.x
> * We are not doing clean builds
> I suggest the following:
> 1. Change the `kafka-trunk-git-pr` job to use the `./gradlew clean test` 
> command.
> 2. Change the `Kafka-trunk` job to use the `./gradlew clean jarAll docsJarAll 
> testAll` command.
> 3. Introduce a kafka-trunk-jdk8 job with the command `./gradlew clean jarAll 
> docsJarAll testAll`
> This is a compromise that doesn't slow down the PR job (which is executed 
> much more often) while still testing trunk in all of our supported JDK and 
> Scala versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2566) Improve Jenkins set-up

2015-09-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-2566:
---
Description: 
There are currently two Jenkins jobs:

https://builds.apache.org/job/Kafka-trunk
https://builds.apache.org/job/kafka-trunk-git-pr

They both run with Java 7 and execute the following gradle command:

./gradlew -PscalaVersion=2.10.1 test

There are a few issues with this:
* We don't test Java 8 even though that's the only stable release of the JDK 
that still receives security fixes
* We are testing with Scala 2.10.1 even though we should be testing with Scala 
2.10.5
* We are not testing with Scala 2.11.x
* We are not doing clean builds

I suggest the following:

1. Change the `kafka-trunk-git-pr` job to use the `./gradlew clean test` 
command.
2. Change the `Kafka-trunk` job to use the `./gradlew clean jarAll testAll 
docsJarAll` command.
3. Introduce a kafka-trunk-jdk8 job with the command `./gradlew clean jarAll 
testAll docsJarAll`

This is a compromise that doesn't slow down the PR job (which is executed much 
more often) while still testing trunk in all of our supported JDK and Scala 
versions.

  was:
There are currently two Jenkins jobs:

https://builds.apache.org/job/Kafka-trunk
https://builds.apache.org/job/kafka-trunk-git-pr

They both run with Java 7 and execute the following gradle command:

./gradlew -PscalaVersion=2.10.1 test

There are a few issues with this:
* We don't test Java 8 even though that's the only stable release of the JDK 
that still receives security fixes
* We are testing with Scala 2.10.1 even though we should be testing with Scala 
2.10.5
* We are not testing with Scala 2.11.x
* We are not doing clean builds

I suggest the following:

1. Change the `kafka-trunk-git-pr` job to use the `./gradlew clean test` 
command.
2. Change the `Kafka-trunk` job to use the `./gradlew clean testAll` command.
3. Introduce a kafka-trunk-jdk8 job with the command `./gradlew clean testAll`

This is a compromise that doesn't slow down the PR job (which is executed much 
more often) while still testing trunk in all of our supported JDK and Scala 
versions.


> Improve Jenkins set-up
> --
>
> Key: KAFKA-2566
> URL: https://issues.apache.org/jira/browse/KAFKA-2566
> Project: Kafka
>  Issue Type: Task
>Reporter: Ismael Juma
>
> There are currently two Jenkins jobs:
> https://builds.apache.org/job/Kafka-trunk
> https://builds.apache.org/job/kafka-trunk-git-pr
> They both run with Java 7 and execute the following gradle command:
> ./gradlew -PscalaVersion=2.10.1 test
> There are a few issues with this:
> * We don't test Java 8 even though that's the only stable release of the JDK 
> that still receives security fixes
> * We are testing with Scala 2.10.1 even though we should be testing with 
> Scala 2.10.5
> * We are not testing with Scala 2.11.x
> * We are not doing clean builds
> I suggest the following:
> 1. Change the `kafka-trunk-git-pr` job to use the `./gradlew clean test` 
> command.
> 2. Change the `Kafka-trunk` job to use the `./gradlew clean jarAll testAll 
> docsJarAll` command.
> 3. Introduce a kafka-trunk-jdk8 job with the command `./gradlew clean jarAll 
> testAll docsJarAll`
> This is a compromise that doesn't slow down the PR job (which is executed 
> much more often) while still testing trunk in all of our supported JDK and 
> Scala versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2575) inconsistent offset count in replication-offset-checkpoint

2015-09-23 Thread Warren Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Warren Jin updated KAFKA-2575:
--
Summary: inconsistent offset count in replication-offset-checkpoint  (was: 
kafka.server.OffsetCheckpoint found inconsistent offset entry count, leads to 
NotAssignedReplicaException)

> inconsistent offset count in replication-offset-checkpoint
> --
>
> Key: KAFKA-2575
> URL: https://issues.apache.org/jira/browse/KAFKA-2575
> Project: Kafka
>  Issue Type: Bug
>  Components: kafka streams
>Affects Versions: 0.8.2.1
>Reporter: Warren Jin
>
> We have more than 100 topics in production, the default partition number is 
> 24 for each topic.
> We noticed the following errors in recent days.
> 2015-09-22 22:25:12,529 ERROR Error on broker 1 while processing LeaderAndIsr 
> request correlationId 438501 received from controller 2 epoch 12 for 
> partition [LOGIST.DELIVERY.SUBSCRIBE,7] (state.change.logger)
> java.io.IOException: Expected 3918 entries but found only 3904
>   at kafka.server.OffsetCheckpoint.read(OffsetCheckpoint.scala:99)
>   at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:91)
>   at 
> kafka.cluster.Partition$$anonfun$makeLeader$1$$anonfun$apply$mcZ$sp$4.apply(Partition.scala:171)
>   at 
> kafka.cluster.Partition$$anonfun$makeLeader$1$$anonfun$apply$mcZ$sp$4.apply(Partition.scala:171)
>   at scala.collection.immutable.Set$Set3.foreach(Set.scala:115)
>   at 
> kafka.cluster.Partition$$anonfun$makeLeader$1.apply$mcZ$sp(Partition.scala:171)
>   at 
> kafka.cluster.Partition$$anonfun$makeLeader$1.apply(Partition.scala:163)
>   at 
> kafka.cluster.Partition$$anonfun$makeLeader$1.apply(Partition.scala:163)
>   at kafka.utils.Utils$.inLock(Utils.scala:535)
>   at kafka.utils.Utils$.inWriteLock(Utils.scala:543)
>   at kafka.cluster.Partition.makeLeader(Partition.scala:163)
>   at 
> kafka.server.ReplicaManager$$anonfun$makeLeaders$5.apply(ReplicaManager.scala:427)
>   at 
> kafka.server.ReplicaManager$$anonfun$makeLeaders$5.apply(ReplicaManager.scala:426)
>   at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>   at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>   at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>   at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>   at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>   at kafka.server.ReplicaManager.makeLeaders(ReplicaManager.scala:426)
>   at 
> kafka.server.ReplicaManager.becomeLeaderOrFollower(ReplicaManager.scala:378)
>   at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:120)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:63)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59)
>   at java.lang.Thread.run(Thread.java:745)
> Then it repeatedly prints out the error message:
> 2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request 
> with correlation id 14943530 from client ReplicaFetcherThread-2-1 on 
> partition [LOGIST.DELIVERY.SUBSCRIBE,22] failed due to Leader not local for 
> partition [LOGIST.DELIVERY.SUBSCRIBE,22] on broker 1 
> (kafka.server.ReplicaManager)
> 2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request 
> with correlation id 15022337 from client ReplicaFetcherThread-1-1 on 
> partition [LOGIST.DELIVERY.SUBSCRIBE,1] failed due to Leader not local for 
> partition [LOGIST.DELIVERY.SUBSCRIBE,1] on broker 1 
> (kafka.server.ReplicaManager)
> 2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request 
> with correlation id 15078431 from client ReplicaFetcherThread-0-1 on 
> partition [LOGIST.DELIVERY.SUBSCRIBE,4] failed due to Leader not local for 
> partition [LOGIST.DELIVERY.SUBSCRIBE,4] on broker 1 
> (kafka.server.ReplicaManager)
> 2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request 
> with correlation id 13477660 from client ReplicaFetcherThread-2-1 on 
> partition [LOGIST.DELIVERY.SUBSCRIBE,10] failed due to Leader not local for 
> partition [LOGIST.DELIVERY.SUBSCRIBE,10] on broker 1 
> (kafka.server.ReplicaManager)
> 2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request 
> with correlation id 15022337 from client ReplicaFetcherThread-1-1 on 
> partition [LOGIST.DELIVERY.SUBSCRIBE,13] failed due to Leader not local for 
> partition [LOGIST.DELIVERY.SUBSCRIBE,13] on broker 1 
> (kafka.server.ReplicaManager)
> 2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request 
> with correlation id 15078431 from client ReplicaFetcherThread-0-1 on 
> partition [LOGIST.DELIVERY.SUBSCRIBE,16] failed due to Leader not local for 
> partition [LOG

[jira] [Commented] (KAFKA-2459) Connection backoff/blackout period should start when a connection is disconnected, not when the connection attempt was initiated

2015-09-23 Thread Brian Sung-jin Hong (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904194#comment-14904194
 ] 

Brian Sung-jin Hong commented on KAFKA-2459:


How's the status of this issue? We use Kafka in AWS EC2. When a Kafka instance 
is terminated, we experience this problem.

I tried to fix this myself, but looking at the code it seems to need a fair 
amount of refactoring to work out. If no one is working on this issue, can 
anyone give me some guidance on how to proceed?

> Connection backoff/blackout period should start when a connection is 
> disconnected, not when the connection attempt was initiated
> 
>
> Key: KAFKA-2459
> URL: https://issues.apache.org/jira/browse/KAFKA-2459
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, producer 
>Affects Versions: 0.8.2.1
>Reporter: Ewen Cheslack-Postava
>Assignee: Manikumar Reddy
>
> Currently the connection code for new clients marks the time when a 
> connection was initiated (NodeConnectionState.lastConnectMs) and then uses 
> this to compute blackout periods for nodes, during which connections will not 
> be attempted and the node is not considered a candidate for leastLoadedNode.
> However, in cases where the connection attempt takes longer than the 
> blackout/backoff period (default 10ms), this results in incorrect behavior. 
> If a broker is not available and, for example, the broker does not explicitly 
> reject the connection, instead waiting for a connection timeout (e.g. due to 
> firewall settings), then the backoff period will have already elapsed and the 
> node will immediately be considered ready for a new connection attempt and a 
> node to be selected by leastLoadedNode for metadata updates. I think it 
> should be easy to reproduce and verify this problem manually by using tc to 
> introduce enough latency to make connection failures take > 10ms.
> The correct behavior would use the disconnection event to mark the end of the 
> last connection attempt and then wait for the backoff period to elapse after 
> that.
> See 
> http://mail-archives.apache.org/mod_mbox/kafka-users/201508.mbox/%3CCAJY8EofpeU4%2BAJ%3Dw91HDUx2RabjkWoU00Z%3DcQ2wHcQSrbPT4HA%40mail.gmail.com%3E
>  for the original description of the problem.
> This is related to KAFKA-1843 because leastLoadedNode currently will 
> consistently choose the same node if this blackout period is not handled 
> correctly, but is a much smaller issue.
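
A minimal sketch of the behaviour the last paragraph asks for, assuming a simplified per-node state holder rather than the client's real connection-state class: the blackout window starts when the disconnect is observed, so a connect attempt that takes longer than the backoff still blacks the node out for the full period afterwards.

{code}
// Simplified per-node connection state; illustrative only, not the
// real client implementation.
public class NodeConnectionStateSketch {
    private final long reconnectBackoffMs;
    private boolean connecting = false;
    private long lastDisconnectMs = -1L;   // when the last attempt failed

    public NodeConnectionStateSketch(long reconnectBackoffMs) {
        this.reconnectBackoffMs = reconnectBackoffMs;
    }

    public void connectionAttemptStarted() {
        connecting = true;
    }

    public void disconnected(long nowMs) {
        // Start the blackout when the attempt actually ends. Even if the
        // connect itself took longer than the backoff (e.g. a firewall
        // timeout), the node stays blacked out for the full period.
        connecting = false;
        lastDisconnectMs = nowMs;
    }

    public void connected() {
        connecting = false;
        lastDisconnectMs = -1L;
    }

    public boolean canConnect(long nowMs) {
        if (connecting)
            return false;
        return lastDisconnectMs < 0 || nowMs - lastDisconnectMs >= reconnectBackoffMs;
    }
}
{code}

Keying the blackout off the connect-initiation timestamp instead lets a slow failing connect expire its own backoff before the attempt has even finished, which is the bug described above.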



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-2575) kafka.server.OffsetCheckpoint found inconsistent offset entry count, leads to NotAssignedReplicaException

2015-09-23 Thread Warren Jin (JIRA)
Warren Jin created KAFKA-2575:
-

 Summary: kafka.server.OffsetCheckpoint found inconsistent offset 
entry count, leads to NotAssignedReplicaException
 Key: KAFKA-2575
 URL: https://issues.apache.org/jira/browse/KAFKA-2575
 Project: Kafka
  Issue Type: Bug
  Components: kafka streams
Affects Versions: 0.8.2.1
Reporter: Warren Jin


We have more than 100 topics in production, the default partition number is 24 
for each topic.

We noticed the following errors in recent days.
2015-09-22 22:25:12,529 ERROR Error on broker 1 while processing LeaderAndIsr 
request correlationId 438501 received from controller 2 epoch 12 for partition 
[LOGIST.DELIVERY.SUBSCRIBE,7] (state.change.logger)
java.io.IOException: Expected 3918 entries but found only 3904
at kafka.server.OffsetCheckpoint.read(OffsetCheckpoint.scala:99)
at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:91)
at 
kafka.cluster.Partition$$anonfun$makeLeader$1$$anonfun$apply$mcZ$sp$4.apply(Partition.scala:171)
at 
kafka.cluster.Partition$$anonfun$makeLeader$1$$anonfun$apply$mcZ$sp$4.apply(Partition.scala:171)
at scala.collection.immutable.Set$Set3.foreach(Set.scala:115)
at 
kafka.cluster.Partition$$anonfun$makeLeader$1.apply$mcZ$sp(Partition.scala:171)
at 
kafka.cluster.Partition$$anonfun$makeLeader$1.apply(Partition.scala:163)
at 
kafka.cluster.Partition$$anonfun$makeLeader$1.apply(Partition.scala:163)
at kafka.utils.Utils$.inLock(Utils.scala:535)
at kafka.utils.Utils$.inWriteLock(Utils.scala:543)
at kafka.cluster.Partition.makeLeader(Partition.scala:163)
at 
kafka.server.ReplicaManager$$anonfun$makeLeaders$5.apply(ReplicaManager.scala:427)
at 
kafka.server.ReplicaManager$$anonfun$makeLeaders$5.apply(ReplicaManager.scala:426)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at kafka.server.ReplicaManager.makeLeaders(ReplicaManager.scala:426)
at 
kafka.server.ReplicaManager.becomeLeaderOrFollower(ReplicaManager.scala:378)
at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:120)
at kafka.server.KafkaApis.handle(KafkaApis.scala:63)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59)
at java.lang.Thread.run(Thread.java:745)

Then it repeatedly prints out the error message:
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 14943530 from client ReplicaFetcherThread-2-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,22] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,22] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 15022337 from client ReplicaFetcherThread-1-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,1] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,1] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 15078431 from client ReplicaFetcherThread-0-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,4] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,4] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 13477660 from client ReplicaFetcherThread-2-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,10] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,10] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 15022337 from client ReplicaFetcherThread-1-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,13] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,13] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 15078431 from client ReplicaFetcherThread-0-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,16] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,16] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 WARN [Replica Manager on Broker 1]: Fetch request with 
correlation id 14988525 from client ReplicaFetcherThread-3-1 on partition 
[LOGIST.DELIVERY.SUBSCRIBE,19] failed due to Leader not local for partition 
[LOGIST.DELIVERY.SUBSCRIBE,19] on broker 1 (kafka.server.ReplicaManager)
2015-09-23 10:20:03 525 ERROR [KafkaApi-

[jira] [Commented] (KAFKA-2573) Mirror maker system test hangs and eventually fails

2015-09-23 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904190#comment-14904190
 ] 

Ismael Juma commented on KAFKA-2573:


The exception you pasted happens if you compile with Java 8 and run it with 
Java 7. Are you sure you pasted the right exception?

> Mirror maker system test hangs and eventually fails
> ---
>
> Key: KAFKA-2573
> URL: https://issues.apache.org/jira/browse/KAFKA-2573
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> Due to changes made in KAFKA-2015, handling of {{--consumer.config}} has 
> changed; more details are specified in KAFKA-2467. This leads to the exception below.
> {code}
> Exception in thread "main" java.lang.NoSuchMethodError: 
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
>   at kafka.utils.Pool.keys(Pool.scala:77)
>   at 
> kafka.consumer.FetchRequestAndResponseStatsRegistry$.removeConsumerFetchRequestAndResponseStats(FetchRequestAndResponseStats.scala:69)
>   at 
> kafka.metrics.KafkaMetricsGroup$.removeAllConsumerMetrics(KafkaMetricsGroup.scala:189)
>   at 
> kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:200)
>   at kafka.consumer.OldConsumer.stop(BaseConsumer.scala:75)
>   at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:98)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:57)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:41)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {code}
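
A hedged illustration of the failure mode described in the comment above, separate from the mirror maker test itself: JDK 8 narrowed the declared return type of ConcurrentHashMap.keySet() to ConcurrentHashMap.KeySetView, so a class compiled against the JDK 8 class library records that descriptor at the call site. Compiling the snippet below with javac from JDK 8 (without a Java 7 bootclasspath) and running it on a Java 7 JRE should reproduce the same NoSuchMethodError.

{code}
import java.util.concurrent.ConcurrentHashMap;

public class KeySetViewRepro {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        // Compiled with JDK 8, this call is emitted as
        // keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
        // which does not exist in the Java 7 runtime, so executing the
        // class on Java 7 throws NoSuchMethodError here.
        for (String key : map.keySet()) {
            System.out.println(key);
        }
    }
}
{code}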



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)