[jira] [Created] (KAFKA-1745) Each new thread creates a PIPE and KQUEUE as open files during producer.send() and does not get cleared when the thread that creates them is cleared.
Vishal created KAFKA-1745: - Summary: Each new thread creates a PIPE and KQUEUE as open files during producer.send() and does not get cleared when the thread that creates them is cleared. Key: KAFKA-1745 URL: https://issues.apache.org/jira/browse/KAFKA-1745 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Environment: Mac OS Mavericks Reporter: Vishal Priority: Critical Hi, I'm using the Java client API for Kafka. I wanted to send data to Kafka by using a producer pool, as I'm using a sync producer. The thread that sends the data comes from a thread pool that grows and shrinks depending on usage. So, when I try to send data from one thread, 1 KQUEUE and 2 PIPEs are created (got this info by using lsof). If I keep using the same thread it's fine, but when a new thread sends data to Kafka (using producer.send()) a new KQUEUE and 2 PIPEs are created. This is okay, but when the thread is cleared from the thread pool and a new thread is created, new KQUEUEs and PIPEs are created. The problem is that the old ones are not getting destroyed and they show up as open files. This is causing a major problem, as the number of open files keeps increasing and never decreases. Please suggest any solutions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1745) Each new thread creates a PIPE and KQUEUE as open files during producer.send() and does not get cleared when the thread that creates them is cleared.
[ https://issues.apache.org/jira/browse/KAFKA-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vishal updated KAFKA-1745: -- Summary: Each new thread creates a PIPE and KQUEUE as open files during producer.send() and does not get cleared when the thread that creates them is cleared. (was: Each new thread creates a PIPE and KQUEUE as open files during producer.send() and does no0t get cleared when the thread that creates them is cleared.) Each new thread creates a PIPE and KQUEUE as open files during producer.send() and does not get cleared when the thread that creates them is cleared. - Key: KAFKA-1745 URL: https://issues.apache.org/jira/browse/KAFKA-1745 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Environment: Mac OS Mavericks Reporter: Vishal Priority: Critical Hi, I'm using the Java client API for Kafka. I wanted to send data to Kafka by using a producer pool, as I'm using a sync producer. The thread that sends the data comes from a thread pool that grows and shrinks depending on usage. So, when I try to send data from one thread, 1 KQUEUE and 2 PIPEs are created (got this info by using lsof). If I keep using the same thread it's fine, but when a new thread sends data to Kafka (using producer.send()) a new KQUEUE and 2 PIPEs are created. This is okay, but when the thread is cleared from the thread pool and a new thread is created, new KQUEUEs and PIPEs are created. The problem is that the old ones are not getting destroyed and they show up as open files. This is causing a major problem, as the number of open files keeps increasing and never decreases. Please suggest any solutions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
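The descriptor leak above comes down to producers never being closed when their owning worker thread retires. A minimal, language-neutral sketch of the cleanup pattern (this is not the Kafka producer API; `make_producer`, `get`, and `close_all` are hypothetical names): track every per-thread producer centrally so each one can be closed explicitly, instead of letting retired threads orphan their kqueue/pipe descriptors.

```python
import threading

class ProducerPool:
    """Keeps one (hypothetical) producer per worker thread and closes all of
    them explicitly at shutdown, instead of relying on garbage collection."""

    def __init__(self, make_producer):
        self._make_producer = make_producer
        self._local = threading.local()     # per-thread producer slot
        self._lock = threading.Lock()
        self._producers = []                # every producer ever created

    def get(self):
        # Reuse this thread's producer; create and register one if missing.
        if not hasattr(self._local, "producer"):
            p = self._make_producer()
            self._local.producer = p
            with self._lock:
                self._producers.append(p)
        return self._local.producer

    def close_all(self):
        # Releases the underlying descriptors even for threads that have
        # already been torn down by the pool.
        with self._lock:
            for p in self._producers:
                p.close()
            self._producers.clear()
```

The key design point is that the central registry, not the thread, owns the producer's lifetime, so a shrinking thread pool cannot strand open files.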
[jira] [Updated] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Tretyakov updated KAFKA-1481: -- Attachment: KAFKA-1481_2014-11-03_16-39-41_doc.patch Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, KAFKA-1481_2014-10-31_14-35-43.patch, KAFKA-1481_2014-11-03_16-39-41_doc.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBean names, making it impossible to parse the individual fields back out of an MBean name. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Tretyakov updated KAFKA-1481: -- Attachment: KAFKA-1481_2014-11-03_17-02-23.patch Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, KAFKA-1481_2014-10-31_14-35-43.patch, KAFKA-1481_2014-11-03_16-39-41_doc.patch, KAFKA-1481_2014-11-03_17-02-23.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBean names, making it impossible to parse the individual fields back out of an MBean name. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14194551#comment-14194551 ] Vladimir Tretyakov commented on KAFKA-1481: --- Hi, added a new patch.
50. I think it is better to have a separate case class for the 'all' things. It is a real 'other' case.
51. Done
52.1 Done
52.2 Done
52.3 Done
52.4 Done
52.5 Done
53. https://issues.apache.org/jira/secure/attachment/12678927/KAFKA-1481_2014-11-03_16-39-41_doc.patch
Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, KAFKA-1481_2014-10-31_14-35-43.patch, KAFKA-1481_2014-11-03_16-39-41_doc.patch, KAFKA-1481_2014-11-03_17-02-23.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBean names, making it impossible to parse the individual fields back out of an MBean name. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
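The separator ambiguity the issue describes can be shown in a few lines (the MBean name below is a made-up illustration, not an actual Kafka metric name): when the separator character may also appear inside the field values, splitting on it cannot recover the original fields, whereas a character forbidden in hostnames and topic names splits unambiguously.

```python
# Hypothetical MBean name using '-' as the separator, where the host itself
# contains a dash and the topic contains an underscore.
name = "my-host-my_topic-BytesInPerSec"

# Splitting cannot tell whether the host was "my-host" or just "my":
parts = name.split("-")
# parts == ['my', 'host', 'my_topic', 'BytesInPerSec']  -- fields are lost

# With a separator that is illegal in hostnames/topics (e.g. '|', as the
# issue suggests), the fields come back unambiguously:
safe = "my-host|my_topic|BytesInPerSec"
host, topic, metric = safe.split("|")
# host == "my-host", topic == "my_topic", metric == "BytesInPerSec"
```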
[jira] [Commented] (KAFKA-1686) Implement SASL/Kerberos
[ https://issues.apache.org/jira/browse/KAFKA-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14194618#comment-14194618 ] Sriharsha Chintalapani commented on KAFKA-1686: --- Hi [~gwenshap] sorry for the late reply. I haven't started on this JIRA, and I probably won't be able to work on it for at least another week. It looks like the first step must be to authenticate the Kafka broker itself with Kerberos. Yes, this can be a separate piece and can go into its own JIRA. I'll look into KAFKA-1684 and update the JIRA soon with implementation details. Implement SASL/Kerberos --- Key: KAFKA-1686 URL: https://issues.apache.org/jira/browse/KAFKA-1686 Project: Kafka Issue Type: Sub-task Components: security Affects Versions: 0.9.0 Reporter: Jay Kreps Assignee: Sriharsha Chintalapani Fix For: 0.9.0 Implement SASL/Kerberos authentication. To do this we will need to introduce a new SASLRequest and SASLResponse pair to the client protocol. This request and response will each have only a single byte[] field and will be used to handle the SASL challenge/response cycle. Doing this will initialize the SaslServer instance and associate it with the session in a manner similar to KAFKA-1684. When using integrity or encryption mechanisms with SASL we will need to wrap and unwrap bytes as in KAFKA-1684, so the same interface that covers the SSLEngine will need to also cover the SaslServer instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1686) Implement SASL/Kerberos
[ https://issues.apache.org/jira/browse/KAFKA-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14194818#comment-14194818 ] Gwen Shapira commented on KAFKA-1686: - An existing long-lived connection doesn't require renewing, since the ticket is only validated on the initial handshake. (Yes, it does make it difficult to invalidate clients, but this is pretty normal for most kerberized services) If the connection drops or the client needs another connection (perhaps when rebalancing?), the client needs to renew the ticket and present a new one. Implement SASL/Kerberos --- Key: KAFKA-1686 URL: https://issues.apache.org/jira/browse/KAFKA-1686 Project: Kafka Issue Type: Sub-task Components: security Affects Versions: 0.9.0 Reporter: Jay Kreps Assignee: Sriharsha Chintalapani Fix For: 0.9.0 Implement SASL/Kerberos authentication. To do this we will need to introduce a new SASLRequest and SASLResponse pair to the client protocol. This request and response will each have only a single byte[] field and will be used to handle the SASL challenge/response cycle. Doing this will initialize the SaslServer instance and associate it with the session in a manner similar to KAFKA-1684. When using integrity or encryption mechanisms with SASL we will need to wrap and unwrap bytes as in KAFKA-1684 so the same interface that covers the SSLEngine will need to also cover the SaslServer instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1738) Partitions for topic not created after restart from forced shutdown
[ https://issues.apache.org/jira/browse/KAFKA-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep updated KAFKA-1738: --- Description: We are using the Kafka topic APIs to create the topic. In some cases the topic gets created, but we don't see the partition-specific files, and when a producer/consumer tries to get the topic metadata, it fails with an exception. The same happens if one tries to create the topic using the command line. k.p.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for topic tloader1 - No partition metadata for topic tloader1 due to kafka.common.UnknownTopicOrPartitionException}] for topic [tloader1]: class kafka.common.UnknownTopicOrPartitionException Steps to reproduce - 1. Stop Kafka using kill -9 PID of Kafka 2. Start Kafka 3. Create a topic with partition and replication factor of 1. 4. Check the response "Created topic topic_name" 5. Run the list command to verify that it was created. 6. Now check the data directory of Kafka. There are no files for the newly created topic. We see issues when we are creating new topics. This happens randomly and we don't know the exact reasons. We see the logs below in the controller at the time of creation of topics that don't have the partition files. 
2014-11-03 13:12:50,625] INFO [Controller 0]: New topic creation callback for [JobJTopic,0] (kafka.controller.KafkaController) [2014-11-03 13:12:50,626] INFO [Controller 0]: New partition creation callback for [JobJTopic,0] (kafka.controller.KafkaController) [2014-11-03 13:12:50,626] INFO [Partition state machine on Controller 0]: Invoking state change to NewPartition for partitions [JobJTopic,0] (kafka.controller.PartitionStateMachine) [2014-11-03 13:12:50,653] INFO [Replica state machine on controller 0]: Invoking state change to NewReplica for replicas [Topic=JobJTopic,Partition=0,Replica=0] (kafka.controller.ReplicaStateMachine) [2014-11-03 13:12:50,654] INFO [Partition state machine on Controller 0]: Invoking state change to OnlinePartition for partitions [JobJTopic,0] (kafka.controller.PartitionStateMachine) [2014-11-03 13:12:50,654] DEBUG [Partition state machine on Controller 0]: Live assigned replicas for partition [JobJTopic,0] are: [List(0)] (kafka.controller.PartitionStateMachine) [2014-11-03 13:12:50,654] DEBUG [Partition state machine on Controller 0]: Initializing leader and isr for partition [JobJTopic,0] to (Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:2) (kafka.controller.PartitionStateMachine) [2014-11-03 13:12:50,667] INFO [Replica state machine on controller 0]: Invoking state change to OnlineReplica for replicas [Topic=JobJTopic,Partition=0,Replica=0] (kafka.controller.ReplicaStateMachine) [2014-11-03 13:12:50,794] WARN [Controller-0-to-broker-0-send-thread], Controller 0 fails to send a request to broker id:0,host:DMIPVM,port:9092 (kafka.controller.RequestSendThread) java.io.EOFException: Received -1 when reading from channel, socket has likely been closed. 
at kafka.utils.Utils$.read(Utils.scala:381) at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54) at kafka.network.Receive$class.readCompletely(Transmission.scala:56) at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29) at kafka.network.BlockingChannel.receive(BlockingChannel.scala:108) at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:146) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60) [2014-11-03 13:12:50,965] ERROR [Controller-0-to-broker-0-send-thread], Controller 0 epoch 2 failed to send request Name:UpdateMetadataRequest;Version:0;Controller:0;ControllerEpoch:2;CorrelationId:43;ClientId:id_0-host_null-port_9092;AliveBrokers:id:0,host:DMIPVM,port:9092;PartitionState:[JobJTopic,0] - (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:2),ReplicationFactor:1),AllReplicas:0) to broker id:0,host:DMIPVM,port:9092. Reconnecting to broker. (kafka.controller.RequestSendThread) java.nio.channels.ClosedChannelException at kafka.network.BlockingChannel.send(BlockingChannel.scala:97) at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132) at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60) was: We are using Kafka Topic APIs to create the topic. But in some cases, the topic gets created but we don't see the partition specific files and when producer/consumer tries to get the topic metadata and it fails with exception. Same happens if one tries to create using the command line. k.p.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for topic tloader1 - No partition metadata for topic tloader1 due to
[jira] [Commented] (KAFKA-960) Upgrade Metrics to 3.x
[ https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14194979#comment-14194979 ] Joel Koshy commented on KAFKA-960: -- Also, we should plan on removing our dependency on Metrics and use the metrics package we already have in the clients package. [~junrao] do you know if we have a jira for that? We can probably target that for 0.9 or shortly thereafter. Upgrade Metrics to 3.x -- Key: KAFKA-960 URL: https://issues.apache.org/jira/browse/KAFKA-960 Project: Kafka Issue Type: Improvement Affects Versions: 0.8.1 Reporter: Cosmin Lehene Now that metrics 3.0 has been released (http://metrics.codahale.com/about/release-notes/) we can upgrade back -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-1746) System tests don't handle errors well
Ewen Cheslack-Postava created KAFKA-1746: Summary: System tests don't handle errors well Key: KAFKA-1746 URL: https://issues.apache.org/jira/browse/KAFKA-1746 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava The system test scripts don't handle errors well. A couple of key issues: * Unexpected exceptions during tests are just ignored and the tests appear to be successful in the reports. * The scripts' exit code is always 0, even if tests fail. * Almost no subprocess calls are checked. In a lot of cases this is ok, and sometimes it's not possible (e.g. after starting a long-running remote process), but in some cases, such as calls to DumpLogSegments, the tests can miss that the tool is exiting with an exception, and the test appears to be successful even though no data was verified. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
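The unchecked-subprocess problem described above can be sketched in a few lines of Python (the system tests are Python scripts; `run_checked` is a hypothetical helper, not a function from the Kafka test suite): a wrapper that inspects the child's exit code turns a silently ignored failure into a loud one, so the driver can propagate a useful exit code of its own.

```python
import subprocess
import sys

def run_checked(cmd):
    """Run a subprocess and raise instead of silently ignoring a failure."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"{cmd!r} exited with {result.returncode}")
    return result.stdout

# A failing step (a portable stand-in for e.g. a DumpLogSegments run that
# dies with an exception) now surfaces instead of being swallowed:
try:
    run_checked([sys.executable, "-c", "raise SystemExit(3)"])
    step_ok = True
except RuntimeError:
    step_ok = False
# step_ok is False here; the driver can count it and exit non-zero.
```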
[jira] [Created] (KAFKA-1747) TestcaseEnv improperly shares state between instances
Ewen Cheslack-Postava created KAFKA-1747: Summary: TestcaseEnv improperly shares state between instances Key: KAFKA-1747 URL: https://issues.apache.org/jira/browse/KAFKA-1747 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava TestcaseEnv in system tests uses class variables instead of instance variables for a bunch of state. This causes the data to persist between tests. In some cases this can cause tests to break (e.g. there will be state from a service running in a previous test that doesn't exist in the current test; trying to look up state about that service raises an exception or produces invalid data). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1746) System tests don't handle errors well
[ https://issues.apache.org/jira/browse/KAFKA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1746: - Assignee: Ewen Cheslack-Postava Status: Patch Available (was: Open) System tests don't handle errors well - Key: KAFKA-1746 URL: https://issues.apache.org/jira/browse/KAFKA-1746 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Assignee: Ewen Cheslack-Postava Attachments: KAFKA-1746.patch The system test scripts don't handle errors well. A couple of key issues: * Unexpected exceptions during tests are just ignored and the tests appear to be successful in the reports. * The scripts' exit code is always 0, even if tests fail. * Almost no subprocess calls are checked. In a lot of cases this is ok, and sometimes it's not possible (e.g. after starting a long-running remote process), but in some cases, such as calls to DumpLogSegments, the tests can miss that the tool is exiting with an exception, and the test appears to be successful even though no data was verified. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1746) System tests don't handle errors well
[ https://issues.apache.org/jira/browse/KAFKA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1746: - Attachment: KAFKA-1746.patch System tests don't handle errors well - Key: KAFKA-1746 URL: https://issues.apache.org/jira/browse/KAFKA-1746 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Attachments: KAFKA-1746.patch The system test scripts don't handle errors well. A couple of key issues: * Unexpected exceptions during tests are just ignored and the tests appear to be successful in the reports. * The scripts' exit code is always 0, even if tests fail. * Almost no subprocess calls are checked. In a lot of cases this is ok, and sometimes it's not possible (e.g. after starting a long-running remote process), but in some cases, such as calls to DumpLogSegments, the tests can miss that the tool is exiting with an exception, and the test appears to be successful even though no data was verified. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 27534: Patch for KAFKA-1746
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27534/ --- Review request for kafka. Bugs: KAFKA-1746 https://issues.apache.org/jira/browse/KAFKA-1746 Repository: kafka Description --- KAFKA-1746 Make system tests return a useful exit code. KAFKA-1746 Check the exit code when running DumpLogSegments to verify data. Diffs - system_test/mirror_maker_testsuite/mirror_maker_test.py c0117c64cbb7687ca8fbcec6b5c188eb880300ef system_test/offset_management_testsuite/offset_management_test.py 12b5cd25140e1eb407dd57eef63d9783257688b2 system_test/replication_testsuite/replica_basic_test.py 660006cc253bbae3e7cd9f02601f1c1937dd1714 system_test/system_test_runner.py ee7aa252333553e8eb0bc046edf968ec99dddb70 system_test/utils/kafka_system_test_utils.py 1093b660ebd0cb5ab6d3731d26f151e1bf717f8a Diff: https://reviews.apache.org/r/27534/diff/ Testing --- Thanks, Ewen Cheslack-Postava
[jira] [Commented] (KAFKA-1746) System tests don't handle errors well
[ https://issues.apache.org/jira/browse/KAFKA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14194983#comment-14194983 ] Ewen Cheslack-Postava commented on KAFKA-1746: -- Created reviewboard https://reviews.apache.org/r/27534/diff/ against branch origin/trunk System tests don't handle errors well - Key: KAFKA-1746 URL: https://issues.apache.org/jira/browse/KAFKA-1746 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Attachments: KAFKA-1746.patch The system test scripts don't handle errors well. A couple of key issues: * Unexpected exceptions during tests are just ignored and the tests appear to be successful in the reports. * The scripts' exit code is always 0, even if tests fail. * Almost no subprocess calls are checked. In a lot of cases this is ok, and sometimes it's not possible (e.g. after starting a long-running remote process), but in some cases, such as calls to DumpLogSegments, the tests can miss that the tool is exiting with an exception, and the test appears to be successful even though no data was verified. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 27535: Patch for KAFKA-1747
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27535/ --- Review request for kafka. Bugs: KAFKA-1747 https://issues.apache.org/jira/browse/KAFKA-1747 Repository: kafka Description --- KAFKA-1747 Fix TestcaseEnv so state isn't shared between instances. Diffs - system_test/utils/testcase_env.py b3c29105c04348f036efbbdc430e14e099ca8c70 Diff: https://reviews.apache.org/r/27535/diff/ Testing --- Thanks, Ewen Cheslack-Postava
[jira] [Commented] (KAFKA-1747) TestcaseEnv improperly shares state between instances
[ https://issues.apache.org/jira/browse/KAFKA-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14194984#comment-14194984 ] Ewen Cheslack-Postava commented on KAFKA-1747: -- Created reviewboard https://reviews.apache.org/r/27535/diff/ against branch origin/trunk TestcaseEnv improperly shares state between instances - Key: KAFKA-1747 URL: https://issues.apache.org/jira/browse/KAFKA-1747 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Attachments: KAFKA-1747.patch TestcaseEnv in system tests uses class variables instead of instance variables for a bunch of state. This causes the data to persist between tests. In some cases this can cause tests to break (e.g. there will be state from a service running in a previous test that doesn't exist in the current test; trying to look up state about that service raises an exception or produces invalid data). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1747) TestcaseEnv improperly shares state between instances
[ https://issues.apache.org/jira/browse/KAFKA-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1747: - Attachment: KAFKA-1747.patch TestcaseEnv improperly shares state between instances - Key: KAFKA-1747 URL: https://issues.apache.org/jira/browse/KAFKA-1747 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Attachments: KAFKA-1747.patch TestcaseEnv in system tests uses class variables instead of instance variables for a bunch of state. This causes the data to persist between tests. In some cases this can cause tests to break (e.g. there will be state from a service running in a previous test that doesn't exist in the current test; trying to look up state about that service raises an exception or produces invalid data). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1747) TestcaseEnv improperly shares state between instances
[ https://issues.apache.org/jira/browse/KAFKA-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1747: - Assignee: Ewen Cheslack-Postava Status: Patch Available (was: Open) TestcaseEnv improperly shares state between instances - Key: KAFKA-1747 URL: https://issues.apache.org/jira/browse/KAFKA-1747 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Assignee: Ewen Cheslack-Postava Attachments: KAFKA-1747.patch TestcaseEnv in system tests uses class variables instead of instance variables for a bunch of state. This causes the data to persist between tests. In some cases this can cause tests to break (e.g. there will be state from a service running in a previous test that doesn't exist in the current test; trying to look up state about that service raises an exception or produces invalid data). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-960) Upgrade Metrics to 3.x
[ https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14194990#comment-14194990 ] Erik van Oosten commented on KAFKA-960: --- If 2.20 and 2.1.5 are indeed binary compatible (how do you test that?), _all existing_ releases could be patched by simply replacing a jar :) Upgrade Metrics to 3.x -- Key: KAFKA-960 URL: https://issues.apache.org/jira/browse/KAFKA-960 Project: Kafka Issue Type: Improvement Affects Versions: 0.8.1 Reporter: Cosmin Lehene Now that metrics 3.0 has been released (http://metrics.codahale.com/about/release-notes/) we can upgrade back -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-1748) Decouple system test cluster resources definition from service definitions
Ewen Cheslack-Postava created KAFKA-1748: Summary: Decouple system test cluster resources definition from service definitions Key: KAFKA-1748 URL: https://issues.apache.org/jira/browse/KAFKA-1748 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Currently the system tests use JSON files that specify the set of services for each test and where they should run (i.e. hostname). These currently assume that you already have SSH keys set up, that you use the same username on the host running the tests and on the test cluster, that you don't require any additional ssh/scp/rsync flags, and that you'll always have a fixed set of compute resources (or that you'll spend a lot of time editing config files). While we don't want a whole cluster resource manager in the system tests, a bit more flexibility would make it easier to, e.g., run tests against a local vagrant cluster or on dynamically allocated EC2 instances. We can separate out the basic resource spec (i.e. JSON specifying how to access machines) from the service definition (i.e. a broker should run with settings x, y, z). Restricting to a very simple set of mappings (i.e. map services to hosts round-robin, optionally restricting to no reuse of hosts) should keep things simple. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
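The split the issue proposes can be sketched in a few lines (the JSON shape, hostnames, and role names below are made-up illustrations, not the actual system-test file formats): a resource spec that only says how to reach machines, a separate list of service roles, and a round-robin mapping from one to the other.

```python
import itertools
import json

# Hypothetical minimal cluster resource spec: only how to access machines.
cluster_json = '{"hosts": [{"hostname": "node1"}, {"hostname": "node2"}]}'
hosts = json.loads(cluster_json)["hosts"]

# Service roles a test needs, kept separate from any hostnames.
roles = ["zookeeper", "broker", "broker", "producer"]

# Round-robin assignment of roles to hosts, reusing hosts as needed.
assignment = {}
for (i, role), host in zip(enumerate(roles), itertools.cycle(hosts)):
    assignment[f"{role}-{i}"] = host["hostname"]
# assignment == {"zookeeper-0": "node1", "broker-1": "node2",
#                "broker-2": "node1", "producer-3": "node2"}
```

Because the roles never name hosts directly, the same test definition can run against localhost, a Vagrant cluster, or EC2 instances just by swapping the resource spec.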
Review Request 27536: Patch for KAFKA-1748
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27536/ --- Review request for kafka. Bugs: KAFKA-1748 https://issues.apache.org/jira/browse/KAFKA-1748 Repository: kafka Description --- KAFKA-1748 Make remote hosts used by system tests more flexible by allowing username, hostname, and SSH args. KAFKA-1748 Make cluster resource declaration separate from test role specification and support loading in three ways: a zero-config default on localhost, a JSON file, and loading from Vagrant's ssh-config command. Diffs - system_test/cluster.json PRE-CREATION system_test/metrics.json cd3fc142176b8a3638db5230fb74f055c04c59d4 system_test/mirror_maker_testsuite/mirror_maker_test.py c0117c64cbb7687ca8fbcec6b5c188eb880300ef system_test/offset_management_testsuite/offset_management_test.py 12b5cd25140e1eb407dd57eef63d9783257688b2 system_test/replication_testsuite/replica_basic_test.py 660006cc253bbae3e7cd9f02601f1c1937dd1714 system_test/system_test_env.py c24d3e83922ef9a56591594b9d80af9bb6a30ce2 system_test/system_test_runner.py ee7aa252333553e8eb0bc046edf968ec99dddb70 system_test/utils/cluster.py PRE-CREATION system_test/utils/kafka_system_test_utils.py 1093b660ebd0cb5ab6d3731d26f151e1bf717f8a system_test/utils/metrics.py 3e663483202a1dd856f2fdfd1cfaac88e98f591c system_test/utils/setup_utils.py 0e8b7f97287ec7bc7a2d91573befe0992092734a system_test/utils/system_test_utils.py e8529cd31f9202a50b9c9774037ddc744ff92efa Diff: https://reviews.apache.org/r/27536/diff/ Testing --- Thanks, Ewen Cheslack-Postava
[jira] [Updated] (KAFKA-1748) Decouple system test cluster resources definition from service definitions
[ https://issues.apache.org/jira/browse/KAFKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1748: - Assignee: Ewen Cheslack-Postava Status: Patch Available (was: Open) Decouple system test cluster resources definition from service definitions -- Key: KAFKA-1748 URL: https://issues.apache.org/jira/browse/KAFKA-1748 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Assignee: Ewen Cheslack-Postava Attachments: KAFKA-1748.patch Currently the system tests use JSON files that specify the set of services for each test and where they should run (i.e. hostname). These currently assume that you already have SSH keys set up, that you use the same username on the host running the tests and on the test cluster, that you don't require any additional ssh/scp/rsync flags, and that you'll always have a fixed set of compute resources (or that you'll spend a lot of time editing config files). While we don't want a whole cluster resource manager in the system tests, a bit more flexibility would make it easier to, e.g., run tests against a local vagrant cluster or on dynamically allocated EC2 instances. We can separate out the basic resource spec (i.e. JSON specifying how to access machines) from the service definition (i.e. a broker should run with settings x, y, z). Restricting to a very simple set of mappings (i.e. map services to hosts round-robin, optionally restricting to no reuse of hosts) should keep things simple. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1748) Decouple system test cluster resources definition from service definitions
[ https://issues.apache.org/jira/browse/KAFKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195005#comment-14195005 ] Ewen Cheslack-Postava commented on KAFKA-1748: -- Created reviewboard https://reviews.apache.org/r/27536/diff/ against branch origin/trunk Decouple system test cluster resources definition from service definitions -- Key: KAFKA-1748 URL: https://issues.apache.org/jira/browse/KAFKA-1748 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Attachments: KAFKA-1748.patch Currently the system tests use JSON files that specify the set of services for each test and where they should run (i.e. hostname). These currently assume that you already have SSH keys setup, use the same username on the host running the tests and the test cluster, don't require any additional ssh/scp/rsync flags, and assume you'll always have a fixed set of compute resources (or that you'll spend a lot of time editing config files). While we don't want a whole cluster resource manager in the system tests, a bit more flexibility would make it easier to, e.g., run tests against a local vagrant cluster or on dynamically allocated EC2 instances. We can separate out the basic resource spec (i.e. json specifying how to access machines) from the service definition (i.e. a broker should run with settings x, y, z). Restricting to a very simple set of mappings (i.e. map services to hosts with round robin, optionally restricting to no reuse of hosts) should keep things simple. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1748) Decouple system test cluster resources definition from service definitions
[ https://issues.apache.org/jira/browse/KAFKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1748: - Attachment: KAFKA-1748_2014-11-03_12:04:18.patch Decouple system test cluster resources definition from service definitions -- Key: KAFKA-1748 URL: https://issues.apache.org/jira/browse/KAFKA-1748 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Assignee: Ewen Cheslack-Postava Attachments: KAFKA-1748.patch, KAFKA-1748_2014-11-03_12:04:18.patch Currently the system tests use JSON files that specify the set of services for each test and where they should run (i.e. hostname). These currently assume that you already have SSH keys setup, use the same username on the host running the tests and the test cluster, don't require any additional ssh/scp/rsync flags, and assume you'll always have a fixed set of compute resources (or that you'll spend a lot of time editing config files). While we don't want a whole cluster resource manager in the system tests, a bit more flexibility would make it easier to, e.g., run tests against a local vagrant cluster or on dynamically allocated EC2 instances. We can separate out the basic resource spec (i.e. json specifying how to access machines) from the service definition (i.e. a broker should run with settings x, y, z). Restricting to a very simple set of mappings (i.e. map services to hosts with round robin, optionally restricting to no reuse of hosts) should keep things simple. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 27536: Patch for KAFKA-1748
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27536/ --- (Updated Nov. 3, 2014, 8:04 p.m.) Review request for kafka. Bugs: KAFKA-1748 https://issues.apache.org/jira/browse/KAFKA-1748 Repository: kafka Description (updated) --- KAFKA-1748 Make cluster resource declaration separate from test role specification and support loading in three ways: a zero-config default on localhost, a JSON file, and loading from Vagrant's ssh-config command. Diffs (updated) - system_test/cluster.json PRE-CREATION system_test/mirror_maker_testsuite/mirror_maker_test.py c0117c64cbb7687ca8fbcec6b5c188eb880300ef system_test/offset_management_testsuite/offset_management_test.py 12b5cd25140e1eb407dd57eef63d9783257688b2 system_test/replication_testsuite/replica_basic_test.py 660006cc253bbae3e7cd9f02601f1c1937dd1714 system_test/system_test_env.py c24d3e83922ef9a56591594b9d80af9bb6a30ce2 system_test/system_test_runner.py ee7aa252333553e8eb0bc046edf968ec99dddb70 system_test/utils/cluster.py PRE-CREATION system_test/utils/kafka_system_test_utils.py 1093b660ebd0cb5ab6d3731d26f151e1bf717f8a system_test/utils/metrics.py 3e663483202a1dd856f2fdfd1cfaac88e98f591c system_test/utils/setup_utils.py 0e8b7f97287ec7bc7a2d91573befe0992092734a system_test/utils/system_test_utils.py e8529cd31f9202a50b9c9774037ddc744ff92efa Diff: https://reviews.apache.org/r/27536/diff/ Testing --- Thanks, Ewen Cheslack-Postava
[jira] [Commented] (KAFKA-1748) Decouple system test cluster resources definition from service definitions
[ https://issues.apache.org/jira/browse/KAFKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195006#comment-14195006 ] Ewen Cheslack-Postava commented on KAFKA-1748: -- Updated reviewboard https://reviews.apache.org/r/27536/diff/ against branch origin/trunk Decouple system test cluster resources definition from service definitions -- Key: KAFKA-1748 URL: https://issues.apache.org/jira/browse/KAFKA-1748 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Assignee: Ewen Cheslack-Postava Attachments: KAFKA-1748.patch, KAFKA-1748_2014-11-03_12:04:18.patch Currently the system tests use JSON files that specify the set of services for each test and where they should run (i.e. hostname). These currently assume that you already have SSH keys setup, use the same username on the host running the tests and the test cluster, don't require any additional ssh/scp/rsync flags, and assume you'll always have a fixed set of compute resources (or that you'll spend a lot of time editing config files). While we don't want a whole cluster resource manager in the system tests, a bit more flexibility would make it easier to, e.g., run tests against a local vagrant cluster or on dynamically allocated EC2 instances. We can separate out the basic resource spec (i.e. json specifying how to access machines) from the service definition (i.e. a broker should run with settings x, y, z). Restricting to a very simple set of mappings (i.e. map services to hosts with round robin, optionally restricting to no reuse of hosts) should keep things simple. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1748) Decouple system test cluster resources definition from service definitions
[ https://issues.apache.org/jira/browse/KAFKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195013#comment-14195013 ] Ewen Cheslack-Postava commented on KAFKA-1748: -- The patches I submitted make the suggested changes, including some support for pulling dynamic configurations (from Vagrant). For the local default config, this shouldn't have any effect, but will require config changes if you're overriding those settings -- the cluster resources go in cluster.json and just contain the hostnames, optional usernames, ssh args, java home and kafka home. In the Vagrant case, I defaulted KAFKA_HOME to /opt/kafka to match the setup in KAFKA-1173 since there's no place for the user to override that setting. Decouple system test cluster resources definition from service definitions -- Key: KAFKA-1748 URL: https://issues.apache.org/jira/browse/KAFKA-1748 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ewen Cheslack-Postava Assignee: Ewen Cheslack-Postava Attachments: KAFKA-1748.patch, KAFKA-1748_2014-11-03_12:04:18.patch Currently the system tests use JSON files that specify the set of services for each test and where they should run (i.e. hostname). These currently assume that you already have SSH keys setup, use the same username on the host running the tests and the test cluster, don't require any additional ssh/scp/rsync flags, and assume you'll always have a fixed set of compute resources (or that you'll spend a lot of time editing config files). While we don't want a whole cluster resource manager in the system tests, a bit more flexibility would make it easier to, e.g., run tests against a local vagrant cluster or on dynamically allocated EC2 instances. We can separate out the basic resource spec (i.e. json specifying how to access machines) from the service definition (i.e. a broker should run with settings x, y, z). Restricting to a very simple set of mappings (i.e. 
map services to hosts with round robin, optionally restricting to no reuse of hosts) should keep things simple. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
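Based on Ewen's comment above, a cluster.json declaring the resources might look roughly like the following. The key names here are illustrative guesses from the description (hostnames, optional usernames, ssh args, java home, kafka home), not necessarily the exact schema the patch uses:

```json
{
  "cluster": [
    {
      "hostname": "worker1.example.com",
      "user": "kafkatest",
      "ssh_args": "-o StrictHostKeyChecking=no -i ~/.ssh/test_key",
      "java_home": "/usr/lib/jvm/java-7",
      "kafka_home": "/opt/kafka"
    },
    { "hostname": "worker2.example.com" }
  ]
}
```

Entries that omit optional fields would fall back to defaults (current user, standard ssh, /opt/kafka in the Vagrant case), and the test roles are then mapped onto these hosts round robin.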
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195029#comment-14195029 ] Joel Koshy commented on KAFKA-1555: --- Sounds good. In that case, can you modify it a bit? The only remaining confusion is that earlier on in that section we start by writing about acknowledgement by all replicas, but then directly (without further comment) assume it is actually acknowledgement by the current in-sync replicas. How about the following: Instead of _A message that has been acknowledged by all in-sync replicas..._ we can write _A message that has been acknowledged by all replicas..._. And then say _Note that acknowledgement by all replicas does not guarantee that the full set of assigned replicas have received the message. By default, acknowledgement happens as soon as all the current in-sync replicas have received the message. For example, if a topic is configured with only two replicas and one fails (i.e., only one in-sync replica remains), then writes that specify required.acks=-1 will succeed. However, these writes could be lost if the remaining replica also fails. Although this ensures maximum availability ..._ (from earlier comment) As for the design itself: I just thought that the broker-side setting taking effect only with a client-side setting is a bit odd, especially since it does not hurt to do so with the other ack settings. 
provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, KAFKA-1555-DOCS.2.patch, KAFKA-1555-DOCS.3.patch, KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch In a mission critical application, we expect that a kafka cluster with 3 brokers can satisfy two requirements: 1. When 1 broker is down, no message loss or service blocking happens. 2. In worse cases, such as when two brokers are down, service can be blocked, but no message loss happens. We found that the current kafka version (0.8.1.1) cannot achieve the requirements due to three of its behaviors: 1. when choosing a new leader from 2 followers in ISR, the one with fewer messages may be chosen as the leader. 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it has fewer messages than the leader. 3. ISR can contain only 1 broker, therefore acknowledged messages may be stored in only 1 broker. The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning, all 3 replicas, leader A, followers B and C, are in sync, i.e., they have the same messages and are all in ISR. According to the value of request.required.acks (acks for short), there are the following cases. 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirements. 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in ISR. If A is killed, C can be elected as the new leader, and consumers will miss m. 3. acks=-1. B and C restart and are removed from ISR. 
Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost. In summary, any existing configuration cannot satisfy the requirements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
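For reference, the configuration combination this ticket converges on (a producer-side acks setting plus the broker/topic-level min.insync.replicas setting introduced for 0.8.2) can be sketched as follows; the values are illustrative for the 3-broker scenario in the description:

```properties
# Topic/broker config: reject acks=-1 writes unless at least 2 replicas are in sync.
min.insync.replicas=2

# Producer config: wait for acknowledgement from all current in-sync replicas.
request.required.acks=-1
```

With 3 replicas, this addresses both requirements above: one broker down still leaves 2 in-sync replicas, so writes proceed with no loss; two brokers down blocks acks=-1 writes instead of silently risking message loss.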
[jira] [Commented] (KAFKA-1070) Auto-assign node id
[ https://issues.apache.org/jira/browse/KAFKA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195345#comment-14195345 ] Otis Gospodnetic commented on KAFKA-1070: - [~harsha_ch] - it looks like a number of people would really like to use IPs for broker.id. There is a lot of interest in having that. Please see this thread: http://search-hadoop.com/m/4TaT4dTPKi1 Do you think this is something you could add to this patch, maybe as another broker ID assignment scheme/policy? Auto-assign node id --- Key: KAFKA-1070 URL: https://issues.apache.org/jira/browse/KAFKA-1070 Project: Kafka Issue Type: Bug Reporter: Jay Kreps Assignee: Sriharsha Chintalapani Labels: usability Attachments: KAFKA-1070.patch, KAFKA-1070_2014-07-19_16:06:13.patch, KAFKA-1070_2014-07-22_11:34:18.patch, KAFKA-1070_2014-07-24_20:58:17.patch, KAFKA-1070_2014-07-24_21:05:33.patch, KAFKA-1070_2014-08-21_10:26:20.patch It would be nice to have Kafka brokers auto-assign node ids rather than having that be a configuration. Having a configuration is irritating because (1) you have to generate a custom config for each broker and (2) even though it is in configuration, changing the node id can cause all kinds of bad things to happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
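To make the suggestion concrete, one possible shape for an "IP-based broker id" assignment policy is packing the IPv4 octets into an int. This is purely a hypothetical sketch of the scheme being requested in the thread, not anything in the patch; note the caveat that addresses with a first octet of 128 or higher would produce a negative id in a signed 32-bit int, which old broker.id validation may reject.

```java
public class BrokerIdFromIp {
    // Hypothetical assignment policy: pack the four IPv4 octets into one int,
    // e.g. "10.0.0.5" -> 167772165. Caveat: first octets >= 128 go negative.
    static int brokerIdFromIp(String ip) {
        int id = 0;
        for (String octet : ip.split("\\.")) {
            id = (id << 8) | Integer.parseInt(octet);
        }
        return id;
    }

    public static void main(String[] args) {
        System.out.println(brokerIdFromIp("10.0.0.5"));
    }
}
```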
[jira] [Created] (KAFKA-1749) Brokers continually throw exceptions when there are hundreds of topic being fetched by mirrormaker
Min Zhou created KAFKA-1749: --- Summary: Brokers continually throw exceptions when there are hundreds of topic being fetched by mirrormaker Key: KAFKA-1749 URL: https://issues.apache.org/jira/browse/KAFKA-1749 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1, 0.8.2 Reporter: Min Zhou Here is one of the millions of exceptions. {noformat} kafka.common.KafkaException: This operation cannot be completed on a complete request. at kafka.network.Transmission$class.expectIncomplete(Transmission.scala:34) at kafka.api.FetchResponseSend.expectIncomplete(FetchResponse.scala:191) at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:214) at kafka.network.Processor.write(SocketServer.scala:375) at kafka.network.Processor.run(SocketServer.scala:247) at java.lang.Thread.run(Thread.java:662) {noformat} We used tools to hook the function kafka.api.FetchResponseSend.writeTo and found that fetchResponse.sizeInBytes had overflowed, which means the code below can produce a result beyond the limit of the integer type. {noformat} val sizeInBytes = FetchResponse.headerSize + dataGroupedByTopic.foldLeft(0)((folded, curr) => { val topicData = TopicData(curr._1, curr._2.map { case (topicAndPartition, partitionData) => (topicAndPartition.partition, partitionData) }) folded + topicData.sizeInBytes }) {noformat} If we fetch only a few topics with mirrormaker, the brokers run fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
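The overflow Min Zhou describes is easy to reproduce in isolation: accumulating per-topic sizes into a 32-bit int, as the foldLeft above does, silently wraps negative once the total passes Integer.MAX_VALUE (~2 GB), while a 64-bit accumulator does not. The per-topic size below is an assumed figure for illustration only:

```java
public class FetchSizeOverflow {
    public static void main(String[] args) {
        int perTopicBytes = 50 * 1024 * 1024; // assume 50 MB of data per topic

        // Accumulate in a 32-bit int, as the broker's foldLeft(0) does:
        int total = 0;
        for (int topic = 0; topic < 50; topic++) total += perTopicBytes;
        System.out.println(total);       // ~2.5 GB doesn't fit: prints a negative number

        // Accumulate in a 64-bit long instead:
        long safeTotal = 0;
        for (int topic = 0; topic < 50; topic++) safeTotal += perTopicBytes;
        System.out.println(safeTotal);   // prints the true total
    }
}
```

A broker serving hundreds of topics to one mirrormaker fetch request crosses the 2 GB line easily, which is why the response size check then fails with the KafkaException above.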
[jira] [Commented] (KAFKA-1749) Brokers continually throw exceptions when there are hundreds of topic being fetched by mirrormaker
[ https://issues.apache.org/jira/browse/KAFKA-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195417#comment-14195417 ] Ewen Cheslack-Postava commented on KAFKA-1749: -- This looks like it's probably a duplicate of KAFKA-1196 -- that matches the stack trace I described in that issue, which I get when sending a FetchResponse that exceeds 2GB. Brokers continually throw exceptions when there are hundreds of topic being fetched by mirrormaker -- Key: KAFKA-1749 URL: https://issues.apache.org/jira/browse/KAFKA-1749 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1, 0.8.2 Reporter: Min Zhou Here is one of the millions of exceptions. {noformat} kafka.common.KafkaException: This operation cannot be completed on a complete request. at kafka.network.Transmission$class.expectIncomplete(Transmission.scala:34) at kafka.api.FetchResponseSend.expectIncomplete(FetchResponse.scala:191) at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:214) at kafka.network.Processor.write(SocketServer.scala:375) at kafka.network.Processor.run(SocketServer.scala:247) at java.lang.Thread.run(Thread.java:662) {noformat} We used tools to hook the function kafka.api.FetchResponseSend.writeTo and found that fetchResponse.sizeInBytes had overflowed, which means the code below can produce a result beyond the limit of the integer type. {noformat} val sizeInBytes = FetchResponse.headerSize + dataGroupedByTopic.foldLeft(0)((folded, curr) => { val topicData = TopicData(curr._1, curr._2.map { case (topicAndPartition, partitionData) => (topicAndPartition.partition, partitionData) }) folded + topicData.sizeInBytes }) {noformat} If we fetch only a few topics with mirrormaker, the brokers run fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1476) Get a list of consumer groups
[ https://issues.apache.org/jira/browse/KAFKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195419#comment-14195419 ] BalajiSeshadri commented on KAFKA-1476: --- [~nehanarkhede] can you please review my patch, this one has list and describeGroup functionality. Get a list of consumer groups - Key: KAFKA-1476 URL: https://issues.apache.org/jira/browse/KAFKA-1476 Project: Kafka Issue Type: Wish Components: tools Affects Versions: 0.8.1.1 Reporter: Ryan Williams Assignee: BalajiSeshadri Labels: newbie Fix For: 0.9.0 Attachments: ConsumerCommand.scala, KAFKA-1476-LIST-GROUPS.patch, KAFKA-1476-RENAME.patch, KAFKA-1476-REVIEW-COMMENTS.patch, KAFKA-1476.patch It would be useful to have a way to get a list of consumer groups currently active via some tool/script that ships with kafka. This would be helpful so that the system tools can be explored more easily. For example, when running the ConsumerOffsetChecker, it requires a group option bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic test --group ? But, when just getting started with kafka, using the console producer and consumer, it is not clear what value to use for the group option. If a list of consumer groups could be listed, then it would be clear what value to use. Background: http://mail-archives.apache.org/mod_mbox/kafka-users/201405.mbox/%3cCAOq_b1w=slze5jrnakxvak0gu9ctdkpazak1g4dygvqzbsg...@mail.gmail.com%3e -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-960) Upgrade Metrics to 3.x
[ https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195493#comment-14195493 ] Jun Rao edited comment on KAFKA-960 at 11/4/14 1:01 AM: [~jjkoshy], Yes, we can move to the new metrics in the clients package on the broker in the future. We don't have a jira yet. We need to fix a similar issue with the mbean name in the new metrics (KAFKA-1723) first though. Also, we probably want to improve the histogram implementation in the new metrics. was (Author: junrao): [~jjkoshy], Yes, we can move to the new metrics in the clients package on the broker in the future. We need to fix a similar issue with the mbean name in the new metrics (KAFKA-1723) first though. Also, we probably want to improve the histogram implementation in the new metrics. Upgrade Metrics to 3.x -- Key: KAFKA-960 URL: https://issues.apache.org/jira/browse/KAFKA-960 Project: Kafka Issue Type: Improvement Affects Versions: 0.8.1 Reporter: Cosmin Lehene Now that metrics 3.0 has been released (http://metrics.codahale.com/about/release-notes/) we can upgrade back -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-960) Upgrade Metrics to 3.x
[ https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195493#comment-14195493 ] Jun Rao commented on KAFKA-960: --- [~jjkoshy], Yes, we can move to the new metrics in the clients package on the broker in the future. We need to fix a similar issue with mbean name in the new metrics (KAFKA-1723) first though. Also, we probably want to improve the histogram implementation in the new metrics. Upgrade Metrics to 3.x -- Key: KAFKA-960 URL: https://issues.apache.org/jira/browse/KAFKA-960 Project: Kafka Issue Type: Improvement Affects Versions: 0.8.1 Reporter: Cosmin Lehene Now that metrics 3.0 has been released (http://metrics.codahale.com/about/release-notes/) we can upgrade back -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 26755: Patch for KAFKA-1706
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26755/#review59679 --- Ship it! Minor points that I can fix on check-in. core/src/main/scala/kafka/utils/ByteBoundedBlockingQueue.scala https://reviews.apache.org/r/26755/#comment101009 Unused. i.e., the var can be on line 50 alone core/src/main/scala/kafka/utils/ByteBoundedBlockingQueue.scala https://reviews.apache.org/r/26755/#comment101010 Will add a comment that this is to avoid the putLock.wait(0, 0) scenario core/src/test/scala/unit/kafka/utils/ByteBoundedBlockingQueueTest.scala https://reviews.apache.org/r/26755/#comment100971 I think 32-35 can be removed - Joel Koshy On Oct. 29, 2014, 5:57 p.m., Jiangjie Qin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26755/ --- (Updated Oct. 29, 2014, 5:57 p.m.) Review request for kafka. Bugs: KAFKA-1706 https://issues.apache.org/jira/browse/KAFKA-1706 Repository: kafka Description --- changed argument names, corrected typo. Incorporated Joel's comments. Also fixed negative queue size problem. Incorporated Joel's comments. Added unit test for ByteBoundedBlockingQueue. Fixed a bug regarding waiting time. Diffs - core/src/main/scala/kafka/utils/ByteBoundedBlockingQueue.scala PRE-CREATION core/src/test/scala/unit/kafka/utils/ByteBoundedBlockingQueueTest.scala PRE-CREATION Diff: https://reviews.apache.org/r/26755/diff/ Testing --- Thanks, Jiangjie Qin
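For readers without the patch handy, the data structure under review can be sketched as below. This is a simplified illustration of a byte-bounded blocking queue, not Kafka's actual ByteBoundedBlockingQueue (which uses explicit locks rather than monitor methods):

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.ToLongFunction;

// Sketch: a queue bounded by the total byte size of queued elements rather
// than by element count. Producers block while the payload bound is exceeded;
// consumers wake them as bytes drain.
public class ByteBoundedQueue<E> {
    private final Queue<E> queue = new ArrayDeque<>();
    private final ToLongFunction<E> sizeOf;
    private final long maxBytes;
    private long currentBytes = 0;

    public ByteBoundedQueue(long maxBytes, ToLongFunction<E> sizeOf) {
        this.maxBytes = maxBytes;
        this.sizeOf = sizeOf;
    }

    public synchronized void put(E e) throws InterruptedException {
        long size = sizeOf.applyAsLong(e);
        // Always admit into an empty queue so one oversized element cannot deadlock.
        while (!queue.isEmpty() && currentBytes + size > maxBytes) wait();
        queue.add(e);
        currentBytes += size;
        notifyAll(); // wake consumers waiting in take()
    }

    public synchronized E take() throws InterruptedException {
        while (queue.isEmpty()) wait();
        E e = queue.remove();
        currentBytes -= sizeOf.applyAsLong(e);
        notifyAll(); // wake producers blocked on the byte bound
        return e;
    }

    public synchronized long sizeInBytes() { return currentBytes; }
}
```

The "putLock.wait(0, 0)" point in the review is about the real implementation's timed waits; this sketch sidesteps it by using untimed monitor waits.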
[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195548#comment-14195548 ] Jun Rao commented on KAFKA-1481: [~jjkoshy], do you want to take another look at the patch? Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, KAFKA-1481_2014-10-31_14-35-43.patch, KAFKA-1481_2014-11-03_16-39-41_doc.patch, KAFKA-1481_2014-11-03_17-02-23.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBeans names making it impossible to parse out individual bits from MBeans. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195545#comment-14195545 ] Jun Rao commented on KAFKA-1481: Thanks for the patch. Just some minor comments below. Otherwise, +1 from me. 60. AbstractFetcherManager: In the following, we don't need to wrap new Gauge in {}. MinFetchRate, { new Gauge[Double] { // current min fetch rate across all fetchers/topics/partitions def value = { val headRate: Double = fetcherThreadMap.headOption.map(_._2.fetcherStats.requestRate.oneMinuteRate).getOrElse(0) fetcherThreadMap.foldLeft(headRate)((curMinAll, fetcherThreadMapEntry) => { fetcherThreadMapEntry._2.fetcherStats.requestRate.oneMinuteRate.min(curMinAll) }) } } }, 61. AbstractFetcherThread: Could we align topic and partition to the same column as clientId? Ditto in a few other places. Map(clientId -> metricId.clientId, topic -> metricId.topic, partition -> metricId.partitionId.toString) ) 62. FetchRequestAndResponseMetrics: In the following, could we put Map in a separate line? val tags = metricId match { case ClientIdAndBroker(clientId, brokerHost, brokerPort) => Map(clientId -> clientId, brokerHost -> brokerHost, brokerPort -> brokerPort.toString) case ClientIdAllBrokers(clientId) => Map(clientId -> clientId, allBrokers -> true) } 62. TestUtils: It's a bit weird that sendMessagesToBrokerPartition() has both a config and configs. We actually don't need the config any more. When sending a message to a partition, we actually don't know which broker the partition is on. So, the message can just be of the form test + "-" + partition + "-" + x. We can also rename the method to sendMessagesToPartition(). 
Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, KAFKA-1481_2014-10-31_14-35-43.patch, KAFKA-1481_2014-11-03_16-39-41_doc.patch, KAFKA-1481_2014-11-03_17-02-23.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBeans names making it impossible to parse out individual bits from MBeans. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1745) Each new thread creates a PIPE and KQUEUE as open files during producer.send() and does not get cleared when the thread that creates them is cleared.
[ https://issues.apache.org/jira/browse/KAFKA-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195561#comment-14195561 ] Jun Rao commented on KAFKA-1745: Did you call producer.close() when the producer thread is removed? Each new thread creates a PIPE and KQUEUE as open files during producer.send() and does not get cleared when the thread that creates them is cleared. - Key: KAFKA-1745 URL: https://issues.apache.org/jira/browse/KAFKA-1745 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Environment: Mac OS Mavericks Reporter: Vishal Priority: Critical Hi, I'm using the java client API for Kafka. I wanted to send data to Kafka by using a producer pool as I'm using a sync producer. The thread that sends the data is from a thread pool that grows and shrinks depending on the usage. So, when I try to send data from one thread, 1 KQUEUE and 2 PIPEs are created (got this info by using lsof). If I keep using the same thread it's fine, but when a new thread sends data to Kafka (using producer.send() ) a new KQUEUE and 2 PIPEs are created. This is okay, but when the thread is cleared from the thread pool and a new thread is created, then new KQUEUEs and PIPEs are created. The problem is that the old ones which were created are not getting destroyed and they are showing up as open files. This is causing a major problem as the number of open files keeps increasing and does not decrease. Please suggest any solutions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
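The pattern Jun is pointing at can be sketched generically as below. A producer holds native resources (the KQUEUE and PIPE descriptors lsof shows come from its underlying selector), so the pool must close it when a thread is retired. `Producer` here is a stand-in interface for illustration, not Kafka's actual class:

```java
// Stand-in for a producer that owns OS-level descriptors (hypothetical interface).
interface Producer extends AutoCloseable {
    void send(String topic, byte[] payload);
    @Override void close(); // releases the selector and its kqueue/pipe descriptors
}

// Wrapper registered with the thread pool: the pool's thread-removal hook must
// call close(), otherwise each retired thread leaks its descriptors.
class PooledSender implements AutoCloseable {
    private final Producer producer;

    PooledSender(Producer producer) { this.producer = producer; }

    void send(String topic, byte[] payload) { producer.send(topic, payload); }

    @Override public void close() { producer.close(); }
}
```

The leak in the report matches exactly the case where close() is never invoked: the thread dies, but the producer object's descriptors remain open until the JVM exits.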
[jira] [Updated] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Koshy updated KAFKA-1481: -- Reviewer: Joel Koshy (was: Jun Rao) Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, KAFKA-1481_2014-10-31_14-35-43.patch, KAFKA-1481_2014-11-03_16-39-41_doc.patch, KAFKA-1481_2014-11-03_17-02-23.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBeans names making it impossible to parse out individual bits from MBeans. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195573#comment-14195573 ] Joel Koshy commented on KAFKA-1481: --- Yes thanks - I'll take a look at this tomorrow. Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, KAFKA-1481_2014-10-31_14-35-43.patch, KAFKA-1481_2014-11-03_16-39-41_doc.patch, KAFKA-1481_2014-11-03_17-02-23.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBeans names making it impossible to parse out individual bits from MBeans. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1738) Partitions for topic not created after restart from forced shutdown
[ https://issues.apache.org/jira/browse/KAFKA-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195577#comment-14195577 ] Jun Rao commented on KAFKA-1738: I can't reproduce this issue by following the steps in the description. Does this happen every time? Partitions for topic not created after restart from forced shutdown --- Key: KAFKA-1738 URL: https://issues.apache.org/jira/browse/KAFKA-1738 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1, 0.8.2 Environment: Linux, 2GB RAM, 2 Core CPU Reporter: Pradeep We are using the Kafka topic APIs to create the topic. In some cases the topic gets created but we don't see the partition-specific files, and when a producer/consumer tries to get the topic metadata it fails with an exception. The same happens if one tries to create the topic using the command line. k.p.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for topic tloader1 - No partition metadata for topic tloader1 due to kafka.common.UnknownTopicOrPartitionException}] for topic [tloader1]: class kafka.common.UnknownTopicOrPartitionException Steps to reproduce:
1. Stop Kafka using kill -9 on the Kafka PID.
2. Start Kafka.
3. Create a topic with a partition count and replication factor of 1.
4. Check the response "Created topic topic_name".
5. Run the list command to verify it's created.
6. Now check the data directory of Kafka. There would not be any files for the newly created topic.
We see issues when we are creating new topics. This happens randomly and we don't know the exact reason. We see the logs below in the controller at the time of creation of the topics that don't have the partition files:
[2014-11-03 13:12:50,625] INFO [Controller 0]: New topic creation callback for [JobJTopic,0] (kafka.controller.KafkaController)
[2014-11-03 13:12:50,626] INFO [Controller 0]: New partition creation callback for [JobJTopic,0] (kafka.controller.KafkaController)
[2014-11-03 13:12:50,626] INFO [Partition state machine on Controller 0]: Invoking state change to NewPartition for partitions [JobJTopic,0] (kafka.controller.PartitionStateMachine)
[2014-11-03 13:12:50,653] INFO [Replica state machine on controller 0]: Invoking state change to NewReplica for replicas [Topic=JobJTopic,Partition=0,Replica=0] (kafka.controller.ReplicaStateMachine)
[2014-11-03 13:12:50,654] INFO [Partition state machine on Controller 0]: Invoking state change to OnlinePartition for partitions [JobJTopic,0] (kafka.controller.PartitionStateMachine)
[2014-11-03 13:12:50,654] DEBUG [Partition state machine on Controller 0]: Live assigned replicas for partition [JobJTopic,0] are: [List(0)] (kafka.controller.PartitionStateMachine)
[2014-11-03 13:12:50,654] DEBUG [Partition state machine on Controller 0]: Initializing leader and isr for partition [JobJTopic,0] to (Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:2) (kafka.controller.PartitionStateMachine)
[2014-11-03 13:12:50,667] INFO [Replica state machine on controller 0]: Invoking state change to OnlineReplica for replicas [Topic=JobJTopic,Partition=0,Replica=0] (kafka.controller.ReplicaStateMachine)
[2014-11-03 13:12:50,794] WARN [Controller-0-to-broker-0-send-thread], Controller 0 fails to send a request to broker id:0,host:DMIPVM,port:9092 (kafka.controller.RequestSendThread)
java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.
	at kafka.utils.Utils$.read(Utils.scala:381)
	at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
	at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
	at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
	at kafka.network.BlockingChannel.receive(BlockingChannel.scala:108)
	at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:146)
	at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
[2014-11-03 13:12:50,965] ERROR [Controller-0-to-broker-0-send-thread], Controller 0 epoch 2 failed to send request Name:UpdateMetadataRequest;Version:0;Controller:0;ControllerEpoch:2;CorrelationId:43;ClientId:id_0-host_null-port_9092;AliveBrokers:id:0,host:DMIPVM,port:9092;PartitionState:[JobJTopic,0] - (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:2),ReplicationFactor:1),AllReplicas:0) to broker id:0,host:DMIPVM,port:9092. Reconnecting to broker. (kafka.controller.RequestSendThread)
java.nio.channels.ClosedChannelException
	at kafka.network.BlockingChannel.send(BlockingChannel.scala:97)
	at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
[jira] [Updated] (KAFKA-1729) add doc for Kafka-based offset management in 0.8.2
[ https://issues.apache.org/jira/browse/KAFKA-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Koshy updated KAFKA-1729: -- Status: Patch Available (was: Open) add doc for Kafka-based offset management in 0.8.2 -- Key: KAFKA-1729 URL: https://issues.apache.org/jira/browse/KAFKA-1729 Project: Kafka Issue Type: Sub-task Reporter: Jun Rao Assignee: Joel Koshy Attachments: KAFKA-1782-doc-v1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1729) add doc for Kafka-based offset management in 0.8.2
[ https://issues.apache.org/jira/browse/KAFKA-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Koshy updated KAFKA-1729: -- Attachment: KAFKA-1782-doc-v1.patch Uploading v1 for review -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1070) Auto-assign node id
[ https://issues.apache.org/jira/browse/KAFKA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195687#comment-14195687 ] Otis Gospodnetic commented on KAFKA-1070: - bq. this patch pending the 0.8.2 release Note this issue doesn't have Fix Version set to 0.8.2. Maybe that's what you want? +1 from me -- it looks like a number of people would like to see this issue resolved! Auto-assign node id --- Key: KAFKA-1070 URL: https://issues.apache.org/jira/browse/KAFKA-1070 Project: Kafka Issue Type: Bug Reporter: Jay Kreps Assignee: Sriharsha Chintalapani Labels: usability Attachments: KAFKA-1070.patch, KAFKA-1070_2014-07-19_16:06:13.patch, KAFKA-1070_2014-07-22_11:34:18.patch, KAFKA-1070_2014-07-24_20:58:17.patch, KAFKA-1070_2014-07-24_21:05:33.patch, KAFKA-1070_2014-08-21_10:26:20.patch It would be nice to have Kafka brokers auto-assign node ids rather than having that be a configuration. Having a configuration is irritating because (1) you have to generate a custom config for each broker and (2) even though it is in configuration, changing the node id can cause all kinds of bad things to happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1070) Auto-assign node id
[ https://issues.apache.org/jira/browse/KAFKA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195700#comment-14195700 ] Sriharsha Chintalapani commented on KAFKA-1070: --- [~otis] Sorry, I meant pending review after the 0.8.2 release; it is not planned for the 0.8.2 release. I'll update the patch with the ip-as-brokerid option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KAFKA-1733) Producer.send will block indeterminately when broker is unavailable.
[ https://issues.apache.org/jira/browse/KAFKA-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Rao resolved KAFKA-1733. Resolution: Fixed Fix Version/s: 0.8.3 Assignee: Marc Chung Thanks for the patch. +1 and committed to trunk. Producer.send will block indeterminately when broker is unavailable. Key: KAFKA-1733 URL: https://issues.apache.org/jira/browse/KAFKA-1733 Project: Kafka Issue Type: Bug Components: core, producer Reporter: Marc Chung Assignee: Marc Chung Fix For: 0.8.3 Attachments: kafka-1733-add-connectTimeoutMs.patch This is a follow up to the conversation here: https://mail-archives.apache.org/mod_mbox/kafka-dev/201409.mbox/%3ccaog_4qymoejhkbo0n31+a-ujx0z5unsisd5wbrmn-xtx7gi...@mail.gmail.com%3E During ClientUtils.fetchTopicMetadata, if the broker is unavailable, socket.connect will block indeterminately. Any retry policy (message.send.max.retries) further increases the time spent waiting for the socket to connect. The root fix is to add a connection timeout value to the BlockingChannel's socket configuration, like so: {noformat} -channel.socket.connect(new InetSocketAddress(host, port)) +channel.socket.connect(new InetSocketAddress(host, port), connectTimeoutMs) {noformat} The simplest thing to do here would be to have a constant, default value that would be applied to every socket configuration. Is that acceptable? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
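The one-line fix quoted above swaps the no-timeout connect for the two-argument overload. A minimal plain-Java sketch of the difference (the actual patch is in Kafka's Scala BlockingChannel; the host, port, and timeout values here are illustrative, not from the patch):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Socket.connect(addr) with no timeout can block for the OS-level connect
// timeout (minutes, or indefinitely under retries), while the two-argument
// overload bounds the wait to connectTimeoutMs.
public class ConnectTimeoutDemo {
    // Returns true if the connection succeeded within connectTimeoutMs.
    static boolean tryConnect(String host, int port, int connectTimeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            return true;
        } catch (IOException e) { // SocketTimeoutException is an IOException
            return false;
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // 10.255.255.1 is typically unroutable, so the attempt either fails
        // immediately or times out after at most ~500 ms.
        boolean ok = tryConnect("10.255.255.1", 9092, 500);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(ok + " after " + elapsed + "ms");
    }
}
```

With a bounded connect, message.send.max.retries multiplies a known per-attempt cost instead of an unbounded one.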
[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps
[ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195762#comment-14195762 ] Honghai Chen commented on KAFKA-391: This situation happens in the following scenario: one broker is the leader for several partitions, for example 3. When a single send carries a message set with messages for all 3 partitions of that broker, response.status.size is 3 while producerRequest.data.size is 1, and the check below throws this exception. Any idea for a fix? Should we compare response.status.size with messagesPerTopic.size instead of producerRequest.data.size?
{noformat}
private def send(brokerId: Int, messagesPerTopic: collection.mutable.Map[TopicAndPartition, ByteBufferMessageSet]) = {
  if(brokerId < 0) {
    warn("Failed to send data since partitions %s don't have a leader".format(messagesPerTopic.map(_._1).mkString(",")))
    messagesPerTopic.keys.toSeq
  } else if(messagesPerTopic.size > 0) {
    val currentCorrelationId = correlationId.getAndIncrement
    val producerRequest = new ProducerRequest(currentCorrelationId, config.clientId, config.requestRequiredAcks,
      config.requestTimeoutMs, messagesPerTopic)
    var failedTopicPartitions = Seq.empty[TopicAndPartition]
    try {
      val syncProducer = producerPool.getProducer(brokerId)
      debug("Producer sending messages with correlation id %d for topics %s to broker %d on %s:%d"
        .format(currentCorrelationId, messagesPerTopic.keySet.mkString(","), brokerId, syncProducer.config.host, syncProducer.config.port))
      val response = syncProducer.send(producerRequest)
      debug("Producer sent messages with correlation id %d for topics %s to broker %d on %s:%d"
        .format(currentCorrelationId, messagesPerTopic.keySet.mkString(","), brokerId, syncProducer.config.host, syncProducer.config.port))
      if(response != null) {
        if (response.status.size != producerRequest.data.size)
          throw new KafkaException("Incomplete response (%s) for producer request (%s)".format(response, producerRequest))
{noformat}
Producer request and response classes should use maps - Key: KAFKA-391 URL: https://issues.apache.org/jira/browse/KAFKA-391 Project: Kafka Issue Type: Bug Reporter: Joel Koshy Assignee: Joel Koshy Priority: Blocker Labels: optimization Fix For: 0.8.0 Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch, KAFKA-391-v3.patch, KAFKA-391-v4.patch Producer response contains two arrays of error codes and offsets - the ordering in these arrays corresponds to the flattened ordering of the request arrays. It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks). We can probably do the same in the fetch request/response. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
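As a rough illustration of the map-based approach the ticket proposes: keyed by partition, "is the response complete?" becomes a key-set comparison instead of a comparison of flattened sizes, which miscounts when one request carries messages for several partitions of the same broker (the classes below are simplified stand-ins, not Kafka's real request/response types):

```java
import java.util.Map;
import java.util.Set;

// Simplified sketch: request partitions and per-partition response status are
// both keyed by a partition identifier, so completeness is a set comparison.
public class ResponseCheck {
    static boolean isComplete(Set<String> requestedPartitions,
                              Map<String, Short> statusByPartition) {
        // Complete iff the response reports a status for exactly the
        // partitions the request covered, regardless of how many requests
        // or topics those partitions were flattened into.
        return statusByPartition.keySet().equals(requestedPartitions);
    }

    public static void main(String[] args) {
        // One request, three partitions of the same broker.
        Set<String> requested = Set.of("t-0", "t-1", "t-2");
        Map<String, Short> status =
            Map.of("t-0", (short) 0, "t-1", (short) 0, "t-2", (short) 0);
        System.out.println(isComplete(requested, status));          // complete
        System.out.println(isComplete(requested, Map.of("t-0", (short) 0))); // missing partitions
    }
}
```

A size comparison would have required knowing how the request was flattened; the key-set comparison does not.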
Jenkins build is back to normal : Kafka-trunk #322
See https://builds.apache.org/job/Kafka-trunk/322/changes