[jira] [Created] (KAFKA-1745) Each new thread creates a PIPE and KQUEUE as open files during producer.send() and does not get cleared when the thread that creates them is cleared.

2014-11-03 Thread Vishal (JIRA)
Vishal created KAFKA-1745:
-

 Summary: Each new thread creates a PIPE and KQUEUE as open files 
during producer.send() and does not get cleared when the thread that creates 
them is cleared.
 Key: KAFKA-1745
 URL: https://issues.apache.org/jira/browse/KAFKA-1745
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
 Environment: Mac OS Mavericks
Reporter: Vishal
Priority: Critical


Hi,
I'm using the Java client API for Kafka. I want to send data to Kafka through 
a producer pool, since I'm using a sync producer. The threads that send the 
data come from a thread pool that grows and shrinks depending on usage. When I 
send data from one thread, 1 KQUEUE and 2 PIPEs are created (observed via 
lsof). If I keep using the same thread it's fine, but when a new thread sends 
data to Kafka (using producer.send()) a new KQUEUE and 2 PIPEs are created.

This would be okay, but when a thread is removed from the thread pool and a 
new one is created, yet more KQUEUEs and PIPEs are created. The problem is 
that the old ones are never destroyed and keep showing up as open files. This 
is a major problem because the number of open files keeps increasing and never 
decreases.

Please suggest any solutions.
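One common workaround, sketched below under assumptions rather than as the project's recommended fix: if each producer instance owns the kqueue/pipe descriptors (as the lsof output suggests), they are only released when producer.close() is called, so tying producer lifetime to a fixed pool rather than to worker threads bounds the number of open files. The Python sketch uses a stand-in FakeProducer class; the pooling pattern, not the API, is the point.

```python
import queue

class FakeProducer:
    """Stand-in for a sync producer; close() would release the kqueue/pipe FDs."""
    def __init__(self):
        self.closed = False
    def send(self, topic, message):
        assert not self.closed, "send() after close()"
    def close(self):
        self.closed = True

class ProducerPool:
    """Fixed-size pool: FD usage is bounded by pool size, not thread churn."""
    def __init__(self, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(FakeProducer())
    def send(self, topic, message):
        producer = self._pool.get()      # borrow, instead of one producer per thread
        try:
            producer.send(topic, message)
        finally:
            self._pool.put(producer)     # return it so its descriptors are reused
    def shutdown(self):
        while not self._pool.empty():
            self._pool.get().close()     # release every descriptor exactly once
```

Threads borrow a producer per send, so the number of open kqueues/pipes stays bounded by the pool size and is released once at shutdown, instead of leaking one set per retired thread.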



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1745) Each new thread creates a PIPE and KQUEUE as open files during producer.send() and does not get cleared when the thread that creates them is cleared.

2014-11-03 Thread Vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal updated KAFKA-1745:
--
Summary: Each new thread creates a PIPE and KQUEUE as open files during 
producer.send() and does not get cleared when the thread that creates them is 
cleared.  (was: Each new thread creates a PIPE and KQUEUE as open files during 
producer.send() and does no0t get cleared when the thread that creates them is 
cleared.)

 Each new thread creates a PIPE and KQUEUE as open files during 
 producer.send() and does not get cleared when the thread that creates them is 
 cleared.
 -

 Key: KAFKA-1745
 URL: https://issues.apache.org/jira/browse/KAFKA-1745
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
 Environment: Mac OS Mavericks
Reporter: Vishal
Priority: Critical

 Hi,
 I'm using the Java client API for Kafka. I want to send data to Kafka 
 through a producer pool, since I'm using a sync producer. The threads that 
 send the data come from a thread pool that grows and shrinks depending on 
 usage. When I send data from one thread, 1 KQUEUE and 2 PIPEs are created 
 (observed via lsof). If I keep using the same thread it's fine, but when a 
 new thread sends data to Kafka (using producer.send()) a new KQUEUE and 2 
 PIPEs are created.
 This would be okay, but when a thread is removed from the thread pool and a 
 new one is created, yet more KQUEUEs and PIPEs are created. The problem is 
 that the old ones are never destroyed and keep showing up as open files. 
 This is a major problem because the number of open files keeps increasing 
 and never decreases.
 Please suggest any solutions.





[jira] [Updated] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-03 Thread Vladimir Tretyakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Tretyakov updated KAFKA-1481:
--
Attachment: KAFKA-1481_2014-11-03_16-39-41_doc.patch

 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, 
 KAFKA-1481_2014-10-31_14-35-43.patch, 
 KAFKA-1481_2014-11-03_16-39-41_doc.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBean names, making it impossible to parse the 
 individual components back out of an MBean name.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW
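The parsing problem can be seen with a small, hypothetical example (the names below are made up; the real MBean layout may differ): once the separator character may also appear inside the embedded IDs, splitting is ambiguous, whereas a separator that is disallowed inside IDs makes parsing trivial.

```python
# Hypothetical metric name: group = "my-group", topic = "my_topic", partition = 0.
name = "my-group-my_topic-0"

# Splitting on '-' cannot recover the original fields: the dash inside the
# group ID is indistinguishable from the field separator.
parts = name.split("-")          # ['my', 'group', 'my_topic', '0']

# With a separator disallowed inside the IDs (e.g. '|'), parsing is unambiguous:
piped = "my-group|my_topic|0"
group, topic, partition = piped.split("|")
```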





[jira] [Updated] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-03 Thread Vladimir Tretyakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Tretyakov updated KAFKA-1481:
--
Attachment: KAFKA-1481_2014-11-03_17-02-23.patch

 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, 
 KAFKA-1481_2014-10-31_14-35-43.patch, 
 KAFKA-1481_2014-11-03_16-39-41_doc.patch, 
 KAFKA-1481_2014-11-03_17-02-23.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBean names, making it impossible to parse the 
 individual components back out of an MBean name.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW





[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-03 Thread Vladimir Tretyakov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194551#comment-14194551
 ] 

Vladimir Tretyakov commented on KAFKA-1481:
---

Hi, I added a new patch.

50. I think it is better to have a separate case class for the 'all' things; it 
really is an 'other' case.
51. Done
52.1 Done
52.2 Done
52.3 Done
52.4 Done
52.5 Done
53. 
https://issues.apache.org/jira/secure/attachment/12678927/KAFKA-1481_2014-11-03_16-39-41_doc.patch

 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, 
 KAFKA-1481_2014-10-31_14-35-43.patch, 
 KAFKA-1481_2014-11-03_16-39-41_doc.patch, 
 KAFKA-1481_2014-11-03_17-02-23.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBean names, making it impossible to parse the 
 individual components back out of an MBean name.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW





[jira] [Commented] (KAFKA-1686) Implement SASL/Kerberos

2014-11-03 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194618#comment-14194618
 ] 

Sriharsha Chintalapani commented on KAFKA-1686:
---

Hi [~gwenshap], sorry for the late reply. I haven't started on this JIRA, and I 
probably won't be able to work on it for at least another week.
 It looks like the first step must be to authenticate the Kafka broker itself 
 with Kerberos.
Yes, this can be a separate piece and made into its own JIRA. I'll look into 
KAFKA-1684 and update the JIRA soon with implementation details.

 Implement SASL/Kerberos
 ---

 Key: KAFKA-1686
 URL: https://issues.apache.org/jira/browse/KAFKA-1686
 Project: Kafka
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.9.0
Reporter: Jay Kreps
Assignee: Sriharsha Chintalapani
 Fix For: 0.9.0


 Implement SASL/Kerberos authentication.
 To do this we will need to introduce a new SASLRequest and SASLResponse pair 
 to the client protocol. This request and response will each have only a 
 single byte[] field and will be used to handle the SASL challenge/response 
 cycle. Doing this will initialize the SaslServer instance and associate it 
 with the session in a manner similar to KAFKA-1684.
 When using integrity or encryption mechanisms with SASL we will need to wrap 
 and unwrap bytes as in KAFKA-1684 so the same interface that covers the 
 SSLEngine will need to also cover the SaslServer instance.
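A toy sketch of the challenge/response cycle described above, with HMAC standing in for the real SASL mechanism (GSSAPI/Kerberos) and plain function calls standing in for the proposed SASLRequest/SASLResponse messages. Only the shape of the loop mirrors the description: opaque byte payloads are exchanged until the server-side mechanism reports completion.

```python
import hmac, hashlib, os

class ToyServerMechanism:
    """Stand-in for a SaslServer-style object; NOT a real SASL mechanism."""
    def __init__(self, secret):
        self.secret = secret
        self.challenge = os.urandom(16)
        self.complete = False
    def evaluate_response(self, response):
        if response == b"":                  # initial empty token: issue a challenge
            return self.challenge
        expected = hmac.new(self.secret, self.challenge, hashlib.sha256).digest()
        if hmac.compare_digest(response, expected):
            self.complete = True
            return b""                       # empty final challenge: handshake done
        raise ValueError("authentication failed")

def client_respond(secret, challenge):
    """Client side: answer the server's challenge with an HMAC over it."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()

def handshake(secret):
    """Drive the request/response loop until the server mechanism completes."""
    server = ToyServerMechanism(secret)
    challenge = server.evaluate_response(b"")
    while not server.complete:
        challenge = server.evaluate_response(client_respond(secret, challenge))
    return True
```

Once the loop completes, a real implementation would keep the mechanism object associated with the session, and (for integrity/confidentiality modes) route subsequent traffic through its wrap/unwrap calls, as the description notes.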





[jira] [Commented] (KAFKA-1686) Implement SASL/Kerberos

2014-11-03 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194818#comment-14194818
 ] 

Gwen Shapira commented on KAFKA-1686:
-

An existing long-lived connection doesn't require renewing, since the ticket is 
only validated on the initial handshake.
(Yes, it does make it difficult to invalidate clients, but this is pretty 
normal for most kerberized services)
If the connection drops or the client needs another connection (perhaps when 
rebalancing?), the client needs to renew the ticket and present a new one.


 Implement SASL/Kerberos
 ---

 Key: KAFKA-1686
 URL: https://issues.apache.org/jira/browse/KAFKA-1686
 Project: Kafka
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.9.0
Reporter: Jay Kreps
Assignee: Sriharsha Chintalapani
 Fix For: 0.9.0


 Implement SASL/Kerberos authentication.
 To do this we will need to introduce a new SASLRequest and SASLResponse pair 
 to the client protocol. This request and response will each have only a 
 single byte[] field and will be used to handle the SASL challenge/response 
 cycle. Doing this will initialize the SaslServer instance and associate it 
 with the session in a manner similar to KAFKA-1684.
 When using integrity or encryption mechanisms with SASL we will need to wrap 
 and unwrap bytes as in KAFKA-1684 so the same interface that covers the 
 SSLEngine will need to also cover the SaslServer instance.





[jira] [Updated] (KAFKA-1738) Partitions for topic not created after restart from forced shutdown

2014-11-03 Thread Pradeep (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep updated KAFKA-1738:
---
Description: 
We are using the Kafka topic APIs to create topics. In some cases the topic 
gets created, but we don't see the partition-specific files, and the 
producer/consumer fails with an exception when it tries to fetch the topic 
metadata. The same happens if one tries to create the topic using the command 
line.

k.p.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for 
topic tloader1 - No partition metadata for topic tloader1 due to 
kafka.common.UnknownTopicOrPartitionException}] for topic [tloader1]: class 
kafka.common.UnknownTopicOrPartitionException

Steps to reproduce:

1.  Stop Kafka using kill -9 <PID of Kafka>.
2.  Start Kafka.
3.  Create a topic with a partition count and replication factor of 1.
4.  Check the response "Created topic topic_name".
5.  Run the list command to verify it was created.
6.  Now check the Kafka data directory. There are no files for the newly 
created topic.


We see issues when creating new topics. This happens randomly, and we don't 
know the exact reason. For the topics that end up without partition files, we 
see the logs below in the controller at creation time.

2014-11-03 13:12:50,625] INFO [Controller 0]: New topic creation callback for 
[JobJTopic,0] (kafka.controller.KafkaController)
[2014-11-03 13:12:50,626] INFO [Controller 0]: New partition creation callback 
for [JobJTopic,0] (kafka.controller.KafkaController)
[2014-11-03 13:12:50,626] INFO [Partition state machine on Controller 0]: 
Invoking state change to NewPartition for partitions [JobJTopic,0] 
(kafka.controller.PartitionStateMachine)
[2014-11-03 13:12:50,653] INFO [Replica state machine on controller 0]: 
Invoking state change to NewReplica for replicas 
[Topic=JobJTopic,Partition=0,Replica=0] (kafka.controller.ReplicaStateMachine)
[2014-11-03 13:12:50,654] INFO [Partition state machine on Controller 0]: 
Invoking state change to OnlinePartition for partitions [JobJTopic,0] 
(kafka.controller.PartitionStateMachine)
[2014-11-03 13:12:50,654] DEBUG [Partition state machine on Controller 0]: Live 
assigned replicas for partition [JobJTopic,0] are: [List(0)] 
(kafka.controller.PartitionStateMachine)
[2014-11-03 13:12:50,654] DEBUG [Partition state machine on Controller 0]: 
Initializing leader and isr for partition [JobJTopic,0] to 
(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:2) 
(kafka.controller.PartitionStateMachine)
[2014-11-03 13:12:50,667] INFO [Replica state machine on controller 0]: 
Invoking state change to OnlineReplica for replicas 
[Topic=JobJTopic,Partition=0,Replica=0] (kafka.controller.ReplicaStateMachine)
[2014-11-03 13:12:50,794] WARN [Controller-0-to-broker-0-send-thread], 
Controller 0 fails to send a request to broker id:0,host:DMIPVM,port:9092 
(kafka.controller.RequestSendThread)
java.io.EOFException: Received -1 when reading from channel, socket has likely 
been closed.
at kafka.utils.Utils$.read(Utils.scala:381)
at 
kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
at 
kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
at kafka.network.BlockingChannel.receive(BlockingChannel.scala:108)
at 
kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:146)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
[2014-11-03 13:12:50,965] ERROR [Controller-0-to-broker-0-send-thread], 
Controller 0 epoch 2 failed to send request 
Name:UpdateMetadataRequest;Version:0;Controller:0;ControllerEpoch:2;CorrelationId:43;ClientId:id_0-host_null-port_9092;AliveBrokers:id:0,host:DMIPVM,port:9092;PartitionState:[JobJTopic,0]
 - 
(LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:2),ReplicationFactor:1),AllReplicas:0)
 to broker id:0,host:DMIPVM,port:9092. Reconnecting to broker. 
(kafka.controller.RequestSendThread)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:97)
at 
kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
at 
kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)




[jira] [Commented] (KAFKA-960) Upgrade Metrics to 3.x

2014-11-03 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194979#comment-14194979
 ] 

Joel Koshy commented on KAFKA-960:
--

Also, we should plan on removing our dependency on Metrics and use the metrics 
package we already have in the clients package. [~junrao] do you know if we 
have a jira for that? We can probably target that for 0.9 or shortly thereafter.

 Upgrade Metrics to 3.x
 --

 Key: KAFKA-960
 URL: https://issues.apache.org/jira/browse/KAFKA-960
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.8.1
Reporter: Cosmin Lehene

 Now that metrics 3.0 has been released 
 (http://metrics.codahale.com/about/release-notes/) we can upgrade back





[jira] [Created] (KAFKA-1746) System tests don't handle errors well

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-1746:


 Summary: System tests don't handle errors well
 Key: KAFKA-1746
 URL: https://issues.apache.org/jira/browse/KAFKA-1746
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava


The system test scripts don't handle errors well. A couple of key issues:

* Unexpected exceptions during tests are simply ignored, and the tests appear 
successful in the reports.
* The scripts' exit code is always 0, even if tests fail.
* Almost no subprocess calls are checked. In many cases this is fine, and 
sometimes it isn't possible (e.g. after starting a long-running remote 
process), but in cases such as calls to DumpLogSegments the tests can miss 
that the tool exited with an exception, and the test appears successful even 
though no data was verified.
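A minimal sketch of the kind of fix being described (assumed structure; the actual patch may differ): check the exit codes of short-lived helper processes, count test failures instead of swallowing exceptions, and derive the process exit code from that count.

```python
import subprocess, sys

def run_checked(cmd):
    """Run a short-lived helper (e.g. a DumpLogSegments-style tool) and fail
    loudly if it exits non-zero, instead of silently ignoring the result."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True)

def run_suite(tests):
    """Run (name, callable) pairs; count failures rather than ignoring them."""
    failures = 0
    for name, fn in tests:
        try:
            fn()
        except Exception as exc:
            failures += 1
            print(f"FAIL {name}: {exc}")
    return failures

# A driver would finish with: sys.exit(min(run_suite(tests), 1))
# so that callers and CI see a non-zero exit code whenever any test failed.
```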






[jira] [Created] (KAFKA-1747) TestcaseEnv improperly shares state between instances

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-1747:


 Summary: TestcaseEnv improperly shares state between instances
 Key: KAFKA-1747
 URL: https://issues.apache.org/jira/browse/KAFKA-1747
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava


TestcaseEnv in system tests uses class variables instead of instance variables 
for a bunch of state. This causes the data to persist between tests. In some 
cases this can cause tests to break (e.g. there will be state from a service 
running in a previous test that doesn't exist in the current test; trying to 
look up state about that service raises an exception or produces invalid data).
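The underlying Python pitfall is easy to demonstrate in isolation (the field names below are hypothetical, not the actual TestcaseEnv attributes): a mutable object declared at class level is shared by every instance, while one created in __init__ is per-instance.

```python
class SharedEnv:
    entity_config = {}             # class attribute: ONE dict shared by all instances
    def register(self, key, value):
        self.entity_config[key] = value   # mutates the shared dict

class IsolatedEnv:
    def __init__(self):
        self.entity_config = {}    # instance attribute: fresh state per test case
    def register(self, key, value):
        self.entity_config[key] = value
```

With SharedEnv, state registered during one test is still visible in the next test's "fresh" environment, which is exactly the leak described above; IsolatedEnv starts each test empty.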





[jira] [Updated] (KAFKA-1746) System tests don't handle errors well

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-1746:
-
Assignee: Ewen Cheslack-Postava
  Status: Patch Available  (was: Open)

 System tests don't handle errors well
 -

 Key: KAFKA-1746
 URL: https://issues.apache.org/jira/browse/KAFKA-1746
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1746.patch


 The system test scripts don't handle errors well. A couple of key issues:
 * Unexpected exceptions during tests are simply ignored, and the tests appear 
 successful in the reports.
 * The scripts' exit code is always 0, even if tests fail.
 * Almost no subprocess calls are checked. In many cases this is fine, and 
 sometimes it isn't possible (e.g. after starting a long-running remote 
 process), but in cases such as calls to DumpLogSegments the tests can miss 
 that the tool exited with an exception, and the test appears successful even 
 though no data was verified.





[jira] [Updated] (KAFKA-1746) System tests don't handle errors well

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-1746:
-
Attachment: KAFKA-1746.patch

 System tests don't handle errors well
 -

 Key: KAFKA-1746
 URL: https://issues.apache.org/jira/browse/KAFKA-1746
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
 Attachments: KAFKA-1746.patch


 The system test scripts don't handle errors well. A couple of key issues:
 * Unexpected exceptions during tests are simply ignored, and the tests appear 
 successful in the reports.
 * The scripts' exit code is always 0, even if tests fail.
 * Almost no subprocess calls are checked. In many cases this is fine, and 
 sometimes it isn't possible (e.g. after starting a long-running remote 
 process), but in cases such as calls to DumpLogSegments the tests can miss 
 that the tool exited with an exception, and the test appears successful even 
 though no data was verified.





Review Request 27534: Patch for KAFKA-1746

2014-11-03 Thread Ewen Cheslack-Postava

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27534/
---

Review request for kafka.


Bugs: KAFKA-1746
https://issues.apache.org/jira/browse/KAFKA-1746


Repository: kafka


Description
---

KAFKA-1746 Make system tests return a useful exit code.


KAFKA-1746 Check the exit code when running DumpLogSegments to verify data.


Diffs
-

  system_test/mirror_maker_testsuite/mirror_maker_test.py 
c0117c64cbb7687ca8fbcec6b5c188eb880300ef 
  system_test/offset_management_testsuite/offset_management_test.py 
12b5cd25140e1eb407dd57eef63d9783257688b2 
  system_test/replication_testsuite/replica_basic_test.py 
660006cc253bbae3e7cd9f02601f1c1937dd1714 
  system_test/system_test_runner.py ee7aa252333553e8eb0bc046edf968ec99dddb70 
  system_test/utils/kafka_system_test_utils.py 
1093b660ebd0cb5ab6d3731d26f151e1bf717f8a 

Diff: https://reviews.apache.org/r/27534/diff/


Testing
---


Thanks,

Ewen Cheslack-Postava



[jira] [Commented] (KAFKA-1746) System tests don't handle errors well

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194983#comment-14194983
 ] 

Ewen Cheslack-Postava commented on KAFKA-1746:
--

Created reviewboard https://reviews.apache.org/r/27534/diff/
 against branch origin/trunk

 System tests don't handle errors well
 -

 Key: KAFKA-1746
 URL: https://issues.apache.org/jira/browse/KAFKA-1746
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
 Attachments: KAFKA-1746.patch


 The system test scripts don't handle errors well. A couple of key issues:
 * Unexpected exceptions during tests are simply ignored, and the tests appear 
 successful in the reports.
 * The scripts' exit code is always 0, even if tests fail.
 * Almost no subprocess calls are checked. In many cases this is fine, and 
 sometimes it isn't possible (e.g. after starting a long-running remote 
 process), but in cases such as calls to DumpLogSegments the tests can miss 
 that the tool exited with an exception, and the test appears successful even 
 though no data was verified.





Review Request 27535: Patch for KAFKA-1747

2014-11-03 Thread Ewen Cheslack-Postava

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27535/
---

Review request for kafka.


Bugs: KAFKA-1747
https://issues.apache.org/jira/browse/KAFKA-1747


Repository: kafka


Description
---

KAFKA-1747 Fix TestcaseEnv so state isn't shared between instances.


Diffs
-

  system_test/utils/testcase_env.py b3c29105c04348f036efbbdc430e14e099ca8c70 

Diff: https://reviews.apache.org/r/27535/diff/


Testing
---


Thanks,

Ewen Cheslack-Postava



[jira] [Commented] (KAFKA-1747) TestcaseEnv improperly shares state between instances

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194984#comment-14194984
 ] 

Ewen Cheslack-Postava commented on KAFKA-1747:
--

Created reviewboard https://reviews.apache.org/r/27535/diff/
 against branch origin/trunk

 TestcaseEnv improperly shares state between instances
 -

 Key: KAFKA-1747
 URL: https://issues.apache.org/jira/browse/KAFKA-1747
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
 Attachments: KAFKA-1747.patch


 TestcaseEnv in system tests uses class variables instead of instance 
 variables for a bunch of state. This causes the data to persist between 
 tests. In some cases this can cause tests to break (e.g. there will be state 
 from a service running in a previous test that doesn't exist in the current 
 test; trying to look up state about that service raises an exception or 
 produces invalid data).





[jira] [Updated] (KAFKA-1747) TestcaseEnv improperly shares state between instances

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-1747:
-
Attachment: KAFKA-1747.patch

 TestcaseEnv improperly shares state between instances
 -

 Key: KAFKA-1747
 URL: https://issues.apache.org/jira/browse/KAFKA-1747
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
 Attachments: KAFKA-1747.patch


 TestcaseEnv in system tests uses class variables instead of instance 
 variables for a bunch of state. This causes the data to persist between 
 tests. In some cases this can cause tests to break (e.g. there will be state 
 from a service running in a previous test that doesn't exist in the current 
 test; trying to look up state about that service raises an exception or 
 produces invalid data).





[jira] [Updated] (KAFKA-1747) TestcaseEnv improperly shares state between instances

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-1747:
-
Assignee: Ewen Cheslack-Postava
  Status: Patch Available  (was: Open)

 TestcaseEnv improperly shares state between instances
 -

 Key: KAFKA-1747
 URL: https://issues.apache.org/jira/browse/KAFKA-1747
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1747.patch


 TestcaseEnv in system tests uses class variables instead of instance 
 variables for a bunch of state. This causes the data to persist between 
 tests. In some cases this can cause tests to break (e.g. there will be state 
 from a service running in a previous test that doesn't exist in the current 
 test; trying to look up state about that service raises an exception or 
 produces invalid data).





[jira] [Commented] (KAFKA-960) Upgrade Metrics to 3.x

2014-11-03 Thread Erik van Oosten (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194990#comment-14194990
 ] 

Erik van Oosten commented on KAFKA-960:
---

If 2.20 and 2.1.5 are indeed binary compatible (how do you test that?), _all 
existing_ releases could be patched by simply replacing a jar :)

 Upgrade Metrics to 3.x
 --

 Key: KAFKA-960
 URL: https://issues.apache.org/jira/browse/KAFKA-960
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.8.1
Reporter: Cosmin Lehene

 Now that metrics 3.0 has been released 
 (http://metrics.codahale.com/about/release-notes/) we can upgrade back





[jira] [Created] (KAFKA-1748) Decouple system test cluster resources definition from service definitions

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-1748:


 Summary: Decouple system test cluster resources definition from 
service definitions
 Key: KAFKA-1748
 URL: https://issues.apache.org/jira/browse/KAFKA-1748
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava


Currently the system tests use JSON files that specify the set of services for 
each test and where they should run (i.e. hostname). These assume that you 
already have SSH keys set up, that you use the same username on the host 
running the tests and on the test cluster, that no additional ssh/scp/rsync 
flags are needed, and that you'll always have a fixed set of compute resources 
(or that you'll spend a lot of time editing config files).

While we don't want a whole cluster resource manager in the system tests, a bit 
more flexibility would make it easier to, e.g., run tests against a local 
vagrant cluster or on dynamically allocated EC2 instances. We can separate out 
the basic resource spec (i.e. json specifying how to access machines) from the 
service definition (i.e. a broker should run with settings x, y, z). 
Restricting to a very simple set of mappings (i.e. map services to hosts with 
round robin, optionally restricting to no reuse of hosts) should keep things 
simple.
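The proposed mapping is simple enough to sketch (hypothetical helper, not code from any patch): assign service roles to hosts round-robin, optionally refusing to reuse a host.

```python
from itertools import cycle

def assign(roles, hosts, reuse=True):
    """Map each service role to a host: round-robin when hosts may be reused,
    one-to-one (failing early if hosts run out) when they may not."""
    if not reuse and len(roles) > len(hosts):
        raise ValueError("not enough hosts for a no-reuse assignment")
    pool = cycle(hosts) if reuse else iter(hosts)
    return {role: next(pool) for role in roles}
```

The resource spec would supply `hosts` (from a zero-config localhost default, a JSON file, or Vagrant's ssh-config output), while the service definitions supply `roles`.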






Review Request 27536: Patch for KAFKA-1748

2014-11-03 Thread Ewen Cheslack-Postava

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27536/
---

Review request for kafka.


Bugs: KAFKA-1748
https://issues.apache.org/jira/browse/KAFKA-1748


Repository: kafka


Description
---

KAFKA-1748 Make remote hosts used by system tests more flexible by allowing 
username, hostname, and SSH args to be specified.


KAFKA-1748 Make cluster resource declaration separate from test role 
specification and support loading in three ways: a zero-config default on 
localhost, a JSON file, and loading from Vagrant's ssh-config command.


Diffs
-

  system_test/cluster.json PRE-CREATION 
  system_test/metrics.json cd3fc142176b8a3638db5230fb74f055c04c59d4 
  system_test/mirror_maker_testsuite/mirror_maker_test.py 
c0117c64cbb7687ca8fbcec6b5c188eb880300ef 
  system_test/offset_management_testsuite/offset_management_test.py 
12b5cd25140e1eb407dd57eef63d9783257688b2 
  system_test/replication_testsuite/replica_basic_test.py 
660006cc253bbae3e7cd9f02601f1c1937dd1714 
  system_test/system_test_env.py c24d3e83922ef9a56591594b9d80af9bb6a30ce2 
  system_test/system_test_runner.py ee7aa252333553e8eb0bc046edf968ec99dddb70 
  system_test/utils/cluster.py PRE-CREATION 
  system_test/utils/kafka_system_test_utils.py 
1093b660ebd0cb5ab6d3731d26f151e1bf717f8a 
  system_test/utils/metrics.py 3e663483202a1dd856f2fdfd1cfaac88e98f591c 
  system_test/utils/setup_utils.py 0e8b7f97287ec7bc7a2d91573befe0992092734a 
  system_test/utils/system_test_utils.py 
e8529cd31f9202a50b9c9774037ddc744ff92efa 

Diff: https://reviews.apache.org/r/27536/diff/


Testing
---


Thanks,

Ewen Cheslack-Postava



[jira] [Updated] (KAFKA-1748) Decouple system test cluster resources definition from service definitions

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-1748:
-
Assignee: Ewen Cheslack-Postava
  Status: Patch Available  (was: Open)

 Decouple system test cluster resources definition from service definitions
 --

 Key: KAFKA-1748
 URL: https://issues.apache.org/jira/browse/KAFKA-1748
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1748.patch


 Currently the system tests use JSON files that specify the set of services 
 for each test and where they should run (i.e. hostname). These currently 
 assume that you already have SSH keys setup, use the same username on the 
 host running the tests and the test cluster, don't require any additional 
 ssh/scp/rsync flags, and assume you'll always have a fixed set of compute 
 resources (or that you'll spend a lot of time editing config files).
 While we don't want a whole cluster resource manager in the system tests, a 
 bit more flexibility would make it easier to, e.g., run tests against a local 
 vagrant cluster or on dynamically allocated EC2 instances. We can separate 
 out the basic resource spec (i.e. json specifying how to access machines) 
 from the service definition (i.e. a broker should run with settings x, y, z). 
 Restricting to a very simple set of mappings (i.e. map services to hosts with 
 round robin, optionally restricting to no reuse of hosts) should keep things 
 simple.





[jira] [Commented] (KAFKA-1748) Decouple system test cluster resources definition from service definitions

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195005#comment-14195005
 ] 

Ewen Cheslack-Postava commented on KAFKA-1748:
--

Created reviewboard https://reviews.apache.org/r/27536/diff/
 against branch origin/trunk

 Decouple system test cluster resources definition from service definitions
 --

 Key: KAFKA-1748
 URL: https://issues.apache.org/jira/browse/KAFKA-1748
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
 Attachments: KAFKA-1748.patch


 Currently the system tests use JSON files that specify the set of services 
 for each test and where they should run (i.e. hostname). These currently 
 assume that you already have SSH keys setup, use the same username on the 
 host running the tests and the test cluster, don't require any additional 
 ssh/scp/rsync flags, and assume you'll always have a fixed set of compute 
 resources (or that you'll spend a lot of time editing config files).
 While we don't want a whole cluster resource manager in the system tests, a 
 bit more flexibility would make it easier to, e.g., run tests against a local 
 vagrant cluster or on dynamically allocated EC2 instances. We can separate 
 out the basic resource spec (i.e. json specifying how to access machines) 
 from the service definition (i.e. a broker should run with settings x, y, z). 
 Restricting to a very simple set of mappings (i.e. map services to hosts with 
 round robin, optionally restricting to no reuse of hosts) should keep things 
 simple.





[jira] [Updated] (KAFKA-1748) Decouple system test cluster resources definition from service definitions

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-1748:
-
Attachment: KAFKA-1748_2014-11-03_12:04:18.patch

 Decouple system test cluster resources definition from service definitions
 --

 Key: KAFKA-1748
 URL: https://issues.apache.org/jira/browse/KAFKA-1748
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1748.patch, KAFKA-1748_2014-11-03_12:04:18.patch


 Currently the system tests use JSON files that specify the set of services 
 for each test and where they should run (i.e. hostname). These currently 
 assume that you already have SSH keys setup, use the same username on the 
 host running the tests and the test cluster, don't require any additional 
 ssh/scp/rsync flags, and assume you'll always have a fixed set of compute 
 resources (or that you'll spend a lot of time editing config files).
 While we don't want a whole cluster resource manager in the system tests, a 
 bit more flexibility would make it easier to, e.g., run tests against a local 
 vagrant cluster or on dynamically allocated EC2 instances. We can separate 
 out the basic resource spec (i.e. json specifying how to access machines) 
 from the service definition (i.e. a broker should run with settings x, y, z). 
 Restricting to a very simple set of mappings (i.e. map services to hosts with 
 round robin, optionally restricting to no reuse of hosts) should keep things 
 simple.





Re: Review Request 27536: Patch for KAFKA-1748

2014-11-03 Thread Ewen Cheslack-Postava

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27536/
---

(Updated Nov. 3, 2014, 8:04 p.m.)


Review request for kafka.


Bugs: KAFKA-1748
https://issues.apache.org/jira/browse/KAFKA-1748


Repository: kafka


Description (updated)
---

KAFKA-1748 Make cluster resource declaration separate from test role 
specification and support loading in three ways: a zero-config default on 
localhost, a JSON file, and loading from Vagrant's ssh-config command.


Diffs (updated)
-

  system_test/cluster.json PRE-CREATION 
  system_test/mirror_maker_testsuite/mirror_maker_test.py 
c0117c64cbb7687ca8fbcec6b5c188eb880300ef 
  system_test/offset_management_testsuite/offset_management_test.py 
12b5cd25140e1eb407dd57eef63d9783257688b2 
  system_test/replication_testsuite/replica_basic_test.py 
660006cc253bbae3e7cd9f02601f1c1937dd1714 
  system_test/system_test_env.py c24d3e83922ef9a56591594b9d80af9bb6a30ce2 
  system_test/system_test_runner.py ee7aa252333553e8eb0bc046edf968ec99dddb70 
  system_test/utils/cluster.py PRE-CREATION 
  system_test/utils/kafka_system_test_utils.py 
1093b660ebd0cb5ab6d3731d26f151e1bf717f8a 
  system_test/utils/metrics.py 3e663483202a1dd856f2fdfd1cfaac88e98f591c 
  system_test/utils/setup_utils.py 0e8b7f97287ec7bc7a2d91573befe0992092734a 
  system_test/utils/system_test_utils.py 
e8529cd31f9202a50b9c9774037ddc744ff92efa 

Diff: https://reviews.apache.org/r/27536/diff/


Testing
---


Thanks,

Ewen Cheslack-Postava



[jira] [Commented] (KAFKA-1748) Decouple system test cluster resources definition from service definitions

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195006#comment-14195006
 ] 

Ewen Cheslack-Postava commented on KAFKA-1748:
--

Updated reviewboard https://reviews.apache.org/r/27536/diff/
 against branch origin/trunk

 Decouple system test cluster resources definition from service definitions
 --

 Key: KAFKA-1748
 URL: https://issues.apache.org/jira/browse/KAFKA-1748
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1748.patch, KAFKA-1748_2014-11-03_12:04:18.patch


 Currently the system tests use JSON files that specify the set of services 
 for each test and where they should run (i.e. hostname). These currently 
 assume that you already have SSH keys setup, use the same username on the 
 host running the tests and the test cluster, don't require any additional 
 ssh/scp/rsync flags, and assume you'll always have a fixed set of compute 
 resources (or that you'll spend a lot of time editing config files).
 While we don't want a whole cluster resource manager in the system tests, a 
 bit more flexibility would make it easier to, e.g., run tests against a local 
 vagrant cluster or on dynamically allocated EC2 instances. We can separate 
 out the basic resource spec (i.e. json specifying how to access machines) 
 from the service definition (i.e. a broker should run with settings x, y, z). 
 Restricting to a very simple set of mappings (i.e. map services to hosts with 
 round robin, optionally restricting to no reuse of hosts) should keep things 
 simple.





[jira] [Commented] (KAFKA-1748) Decouple system test cluster resources definition from service definitions

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195013#comment-14195013
 ] 

Ewen Cheslack-Postava commented on KAFKA-1748:
--

The patches I submitted make the suggested changes, including some support for 
pulling dynamic configurations (from Vagrant). For the local default config, 
this shouldn't have any effect, but will require config changes if you're 
overriding those settings -- the cluster resources go in cluster.json and just 
contain the hostnames, optional usernames, ssh args, java home and kafka home. 
In the Vagrant case, I defaulted KAFKA_HOME to /opt/kafka to match the setup in 
KAFKA-1173 since there's no place for the user to override that setting.
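For readers who haven't seen the patch, a cluster.json along the lines described above might look like the following. The field names here are illustrative guesses based on this description (hostnames, optional usernames, ssh args, java home, kafka home), not the exact schema from the patch:

```json
{
  "cluster": [
    { "hostname": "worker1.example.com",
      "user": "kafkatest",
      "ssh_args": "-o StrictHostKeyChecking=no -i ~/.ssh/test_key",
      "java_home": "/usr/lib/jvm/java-7",
      "kafka_home": "/opt/kafka" },
    { "hostname": "worker2.example.com" }
  ]
}
```

Everything past the hostname would presumably be optional, falling back to the zero-config localhost defaults mentioned in the review request.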

 Decouple system test cluster resources definition from service definitions
 --

 Key: KAFKA-1748
 URL: https://issues.apache.org/jira/browse/KAFKA-1748
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1748.patch, KAFKA-1748_2014-11-03_12:04:18.patch


 Currently the system tests use JSON files that specify the set of services 
 for each test and where they should run (i.e. hostname). These currently 
 assume that you already have SSH keys setup, use the same username on the 
 host running the tests and the test cluster, don't require any additional 
 ssh/scp/rsync flags, and assume you'll always have a fixed set of compute 
 resources (or that you'll spend a lot of time editing config files).
 While we don't want a whole cluster resource manager in the system tests, a 
 bit more flexibility would make it easier to, e.g., run tests against a local 
 vagrant cluster or on dynamically allocated EC2 instances. We can separate 
 out the basic resource spec (i.e. json specifying how to access machines) 
 from the service definition (i.e. a broker should run with settings x, y, z). 
 Restricting to a very simple set of mappings (i.e. map services to hosts with 
 round robin, optionally restricting to no reuse of hosts) should keep things 
 simple.





[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability

2014-11-03 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195029#comment-14195029
 ] 

Joel Koshy commented on KAFKA-1555:
---

Sounds good. In that case, can you modify it a bit? The only remaining 
confusion is that earlier on in that section we start by writing about 
acknowledgement by all replicas, but then directly (without further comment) 
assume it is actually acknowledgement by the current in-sync replicas. How 
about the following:
Instead of _A message that has been acknowledged by all in-sync replicas..._ we 
can write _A message that has been acknowledged by all replicas..._. And then 
say _Note that acknowledgement by all replicas does not guarantee that the 
full set of assigned replicas have received the message. By default, 
acknowledgement happens as soon as all the current in-sync replicas have 
received the message. For example, if a topic is configured with only two 
replicas and one fails (i.e., only one in sync replica remains), then writes 
that specify required.acks=-1 will succeed. However, these writes could be lost 
if the remaining replica also fails. Although this ensures maximum availability 
..._ (from earlier comment)

As for the design itself: I just thought that the broker-side setting taking 
effect only with a client-side setting is a bit odd, especially if it does not 
hurt to do so with the other ack settings.

 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
 KAFKA-1555-DOCS.2.patch, KAFKA-1555-DOCS.3.patch, KAFKA-1555.0.patch, 
 KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, 
 KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, 
 KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases such as two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current kafka version (0.8.1.1) cannot achieve the requirements 
 due to its three behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with less 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has less messages than the leader.
 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
 time, although C hasn't received m, C is still in ISR. If A is killed, C can 
 be elected as the new leader, and consumers will miss m.
 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
 message m to A, and receives an acknowledgement. Disk failure happens in A 
 before B and C replicate m. Message m is lost.
 In summary, any existing configuration cannot satisfy the requirements.
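The acks=2 case in point 2 above can be sketched as a toy simulation. This is hypothetical illustration code, not Kafka internals: A and B acknowledge m while C, still within the lag threshold, remains in the ISR; electing C as leader then loses m.

```java
import java.util.*;

public class AcksTwoLossDemo {
    public static void main(String[] args) {
        // Replica logs; all three replicas start in sync and in the ISR.
        Map<String, List<String>> logs = new HashMap<>();
        for (String replica : new String[] { "A", "B", "C" })
            logs.put(replica, new ArrayList<>());
        Set<String> isr = new HashSet<>(Arrays.asList("A", "B", "C"));

        // acks=2: the produce succeeds once leader A and follower B have m.
        // C lags behind but stays in the ISR (within replica.lag.max.messages).
        logs.get("A").add("m");
        logs.get("B").add("m");

        // Leader A dies; any remaining ISR member may be elected, including C.
        isr.remove("A");
        String newLeader = "C";  // legal choice: C is still in the ISR
        boolean lost = !logs.get(newLeader).contains("m");
        System.out.println("acknowledged message lost: " + lost);
    }
}
```

The same structure shows why acks=-1 alone is not enough once the ISR is allowed to shrink to a single replica, which is the motivation for a minimum-ISR-size setting.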





[jira] [Commented] (KAFKA-1070) Auto-assign node id

2014-11-03 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195345#comment-14195345
 ] 

Otis Gospodnetic commented on KAFKA-1070:
-

[~harsha_ch] - it looks like a number of people would really like to use IPs 
for broker.id.  There is a lot of interest in having that. Please see this 
thread: http://search-hadoop.com/m/4TaT4dTPKi1
Do you think this is something you could add to this patch, maybe as another 
broker ID assignment scheme/policy?


 Auto-assign node id
 ---

 Key: KAFKA-1070
 URL: https://issues.apache.org/jira/browse/KAFKA-1070
 Project: Kafka
  Issue Type: Bug
Reporter: Jay Kreps
Assignee: Sriharsha Chintalapani
  Labels: usability
 Attachments: KAFKA-1070.patch, KAFKA-1070_2014-07-19_16:06:13.patch, 
 KAFKA-1070_2014-07-22_11:34:18.patch, KAFKA-1070_2014-07-24_20:58:17.patch, 
 KAFKA-1070_2014-07-24_21:05:33.patch, KAFKA-1070_2014-08-21_10:26:20.patch


 It would be nice to have Kafka brokers auto-assign node ids rather than 
 having that be a configuration. Having a configuration is irritating because 
 (1) you have to generate a custom config for each broker and (2) even though 
 it is in configuration, changing the node id can cause all kinds of bad 
 things to happen.





[jira] [Created] (KAFKA-1749) Brokers continually throw exceptions when there are hundreds of topics being fetched by mirrormaker

2014-11-03 Thread Min Zhou (JIRA)
Min Zhou created KAFKA-1749:
---

 Summary: Brokers continually throw exceptions when there are 
hundreds of topics being fetched by mirrormaker
 Key: KAFKA-1749
 URL: https://issues.apache.org/jira/browse/KAFKA-1749
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1, 0.8.2
Reporter: Min Zhou


Here is one of the millions of such exceptions.

{noformat}
kafka.common.KafkaException: This operation cannot be completed on a complete 
request.
at kafka.network.Transmission$class.expectIncomplete(Transmission.scala:34)
at kafka.api.FetchResponseSend.expectIncomplete(FetchResponse.scala:191)
at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:214)
at kafka.network.Processor.write(SocketServer.scala:375)
at kafka.network.Processor.run(SocketServer.scala:247)
at java.lang.Thread.run(Thread.java:662)
{noformat}

We used tools to hook the function kafka.api.FetchResponseSend.writeTo and 
found that fetchResponse.sizeInBytes had overflowed, which means the code 
below can produce a result exceeding the limit of the integer type:

{noformat}
val sizeInBytes =
  FetchResponse.headerSize +
  dataGroupedByTopic.foldLeft(0)((folded, curr) => {
    val topicData = TopicData(curr._1, curr._2.map {
      case (topicAndPartition, partitionData) =>
        (topicAndPartition.partition, partitionData)
    })
    folded + topicData.sizeInBytes
  })
{noformat}

 If mirrormaker fetches only a few topics, the brokers run fine.
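The arithmetic failure mode is easy to reproduce outside Kafka. The plain-Java sketch below (illustration only, not Kafka code) accumulates per-topic sizes in an int the way the foldLeft above does, and shows the total wrapping negative once it crosses Integer.MAX_VALUE, while a long accumulator keeps the true size:

```java
public class SizeOverflowDemo {
    public static void main(String[] args) {
        // Suppose each of 5 topics contributes ~500 MB to the FetchResponse.
        int perTopicBytes = 500 * 1024 * 1024;  // 524,288,000 bytes

        // Accumulate the way the foldLeft does: into an int.
        int intTotal = 0;
        for (int i = 0; i < 5; i++)
            intTotal += perTopicBytes;

        // Accumulating into a long preserves the true value.
        long longTotal = 0;
        for (int i = 0; i < 5; i++)
            longTotal += perTopicBytes;

        System.out.println(intTotal);   // wraps past Integer.MAX_VALUE: -1673527296
        System.out.println(longTotal);  // 2621440000, the real size
    }
}
```

With hundreds of topics in a single mirrormaker fetch, the summed response size can plausibly pass 2GB, which is consistent with the expectIncomplete failure above.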





[jira] [Commented] (KAFKA-1749) Brokers continually throw exceptions when there are hundreds of topics being fetched by mirrormaker

2014-11-03 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195417#comment-14195417
 ] 

Ewen Cheslack-Postava commented on KAFKA-1749:
--

This looks like it's probably a duplicate of KAFKA-1196 -- it matches the stack 
trace I described in that issue, which I get when sending a FetchResponse that 
exceeds 2GB.

 Brokers continually throw exceptions when there are hundreds of topics being 
 fetched by mirrormaker
 --

 Key: KAFKA-1749
 URL: https://issues.apache.org/jira/browse/KAFKA-1749
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1, 0.8.2
Reporter: Min Zhou

 Here is one of the millions of such exceptions.
 {noformat}
 kafka.common.KafkaException: This operation cannot be completed on a complete 
 request.
 at kafka.network.Transmission$class.expectIncomplete(Transmission.scala:34)
 at kafka.api.FetchResponseSend.expectIncomplete(FetchResponse.scala:191)
 at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:214)
 at kafka.network.Processor.write(SocketServer.scala:375)
 at kafka.network.Processor.run(SocketServer.scala:247)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 We used tools to hook the function kafka.api.FetchResponseSend.writeTo and 
 found that fetchResponse.sizeInBytes had overflowed, which means the code 
 below can produce a result exceeding the limit of the integer type:
 {noformat}
 val sizeInBytes =
   FetchResponse.headerSize +
   dataGroupedByTopic.foldLeft(0)((folded, curr) => {
     val topicData = TopicData(curr._1, curr._2.map {
       case (topicAndPartition, partitionData) =>
         (topicAndPartition.partition, partitionData)
     })
     folded + topicData.sizeInBytes
   })
 {noformat}
 If mirrormaker fetches only a few topics, the brokers run fine.





[jira] [Commented] (KAFKA-1476) Get a list of consumer groups

2014-11-03 Thread BalajiSeshadri (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195419#comment-14195419
 ] 

BalajiSeshadri commented on KAFKA-1476:
---

[~nehanarkhede] can you please review my patch? This one has list and 
describeGroup functionality.

 Get a list of consumer groups
 -

 Key: KAFKA-1476
 URL: https://issues.apache.org/jira/browse/KAFKA-1476
 Project: Kafka
  Issue Type: Wish
  Components: tools
Affects Versions: 0.8.1.1
Reporter: Ryan Williams
Assignee: BalajiSeshadri
  Labels: newbie
 Fix For: 0.9.0

 Attachments: ConsumerCommand.scala, KAFKA-1476-LIST-GROUPS.patch, 
 KAFKA-1476-RENAME.patch, KAFKA-1476-REVIEW-COMMENTS.patch, KAFKA-1476.patch


 It would be useful to have a way to get a list of consumer groups currently 
 active via some tool/script that ships with kafka. This would be helpful so 
 that the system tools can be explored more easily.
 For example, when running the ConsumerOffsetChecker, it requires a group 
 option
 bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic test --group 
 ?
 But, when just getting started with kafka, using the console producer and 
 consumer, it is not clear what value to use for the group option.  If a list 
 of consumer groups could be listed, then it would be clear what value to use.
 Background:
 http://mail-archives.apache.org/mod_mbox/kafka-users/201405.mbox/%3cCAOq_b1w=slze5jrnakxvak0gu9ctdkpazak1g4dygvqzbsg...@mail.gmail.com%3e





[jira] [Comment Edited] (KAFKA-960) Upgrade Metrics to 3.x

2014-11-03 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195493#comment-14195493
 ] 

Jun Rao edited comment on KAFKA-960 at 11/4/14 1:01 AM:


[~jjkoshy],

Yes, we can move to the new metrics in the clients package on the broker in the 
future. We don't have a jira yet. We need to fix a similar issue with mbean 
name in the new metrics (KAFKA-1723) first though. Also, we probably want to 
improve the histogram implementation in the new metrics.


was (Author: junrao):
[~jjkoshy],

Yes, we can move to the new metrics in the clients package on the broker in the 
future. We need to fix a similar issue with mbean name in the new metrics 
(KAFKA-1723) first though. Also, we probably want to improve the histogram 
implementation in the new metrics.

 Upgrade Metrics to 3.x
 --

 Key: KAFKA-960
 URL: https://issues.apache.org/jira/browse/KAFKA-960
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.8.1
Reporter: Cosmin Lehene

 Now that metrics 3.0 has been released 
 (http://metrics.codahale.com/about/release-notes/) we can upgrade back





[jira] [Commented] (KAFKA-960) Upgrade Metrics to 3.x

2014-11-03 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195493#comment-14195493
 ] 

Jun Rao commented on KAFKA-960:
---

[~jjkoshy],

Yes, we can move to the new metrics in the clients package on the broker in the 
future. We need to fix a similar issue with mbean name in the new metrics 
(KAFKA-1723) first though. Also, we probably want to improve the histogram 
implementation in the new metrics.

 Upgrade Metrics to 3.x
 --

 Key: KAFKA-960
 URL: https://issues.apache.org/jira/browse/KAFKA-960
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.8.1
Reporter: Cosmin Lehene

 Now that metrics 3.0 has been released 
 (http://metrics.codahale.com/about/release-notes/) we can upgrade back





Re: Review Request 26755: Patch for KAFKA-1706

2014-11-03 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26755/#review59679
---

Ship it!


Minor points that I can fix on check-in.


core/src/main/scala/kafka/utils/ByteBoundedBlockingQueue.scala
https://reviews.apache.org/r/26755/#comment101009

Unused; i.e., the var can be declared on line 50 alone.



core/src/main/scala/kafka/utils/ByteBoundedBlockingQueue.scala
https://reviews.apache.org/r/26755/#comment101010

Will add a comment that this is to avoid the putLock.wait(0, 0) scenario



core/src/test/scala/unit/kafka/utils/ByteBoundedBlockingQueueTest.scala
https://reviews.apache.org/r/26755/#comment100971

I think lines 32-35 can be removed.


- Joel Koshy
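For context, the queue under review bounds total bytes rather than element count. Below is a much-simplified, single-threaded sketch of that byte-accounting idea with hypothetical names; the real kafka.utils.ByteBoundedBlockingQueue blocks on put/take locks and must also guard against the putLock.wait(0, 0) scenario mentioned above, none of which this sketch attempts:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.ToIntFunction;

/** Toy queue bounded by the total bytes of its elements (illustration only). */
public class ByteBoundedQueue<E> {
    private final Queue<E> queue = new ArrayDeque<>();
    private final ToIntFunction<E> sizeOf;  // bytes contributed by one element
    private final int byteCapacity;
    private int currentBytes = 0;

    public ByteBoundedQueue(int byteCapacity, ToIntFunction<E> sizeOf) {
        this.byteCapacity = byteCapacity;
        this.sizeOf = sizeOf;
    }

    /** Accept the element only if it fits in the remaining byte budget.
     *  The first element is always accepted so an oversized item can't wedge. */
    public synchronized boolean offer(E e) {
        int bytes = sizeOf.applyAsInt(e);
        if (!queue.isEmpty() && currentBytes + bytes > byteCapacity)
            return false;
        queue.add(e);
        currentBytes += bytes;
        return true;
    }

    public synchronized E poll() {
        E e = queue.poll();
        if (e != null) currentBytes -= sizeOf.applyAsInt(e);
        return e;
    }

    public synchronized int byteSize() { return currentBytes; }
}
```

Keeping the byte counter updates inside the same lock as the queue mutations is what prevents the negative-size accounting bug fixed in this patch's history.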


On Oct. 29, 2014, 5:57 p.m., Jiangjie Qin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/26755/
 ---
 
 (Updated Oct. 29, 2014, 5:57 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1706
 https://issues.apache.org/jira/browse/KAFKA-1706
 
 
 Repository: kafka
 
 
 Description
 ---
 
 changed arguments name
 
 
 correct typo.
 
 
 Incorporated Joel's comments. Also fixed negative queue size problem.
 
 
 Incorporated Joel's comments.
 
 
 Added a unit test for ByteBoundedBlockingQueue. Fixed a bug regarding waiting 
 time.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/utils/ByteBoundedBlockingQueue.scala PRE-CREATION 
   core/src/test/scala/unit/kafka/utils/ByteBoundedBlockingQueueTest.scala 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/26755/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jiangjie Qin
 




[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-03 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195548#comment-14195548
 ] 

Jun Rao commented on KAFKA-1481:


[~jjkoshy], do you want to take another look at the patch?

 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, 
 KAFKA-1481_2014-10-31_14-35-43.patch, 
 KAFKA-1481_2014-11-03_16-39-41_doc.patch, 
 KAFKA-1481_2014-11-03_17-02-23.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBeans names making it impossible to parse out 
 individual bits from MBeans.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW
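The parsing problem is easy to demonstrate: when '-' is both the separator and a legal character inside the ids being joined, distinct (clientId, topic) pairs collapse to the same MBean name, so the name cannot be split back into its parts. A minimal sketch with hypothetical ids:

```java
public class MBeanNameDemo {
    public static void main(String[] args) {
        // Two different (clientId, topic) pairs...
        String nameA = "console" + "-" + "consumer-metrics";
        String nameB = "console-consumer" + "-" + "metrics";

        // ...produce the same joined MBean name, so a monitoring tool
        // parsing "console-consumer-metrics" cannot tell them apart.
        System.out.println(nameA.equals(nameB));  // true: ambiguous
    }
}
```

Any separator character that is also legal inside hostnames, topics, or consumer ids has this problem, which is why the issue proposes a character like '|' that cannot appear in those identifiers.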





[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-03 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195545#comment-14195545
 ] 

Jun Rao commented on KAFKA-1481:


Thanks for the patch. Just some minor comments below. Otherwise, +1 from me.

60. AbstractFetcherManager: In the following, we don't need to wrap new Gauge 
in {}.
  MinFetchRate, {
    new Gauge[Double] {
      // current min fetch rate across all fetchers/topics/partitions
      def value = {
        val headRate: Double =
          fetcherThreadMap.headOption.map(_._2.fetcherStats.requestRate.oneMinuteRate).getOrElse(0)

        fetcherThreadMap.foldLeft(headRate)((curMinAll, fetcherThreadMapEntry) => {
          fetcherThreadMapEntry._2.fetcherStats.requestRate.oneMinuteRate.min(curMinAll)
        })
      }
    }
  },

61. AbstractFetcherThread: Could we align topic and partition to the same 
column as clientId? Ditto in a few other places.
Map("clientId" -> metricId.clientId,
  "topic" -> metricId.topic,
  "partition" -> metricId.partitionId.toString)
)

62. FetchRequestAndResponseMetrics: In the following, could we put Map in a 
separate line?
  val tags = metricId match {
    case ClientIdAndBroker(clientId, brokerHost, brokerPort) => Map("clientId" -> clientId,
      "brokerHost" -> brokerHost, "brokerPort" -> brokerPort.toString)
    case ClientIdAllBrokers(clientId) => Map("clientId" -> clientId,
      "allBrokers" -> true)
  }

63. TestUtils: It's a bit weird that sendMessagesToBrokerPartition() has both a 
config and configs. We actually don't need the config any more. When sending a 
message to a partition, we actually don't know which broker the partition is 
on. So, the message can just be of the form "test" + "-" + partition + "-" + x. 
We can also rename the method to sendMessagesToPartition().


 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, 
 KAFKA-1481_2014-10-31_14-35-43.patch, 
 KAFKA-1481_2014-11-03_16-39-41_doc.patch, 
 KAFKA-1481_2014-11-03_17-02-23.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBeans names making it impossible to parse out 
 individual bits from MBeans.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW
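The parsing problem described above can be seen with a plain string split. The names below are illustrative only, not actual Kafka MBean names: once the host itself contains dashes, a dash-separated metric name can no longer be decomposed unambiguously, whereas a separator that cannot occur in the fields (such as the pipe suggested above) keeps each field intact.

```java
public class MBeanSplitDemo {
    public static void main(String[] args) {
        // A metric name of the form <clientId>-<host>-<port>, where both the
        // client id and the host may themselves contain dashes:
        String dashed = "my-client-kafka-broker-01-9092";
        String[] parts = dashed.split("-");
        // Six fragments for three logical fields: the field boundaries are lost.
        System.out.println(parts.length); // 6

        // With a separator that cannot appear in the fields, each field survives.
        String piped = "my-client|kafka-broker-01|9092";
        System.out.println(piped.split("\\|").length); // 3
    }
}
```

This is why any choice of separator has to be drawn from characters that are disallowed in hostnames, topics, and client/group ids.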



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1745) Each new thread creates a PIPE and KQUEUE as open files during producer.send() and does not get cleared when the thread that creates them is cleared.

2014-11-03 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195561#comment-14195561
 ] 

Jun Rao commented on KAFKA-1745:


Did you call producer.close() when the producer thread is removed?
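The question above points at the likely fix: each producer owns the selector/pipe descriptors seen in lsof, and they are only released by producer.close(). A minimal sketch of the close-on-retirement pattern, using a stand-in class where the real code would hold a kafka.javaapi.producer.Producer (FakeProducer and its counter are hypothetical, for illustration only):

```java
import java.io.Closeable;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class ProducerPerThreadDemo {
    // Stand-in for kafka.javaapi.producer.Producer, whose real constructor
    // opens the KQUEUE/PIPE descriptors reported in this ticket.
    static class FakeProducer implements Closeable {
        static final AtomicInteger open = new AtomicInteger();
        FakeProducer() { open.incrementAndGet(); }
        void send(String msg) { /* would go to the broker */ }
        @Override public void close() { open.decrementAndGet(); }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                FakeProducer p = new FakeProducer();
                try {
                    p.send("hello");
                } finally {
                    // Without this close, every retired thread leaks its descriptors.
                    p.close();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(FakeProducer.open.get()); // 0: every producer was closed
    }
}
```

If the pool caches one producer per thread instead of one per send, the same rule applies: close the producer in the hook that retires the thread.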

 Each new thread creates a PIPE and KQUEUE as open files during 
 producer.send() and does not get cleared when the thread that creates them is 
 cleared.
 -




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-03 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1481:
--
Reviewer: Joel Koshy  (was: Jun Rao)

 Stop using dashes AND underscores as separators in MBean names
 --




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-03 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195573#comment-14195573
 ] 

Joel Koshy commented on KAFKA-1481:
---

Yes thanks - I'll take a look at this tomorrow.

 Stop using dashes AND underscores as separators in MBean names
 --




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1738) Partitions for topic not created after restart from forced shutdown

2014-11-03 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195577#comment-14195577
 ] 

Jun Rao commented on KAFKA-1738:


I can't reproduce this issue by following the steps in the description. Does 
this happen every time?


 Partitions for topic not created after restart from forced shutdown
 ---

 Key: KAFKA-1738
 URL: https://issues.apache.org/jira/browse/KAFKA-1738
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1, 0.8.2
 Environment: Linux, 2GB RAM, 2 Core CPU
Reporter: Pradeep

 We are using the Kafka Topic APIs to create the topic. In some cases, the 
 topic gets created but we don't see the partition-specific files, and when the 
 producer/consumer tries to get the topic metadata, it fails with an 
 exception. The same happens if one tries to create the topic using the command line.
 k.p.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for 
 topic tloader1 - No partition metadata for topic tloader1 due to 
 kafka.common.UnknownTopicOrPartitionException}] for topic [tloader1]: class 
 kafka.common.UnknownTopicOrPartitionException
 Steps to reproduce - 
 1.  Stop Kafka using kill -9 PID of Kafka
 2.  Start Kafka
 3.  Create a topic with a partition count and replication factor of 1.
 4.  Check the response "Created topic topic_name"
 5.  Run the list command to verify that it's created.
 6.  Now check the data directory of Kafka. There would not be any files for the 
 newly created topic.
 We see issues when we are creating new topics. This happens randomly and we 
 don't know the exact reasons. We see the below logs in the controller during the 
 creation of topics which don't have the partition files.
 [2014-11-03 13:12:50,625] INFO [Controller 0]: New topic creation callback for 
 [JobJTopic,0] (kafka.controller.KafkaController)
 [2014-11-03 13:12:50,626] INFO [Controller 0]: New partition creation 
 callback for [JobJTopic,0] (kafka.controller.KafkaController)
 [2014-11-03 13:12:50,626] INFO [Partition state machine on Controller 0]: 
 Invoking state change to NewPartition for partitions [JobJTopic,0] 
 (kafka.controller.PartitionStateMachine)
 [2014-11-03 13:12:50,653] INFO [Replica state machine on controller 0]: 
 Invoking state change to NewReplica for replicas 
 [Topic=JobJTopic,Partition=0,Replica=0] (kafka.controller.ReplicaStateMachine)
 [2014-11-03 13:12:50,654] INFO [Partition state machine on Controller 0]: 
 Invoking state change to OnlinePartition for partitions [JobJTopic,0] 
 (kafka.controller.PartitionStateMachine)
 [2014-11-03 13:12:50,654] DEBUG [Partition state machine on Controller 0]: 
 Live assigned replicas for partition [JobJTopic,0] are: [List(0)] 
 (kafka.controller.PartitionStateMachine)
 [2014-11-03 13:12:50,654] DEBUG [Partition state machine on Controller 0]: 
 Initializing leader and isr for partition [JobJTopic,0] to 
 (Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:2) 
 (kafka.controller.PartitionStateMachine)
 [2014-11-03 13:12:50,667] INFO [Replica state machine on controller 0]: 
 Invoking state change to OnlineReplica for replicas 
 [Topic=JobJTopic,Partition=0,Replica=0] (kafka.controller.ReplicaStateMachine)
 [2014-11-03 13:12:50,794] WARN [Controller-0-to-broker-0-send-thread], 
 Controller 0 fails to send a request to broker id:0,host:DMIPVM,port:9092 
 (kafka.controller.RequestSendThread)
 java.io.EOFException: Received -1 when reading from channel, socket has 
 likely been closed.
   at kafka.utils.Utils$.read(Utils.scala:381)
   at 
 kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
   at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
   at 
 kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
   at kafka.network.BlockingChannel.receive(BlockingChannel.scala:108)
   at 
 kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:146)
   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
 [2014-11-03 13:12:50,965] ERROR [Controller-0-to-broker-0-send-thread], 
 Controller 0 epoch 2 failed to send request 
 Name:UpdateMetadataRequest;Version:0;Controller:0;ControllerEpoch:2;CorrelationId:43;ClientId:id_0-host_null-port_9092;AliveBrokers:id:0,host:DMIPVM,port:9092;PartitionState:[JobJTopic,0]
  - 
 (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:2),ReplicationFactor:1),AllReplicas:0)
  to broker id:0,host:DMIPVM,port:9092. Reconnecting to broker. 
 (kafka.controller.RequestSendThread)
 java.nio.channels.ClosedChannelException
   at kafka.network.BlockingChannel.send(BlockingChannel.scala:97)
   at 
 kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
   

[jira] [Updated] (KAFKA-1729) add doc for Kafka-based offset management in 0.8.2

2014-11-03 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1729:
--
Status: Patch Available  (was: Open)

 add doc for Kafka-based offset management in 0.8.2
 --

 Key: KAFKA-1729
 URL: https://issues.apache.org/jira/browse/KAFKA-1729
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jun Rao
Assignee: Joel Koshy
 Attachments: KAFKA-1782-doc-v1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1729) add doc for Kafka-based offset management in 0.8.2

2014-11-03 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1729:
--
Attachment: KAFKA-1782-doc-v1.patch

Uploading v1 for review

 add doc for Kafka-based offset management in 0.8.2
 --







--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1070) Auto-assign node id

2014-11-03 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195687#comment-14195687
 ] 

Otis Gospodnetic commented on KAFKA-1070:
-

bq.  this patch pending the 0.8.2 release

Note this issue doesn't have Fix Version set to 0.8.2.  Maybe that's what you 
want? +1 from me -- it looks like a number of people would like to see this 
issue resolved!

 Auto-assign node id
 ---

 Key: KAFKA-1070
 URL: https://issues.apache.org/jira/browse/KAFKA-1070
 Project: Kafka
  Issue Type: Bug
Reporter: Jay Kreps
Assignee: Sriharsha Chintalapani
  Labels: usability
 Attachments: KAFKA-1070.patch, KAFKA-1070_2014-07-19_16:06:13.patch, 
 KAFKA-1070_2014-07-22_11:34:18.patch, KAFKA-1070_2014-07-24_20:58:17.patch, 
 KAFKA-1070_2014-07-24_21:05:33.patch, KAFKA-1070_2014-08-21_10:26:20.patch


 It would be nice to have Kafka brokers auto-assign node ids rather than 
 having that be a configuration. Having a configuration is irritating because 
 (1) you have to generate a custom config for each broker and (2) even though 
 it is in configuration, changing the node id can cause all kinds of bad 
 things to happen.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1070) Auto-assign node id

2014-11-03 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195700#comment-14195700
 ] 

Sriharsha Chintalapani commented on KAFKA-1070:
---

[~otis] Sorry, I meant pending review after the 0.8.2 release, not planned for 
the 0.8.2 release. I'll update the patch with an option to use the IP as the broker id.

 Auto-assign node id
 ---




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-1733) Producer.send will block indeterminately when broker is unavailable.

2014-11-03 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-1733.

   Resolution: Fixed
Fix Version/s: 0.8.3
 Assignee: Marc Chung

Thanks for the patch. +1 and committed to trunk.

 Producer.send will block indeterminately when broker is unavailable.
 

 Key: KAFKA-1733
 URL: https://issues.apache.org/jira/browse/KAFKA-1733
 Project: Kafka
  Issue Type: Bug
  Components: core, producer 
Reporter: Marc Chung
Assignee: Marc Chung
 Fix For: 0.8.3

 Attachments: kafka-1733-add-connectTimeoutMs.patch


 This is a follow up to the conversation here:
 https://mail-archives.apache.org/mod_mbox/kafka-dev/201409.mbox/%3ccaog_4qymoejhkbo0n31+a-ujx0z5unsisd5wbrmn-xtx7gi...@mail.gmail.com%3E
 During ClientUtils.fetchTopicMetadata, if the broker is unavailable, 
 socket.connect will block indeterminately. Any retry policy 
 (message.send.max.retries) further increases the time spent waiting for the 
 socket to connect.
 The root fix is to add a connection timeout value to the BlockingChannel's 
 socket configuration, like so:
 {noformat}
 -channel.socket.connect(new InetSocketAddress(host, port))
 +channel.socket.connect(new InetSocketAddress(host, port), connectTimeoutMs)
 {noformat}
 The simplest thing to do here would be to have a constant, default value that 
 would be applied to every socket configuration. 
 Is that acceptable? 
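The one-line fix quoted above relies on the two-argument form of java.net.Socket.connect, which bounds how long the connect can block. A small self-contained sketch (the ServerSocket stands in for a reachable broker; host and timeout values are illustrative):

```java
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ConnectTimeoutDemo {
    public static void main(String[] args) throws Exception {
        // Stand-in "broker": any locally listening socket.
        try (ServerSocket broker = new ServerSocket(0)) {
            Socket channel = new Socket();
            // The no-timeout connect(SocketAddress) can block indefinitely on an
            // unreachable broker; connect(SocketAddress, int) throws
            // SocketTimeoutException once connectTimeoutMs elapses instead.
            int connectTimeoutMs = 5000;
            channel.connect(new InetSocketAddress("127.0.0.1",
                    broker.getLocalPort()), connectTimeoutMs);
            System.out.println(channel.isConnected()); // true
            channel.close();
        }
    }
}
```

Note the timeout only bounds connection establishment; message.send.max.retries still multiplies the worst-case wait across retries.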



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps

2014-11-03 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195762#comment-14195762
 ] 

Honghai Chen commented on KAFKA-391:


This situation happens in the following scenario: one broker is the leader for 
several partitions, for example 3. When we send one message set which has 
messages for all 3 partitions of this broker, response.status.size is 3 but 
producerRequest.data.size is 1, and then it hits this exception. Any idea for a 
fix? Do we need to compare response.status.size with messagesPerTopic.Count 
instead of producerRequest.data.size?


  private def send(brokerId: Int, messagesPerTopic: collection.mutable.Map[TopicAndPartition, ByteBufferMessageSet]) = {
    if(brokerId < 0) {
      warn("Failed to send data since partitions %s don't have a leader".format(messagesPerTopic.map(_._1).mkString(",")))
      messagesPerTopic.keys.toSeq
    } else if(messagesPerTopic.size > 0) {
      val currentCorrelationId = correlationId.getAndIncrement
      val producerRequest = new ProducerRequest(currentCorrelationId, config.clientId, config.requestRequiredAcks,
        config.requestTimeoutMs, messagesPerTopic)
      var failedTopicPartitions = Seq.empty[TopicAndPartition]
      try {
        val syncProducer = producerPool.getProducer(brokerId)
        debug("Producer sending messages with correlation id %d for topics %s to broker %d on %s:%d"
          .format(currentCorrelationId, messagesPerTopic.keySet.mkString(","), brokerId, syncProducer.config.host, syncProducer.config.port))
        val response = syncProducer.send(producerRequest)
        debug("Producer sent messages with correlation id %d for topics %s to broker %d on %s:%d"
          .format(currentCorrelationId, messagesPerTopic.keySet.mkString(","), brokerId, syncProducer.config.host, syncProducer.config.port))
        if(response != null) {
          if (response.status.size != producerRequest.data.size)
            throw new KafkaException("Incomplete response (%s) for producer request (%s)".format(response, producerRequest))



 Producer request and response classes should use maps
 -

 Key: KAFKA-391
 URL: https://issues.apache.org/jira/browse/KAFKA-391
 Project: Kafka
  Issue Type: Bug
Reporter: Joel Koshy
Assignee: Joel Koshy
Priority: Blocker
  Labels: optimization
 Fix For: 0.8.0

 Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch, 
 KAFKA-391-v3.patch, KAFKA-391-v4.patch


 Producer response contains two arrays of error codes and offsets - the 
 ordering in these arrays correspond to the flattened ordering of the request 
 arrays.
 It would be better to switch to maps in the request and response as this 
 would make the code clearer and more efficient (right now, linear scans are 
 used in handling producer acks).
 We can probably do the same in the fetch request/response.
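The difference between the parallel-array layout and the proposed map layout can be sketched with plain collections. Topic-partition keys are shown as strings here for brevity; the real keys and values are Kafka's TopicAndPartition and per-partition status classes:

```java
import java.util.*;

public class ResponseLookup {
    public static void main(String[] args) {
        // Parallel-array form: errorCodes[i] belongs to the i-th entry of the
        // flattened request ordering, so finding the ack for one partition
        // means a linear scan to recover its index first.
        List<String> requestOrder = Arrays.asList("t1-0", "t1-1", "t2-0");
        short[] errorCodes = {0, 5, 0};
        int idx = requestOrder.indexOf("t1-1"); // O(n) scan
        System.out.println(errorCodes[idx]);    // 5

        // Map form: one hash lookup per partition, with no dependence on the
        // request's flattened ordering.
        Map<String, Short> status = new HashMap<>();
        for (int i = 0; i < requestOrder.size(); i++)
            status.put(requestOrder.get(i), errorCodes[i]);
        System.out.println(status.get("t1-1")); // 5
    }
}
```

Beyond the constant-factor win, the map form removes the implicit coupling between request and response ordering that makes the parallel arrays fragile.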



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is back to normal : Kafka-trunk #322

2014-11-03 Thread Apache Jenkins Server
See https://builds.apache.org/job/Kafka-trunk/322/changes