[jira] [Updated] (KAFKA-777) Add system tests for important tools

2013-02-28 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-777:


Labels: kafka-0.8 p1 replication-testing  (was: kafka-0.8 p1)

 Add system tests for important tools
 

 Key: KAFKA-777
 URL: https://issues.apache.org/jira/browse/KAFKA-777
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Sriram Subramanian
Assignee: John Fung
  Labels: kafka-0.8, p1, replication-testing
 Fix For: 0.8


 A few tools were broken after the ZooKeeper (zk) format change. It would be great to catch 
 these issues during system tests. Some of the tools are:
 1. ShutdownBroker
 2. PreferredReplicaAssignment
 3. ConsumerOffsetChecker
 There might be a few more that need tests; we should add them once they are 
 identified.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-772) System Test Transient Failure on testcase_0122

2013-02-28 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-772:


Labels: kafka-0.8 p1  (was: )

 System Test Transient Failure on testcase_0122
 --

 Key: KAFKA-772
 URL: https://issues.apache.org/jira/browse/KAFKA-772
 Project: Kafka
  Issue Type: Bug
Reporter: John Fung
Assignee: Sriram Subramanian
  Labels: kafka-0.8, p1
 Attachments: testcase_0122.tar.gz


 * This test case has been failing intermittently over the past few weeks. Note that 
 the test case allows a small percentage of data loss with Ack = 1, but the 
 failure here is a log segment checksum mismatch across the 
 replicas.
 * Test description:
 3 brokers cluster
 Replication factor = 3
 No. topic = 2
 No. partitions = 3
 Controlled failure (kill -15)
 Ack = 1
 * Test case output
 _test_case_name  :  testcase_0122
 _test_class_name  :  ReplicaBasicTest
 arg : auto_create_topic  :  true
 arg : bounce_broker  :  true
 arg : broker_type  :  leader
 arg : message_producing_free_time_sec  :  15
 arg : num_iteration  :  3
 arg : num_partition  :  3
 arg : replica_factor  :  3
 arg : sleep_seconds_between_producer_calls  :  1
 validation_status  : 
  Leader Election Latency - iter 1 brokerid 3  :  377.00 ms
  Leader Election Latency - iter 2 brokerid 1  :  374.00 ms
  Leader Election Latency - iter 3 brokerid 2  :  384.00 ms
  Leader Election Latency MAX  :  384.00
  Leader Election Latency MIN  :  374.00
  Unique messages from consumer on [test_1] at simple_consumer_test_1-0_r1.log  :  1750
  Unique messages from consumer on [test_1] at simple_consumer_test_1-0_r2.log  :  1750
  Unique messages from consumer on [test_1] at simple_consumer_test_1-0_r3.log  :  1750
  Unique messages from consumer on [test_1] at simple_consumer_test_1-1_r1.log  :  1750
  Unique messages from consumer on [test_1] at simple_consumer_test_1-1_r2.log  :  1750
  Unique messages from consumer on [test_1] at simple_consumer_test_1-1_r3.log  :  1750
  Unique messages from consumer on [test_1] at simple_consumer_test_1-2_r1.log  :  1500
  Unique messages from consumer on [test_1] at simple_consumer_test_1-2_r2.log  :  1500
  Unique messages from consumer on [test_1] at simple_consumer_test_1-2_r3.log  :  1500
  Unique messages from consumer on [test_2]  :  5000
  Unique messages from consumer on [test_2] at simple_consumer_test_2-0_r1.log  :  1714
  Unique messages from consumer on [test_2] at simple_consumer_test_2-0_r2.log  :  1714
  Unique messages from consumer on [test_2] at simple_consumer_test_2-0_r3.log  :  1680
  Unique messages from consumer on [test_2] at simple_consumer_test_2-1_r1.log  :  1708
  Unique messages from consumer on [test_2] at simple_consumer_test_2-1_r2.log  :  1708
  Unique messages from consumer on [test_2] at simple_consumer_test_2-1_r3.log  :  1708
  Unique messages from consumer on [test_2] at simple_consumer_test_2-2_r1.log  :  1469
  Unique messages from consumer on [test_2] at simple_consumer_test_2-2_r2.log  :  1469
  Unique messages from consumer on [test_2] at simple_consumer_test_2-2_r3.log  :  1469
  Unique messages from producer on [test_2]  :  4900
  Validate for data matched on topic [test_1] across replicas  :  PASSED
  Validate for data matched on topic [test_2]  :  FAILED
  Validate for data matched on topic [test_2] across replicas  :  FAILED
  Validate for merged log segment checksum in cluster [source]  :  FAILED
  Validate leader election successful  :  PASSED
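
The per-replica counts in the output above already point at the partition that diverged. A minimal sketch of that comparison follows, using the test_2 numbers copied from the output; the function name `mismatched_partitions` and the dictionary layout are illustrative assumptions, and this only compares unique-message counts, not the log segment checksums that the test harness validates.

```python
# Unique message counts per replica for topic test_2, copied from the
# test case output above. Keys are (topic, partition); inner keys are the
# replica suffixes of the simple consumer log files (r1, r2, r3).
counts = {
    ("test_2", 0): {"r1": 1714, "r2": 1714, "r3": 1680},
    ("test_2", 1): {"r1": 1708, "r2": 1708, "r3": 1708},
    ("test_2", 2): {"r1": 1469, "r2": 1469, "r3": 1469},
}


def mismatched_partitions(counts):
    """Return the entries whose replicas do not all report the same count."""
    return {
        topic_partition: per_replica
        for topic_partition, per_replica in counts.items()
        if len(set(per_replica.values())) > 1
    }


if __name__ == "__main__":
    for (topic, partition), per_replica in sorted(mismatched_partitions(counts).items()):
        print(topic, partition, per_replica)
```

With these numbers only test_2 partition 0 is reported, because replica r3 has 1680 unique messages where r1 and r2 have 1714.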



[jira] [Commented] (KAFKA-772) System Test Transient Failure on testcase_0122

2013-02-28 Thread Sriram Subramanian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589971#comment-13589971
 ] 

Sriram Subramanian commented on KAFKA-772:
--

There are two issues with the given logs. Both are for topic 2, 
partition 0 on broker 3.

1. Segment 1 on broker 3, which starts at logical offset 0, does not have 
contiguous logical offsets: logical offset 699 is followed by 734. 
2. Segment 2 on broker 3, which starts at logical offset 974, is 0 bytes, while the 
same segment on broker 2 holds offsets 974 to 1713. Broker 3 also has a segment 3 
covering logical offsets 1012 to 1713; broker 2 does not have a third segment.

We have run the test in a loop many times over a day but have not been able 
to reproduce this on the local box. I am still investigating how the logs could end 
up in this state during continuous restarts with ack = 0 and replication factor 
= 3.
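
The first issue, a jump from logical offset 699 to 734 inside one segment, is the kind of thing a simple contiguity check can flag. The sketch below is only an illustration: `find_offset_gaps` is a hypothetical helper, and the offset list is synthetic data shaped like the report (0 to 699, then 734 to 973), not the contents of the attached logs. How the offsets are extracted from a segment file is left out.

```python
def find_offset_gaps(offsets):
    """Return (previous, next) pairs where consecutive offsets are not contiguous."""
    gaps = []
    for prev, nxt in zip(offsets, offsets[1:]):
        if nxt != prev + 1:
            gaps.append((prev, nxt))
    return gaps


if __name__ == "__main__":
    # Synthetic offsets shaped like issue 1: 0..699 followed by 734..973.
    offsets = list(range(0, 700)) + list(range(734, 974))
    print(find_offset_gaps(offsets))  # [(699, 734)]
```

Running the same check over each replica's copy of a segment would show whether the gap exists on one broker only.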

 System Test Transient Failure on testcase_0122
 --

 Key: KAFKA-772
 URL: https://issues.apache.org/jira/browse/KAFKA-772
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: John Fung
Assignee: Sriram Subramanian
  Labels: kafka-0.8, p1
 Attachments: testcase_0122.tar.gz





[jira] [Commented] (KAFKA-778) Fix unclear exceptions when the producer detects 0 partitions for a topic

2013-02-28 Thread Zayeem (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590244#comment-13590244
 ] 

Zayeem commented on KAFKA-778:
--

This is on version 0.8.

In the test code, I am trying to check whether the topic exists and, if so, create a 
producer and send a message. In the ZooKeeper shell, I can see that the topic 
exists under /brokers/topics/topicName. However, listing 
/brokers/topics/topicName returns [].

Here are the Gradle dependency versions for my project:

ext {
    junitVersion = '4.10'
    log4jVersion = '1.2.15'
    mockitoVersion = '1.9.0'
    springVersion = '3.1.3.RELEASE'
    springIntegrationVersion = '2.2.0.RELEASE'
    slfVersion = '1.6.4'
    commonsPoolVersion = '1.6'
    zooKeeperVersion = '3.3.4'
    snappyVersion = '1.0.4.1'
    scalaVersion = '2.9.2'
    idPrefix = 'kafka'
}
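
The check described above (topic node exists, but listing it returns []) could be made stricter before creating a producer. The following is a rough sketch under a stated assumption: that an empty child listing under /brokers/topics/topicName means no partitions are registered yet, which has not been confirmed in this thread. `topic_has_partitions` and `list_children` are hypothetical names; `list_children` stands in for whatever ZooKeeper client call the test code uses to list child nodes.

```python
def topic_has_partitions(list_children, topic):
    """Return True only if the topic's ZooKeeper node lists at least one child.

    list_children is a hypothetical callable(path) -> list of child names.
    """
    children = list_children("/brokers/topics/" + topic)
    return len(children) > 0


if __name__ == "__main__":
    # Stand-in for a ZooKeeper client: the topic node exists but has no children,
    # which matches the [] seen in the ZooKeeper shell.
    fake_zk = {"/brokers/topics/topicName": []}
    print(topic_has_partitions(fake_zk.__getitem__, "topicName"))  # False
```

A test could skip sending, or wait and retry, while this returns False instead of relying on the node's existence alone.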

 Fix unclear exceptions when the producer detects 0 partitions for a topic
 -

 Key: KAFKA-778
 URL: https://issues.apache.org/jira/browse/KAFKA-778
 Project: Kafka
  Issue Type: Bug
  Components: producer 
Affects Versions: 0.8, 0.7.2
Reporter: Neha Narkhede
Assignee: Jun Rao

 ERROR [main][kafka.producer.async.DefaultEventHandler] Failed to collate messages by topic, partition due to
 kafka.common.NoBrokersForPartitionException: Partition key = null
  at kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$getPartitionListForTopic(DefaultEventHandler.scala:189)
  at kafka.producer.async.DefaultEventHandler$$anonfun$partitionAndCollate$1.apply(DefaultEventHandler.scala:148)
  at kafka.producer.async.DefaultEventHandler$$anonfun$partitionAndCollate$1.apply(DefaultEventHandler.scala:147)
  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
  at kafka.producer.async.DefaultEventHandler.partitionAndCollate(DefaultEventHandler.scala:147)
  at kafka.producer.async.DefaultEventHandler.dispatchSerializedData(DefaultEventHandler.scala:93)
  at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:72)
  at kafka.producer.Producer.send(Producer.scala:76)
  at kafka.javaapi.producer.Producer.send(Producer.scala:41)
