[jira] [Created] (KAFKA-824) java.lang.NullPointerException in commitOffsets

2013-03-25 Thread Yonghui Zhao (JIRA)
Yonghui Zhao created KAFKA-824:
--

 Summary: java.lang.NullPointerException in commitOffsets 
 Key: KAFKA-824
 URL: https://issues.apache.org/jira/browse/KAFKA-824
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 0.7.2
Reporter: Yonghui Zhao
Assignee: Neha Narkhede


Neha Narkhede

Yes, I have. Unfortunately, I never quite got around to fixing it. My guess is
that it is caused by a race condition between the rebalance thread and
the offset commit thread when a rebalance is triggered or the client is
being shut down. Do you mind filing a bug?
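The failure mode Neha describes is a classic check-then-act race: the shutdown/rebalance path clears a shared connection reference between the commit thread's check and its use. A purely illustrative Python sketch of the race and a lock-based fix (hypothetical names; this is not Kafka's actual code, which is Scala):

```python
import threading

# Stand-in for the ZK connection whose disappearance triggers the NPE.
class Connection:
    def write_data(self, path, value):
        return "ok"

class Consumer:
    def __init__(self):
        self._conn = Connection()
        self._lock = threading.Lock()

    def shutdown(self):
        # Without the lock, this can run between the commit thread's
        # None-check and its write_data() call, producing an NPE-style error.
        with self._lock:
            self._conn = None

    def commit_offsets(self):
        with self._lock:
            if self._conn is None:
                return False  # connection already torn down; skip the commit
            return self._conn.write_data("/consumers/offsets", 42) == "ok"

consumer = Consumer()
assert consumer.commit_offsets() is True
consumer.shutdown()
assert consumer.commit_offsets() is False
```

Holding one lock across both the teardown and the check-plus-use makes the interleaving impossible; the commit thread then sees either a live connection or a clean "already shut down" signal, never a half-torn-down state.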


2013/03/25 12:08:32.020 WARN [ZookeeperConsumerConnector] [] 
0_lu-ml-test10.bj-1364184411339-7c88f710 exception during commitOffsets
java.lang.NullPointerException
at org.I0Itec.zkclient.ZkConnection.writeData(ZkConnection.java:111)
at org.I0Itec.zkclient.ZkClient$10.call(ZkClient.java:813)
at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
at org.I0Itec.zkclient.ZkClient.writeData(ZkClient.java:809)
at org.I0Itec.zkclient.ZkClient.writeData(ZkClient.java:777)
at kafka.utils.ZkUtils$.updatePersistentPath(ZkUtils.scala:103)
at 
kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2$$anonfun$apply$4.apply(ZookeeperConsumerConnector.scala:251)
at 
kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2$$anonfun$apply$4.apply(ZookeeperConsumerConnector.scala:248)
at scala.collection.Iterator$class.foreach(Iterator.scala:631)
at 
scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:549)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
at 
scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:570)
at 
kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2.apply(ZookeeperConsumerConnector.scala:248)
at 
kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2.apply(ZookeeperConsumerConnector.scala:246)
at scala.collection.Iterator$class.foreach(Iterator.scala:631)
at kafka.utils.Pool$$anon$1.foreach(Pool.scala:53)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
at kafka.utils.Pool.foreach(Pool.scala:24)
at 
kafka.consumer.ZookeeperConsumerConnector.commitOffsets(ZookeeperConsumerConnector.scala:246)
at 
kafka.consumer.ZookeeperConsumerConnector.autoCommit(ZookeeperConsumerConnector.scala:232)
at 
kafka.consumer.ZookeeperConsumerConnector$$anonfun$1.apply$mcV$sp(ZookeeperConsumerConnector.scala:126)
at kafka.utils.Utils$$anon$2.run(Utils.scala:58)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at 
java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (KAFKA-813) Minor cleanup in Controller

2013-03-25 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-813.
---

Resolution: Fixed

Thanks for patch v4. +1 and committed to 0.8.

 Minor cleanup in Controller
 ---

 Key: KAFKA-813
 URL: https://issues.apache.org/jira/browse/KAFKA-813
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike
Priority: Blocker
  Labels: kafka-0.8
 Fix For: 0.8

 Attachments: kafka-813-v1.patch, kafka-813-v2.patch, 
 kafka-813-v3.patch, kafka-813-v4.patch


 Before starting work on delete topic support, uploading a patch first to 
 address some minor hiccups that touch a bunch of files:
 1. Change PartitionOfflineException to PartitionUnavailableException, because 
 in the partition state machine we mark a partition offline when its leader is 
 down, whereas PartitionOfflineException is thrown when all the assigned 
 replicas of the partition are down.
 2. Change PartitionOfflineRate to UnavailablePartitionRate.
 3. Remove the default leader selector from the partition state machine's 
 handleStateChange. We can pass null when we don't need a leader selector.
 4. Include controller info in the client id of LeaderAndIsrRequest.
 5. Rename controllerContext.allleaders to something more meaningful: 
 partitionLeadershipInfo.
 6. We don't need to put the partition in the OnlinePartition state in the 
 partition state machine's initializeLeaderAndIsrForPartition; the state change 
 occurs in handleStateChange.
 7. Add a todo in handleStateChanges.
 8. Left a comment above ReassignedPartitionLeaderSelector noting that 
 reassigned replicas are already in the ISR (this is not true for other leader 
 selectors); renamed the vals in the selector.



[jira] [Updated] (KAFKA-813) Minor cleanup in Controller

2013-03-25 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-813:


Component/s: controller

 Minor cleanup in Controller
 ---

 Key: KAFKA-813
 URL: https://issues.apache.org/jira/browse/KAFKA-813
 Project: Kafka
  Issue Type: Bug
  Components: controller
Affects Versions: 0.8
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike
Priority: Blocker
  Labels: kafka-0.8
 Fix For: 0.8

 Attachments: kafka-813-v1.patch, kafka-813-v2.patch, 
 kafka-813-v3.patch, kafka-813-v4.patch


 Before starting work on delete topic support, uploading a patch first to 
 address some minor hiccups that touch a bunch of files:
 1. Change PartitionOfflineException to PartitionUnavailableException, because 
 in the partition state machine we mark a partition offline when its leader is 
 down, whereas PartitionOfflineException is thrown when all the assigned 
 replicas of the partition are down.
 2. Change PartitionOfflineRate to UnavailablePartitionRate.
 3. Remove the default leader selector from the partition state machine's 
 handleStateChange. We can pass null when we don't need a leader selector.
 4. Include controller info in the client id of LeaderAndIsrRequest.
 5. Rename controllerContext.allleaders to something more meaningful: 
 partitionLeadershipInfo.
 6. We don't need to put the partition in the OnlinePartition state in the 
 partition state machine's initializeLeaderAndIsrForPartition; the state change 
 occurs in handleStateChange.
 7. Add a todo in handleStateChanges.
 8. Left a comment above ReassignedPartitionLeaderSelector noting that 
 reassigned replicas are already in the ISR (this is not true for other leader 
 selectors); renamed the vals in the selector.



[jira] [Closed] (KAFKA-813) Minor cleanup in Controller

2013-03-25 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede closed KAFKA-813.
---


 Minor cleanup in Controller
 ---

 Key: KAFKA-813
 URL: https://issues.apache.org/jira/browse/KAFKA-813
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike
Priority: Blocker
  Labels: kafka-0.8
 Fix For: 0.8

 Attachments: kafka-813-v1.patch, kafka-813-v2.patch, 
 kafka-813-v3.patch, kafka-813-v4.patch


 Before starting work on delete topic support, uploading a patch first to 
 address some minor hiccups that touch a bunch of files:
 1. Change PartitionOfflineException to PartitionUnavailableException, because 
 in the partition state machine we mark a partition offline when its leader is 
 down, whereas PartitionOfflineException is thrown when all the assigned 
 replicas of the partition are down.
 2. Change PartitionOfflineRate to UnavailablePartitionRate.
 3. Remove the default leader selector from the partition state machine's 
 handleStateChange. We can pass null when we don't need a leader selector.
 4. Include controller info in the client id of LeaderAndIsrRequest.
 5. Rename controllerContext.allleaders to something more meaningful: 
 partitionLeadershipInfo.
 6. We don't need to put the partition in the OnlinePartition state in the 
 partition state machine's initializeLeaderAndIsrForPartition; the state change 
 occurs in handleStateChange.
 7. Add a todo in handleStateChanges.
 8. Left a comment above ReassignedPartitionLeaderSelector noting that 
 reassigned replicas are already in the ISR (this is not true for other leader 
 selectors); renamed the vals in the selector.



[jira] [Created] (KAFKA-825) KafkaController.isActive() needs to be synchronized

2013-03-25 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-825:
-

 Summary: KafkaController.isActive() needs to be synchronized
 Key: KAFKA-825
 URL: https://issues.apache.org/jira/browse/KAFKA-825
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Jun Rao
Priority: Blocker
 Attachments: kafka-825.patch

KafkaController.isActive() is not synchronized right now. This means that it 
could read an outdated controllerContext.controllerChannelManager.
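The pattern behind the proposed fix can be illustrated with a small, hypothetical Python sketch (the real code is Scala, presumably guarded with the controller's `synchronized` lock; all names below are invented for illustration): the read in `isActive()` takes the same lock that guards writes to the controller context, so it can never observe a half-updated channel manager reference.

```python
import threading

class ControllerContext:
    def __init__(self):
        # Set when this broker wins the controller election, cleared on resignation.
        self.controller_channel_manager = None

class KafkaController:
    def __init__(self):
        self.context = ControllerContext()
        self._lock = threading.Lock()

    def elect(self):
        with self._lock:
            self.context.controller_channel_manager = object()

    def resign(self):
        with self._lock:
            self.context.controller_channel_manager = None

    def is_active(self):
        # The fix: read under the same lock that guards writes, so the check
        # cannot interleave with election or resignation.
        with self._lock:
            return self.context.controller_channel_manager is not None

c = KafkaController()
assert c.is_active() is False
c.elect()
assert c.is_active() is True
c.resign()
assert c.is_active() is False
```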



[jira] [Updated] (KAFKA-825) KafkaController.isActive() needs to be synchronized

2013-03-25 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-825:
--

Status: Patch Available  (was: Open)

 KafkaController.isActive() needs to be synchronized
 ---

 Key: KAFKA-825
 URL: https://issues.apache.org/jira/browse/KAFKA-825
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Jun Rao
Priority: Blocker
 Attachments: kafka-825.patch


 KafkaController.isActive() is not synchronized right now. This means that it 
 could read an outdated controllerContext.controllerChannelManager.



[jira] [Updated] (KAFKA-825) KafkaController.isActive() needs to be synchronized

2013-03-25 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-825:
--

Attachment: kafka-825.patch

Attached a patch.

 KafkaController.isActive() needs to be synchronized
 ---

 Key: KAFKA-825
 URL: https://issues.apache.org/jira/browse/KAFKA-825
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Jun Rao
Priority: Blocker
 Attachments: kafka-825.patch


 KafkaController.isActive() is not synchronized right now. This means that it 
 could read an outdated controllerContext.controllerChannelManager.



[jira] [Updated] (KAFKA-825) KafkaController.isActive() needs to be synchronized

2013-03-25 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-825:
--

   Resolution: Fixed
Fix Version/s: 0.8
   Status: Resolved  (was: Patch Available)

Thanks for the review. Committed to 0.8.

 KafkaController.isActive() needs to be synchronized
 ---

 Key: KAFKA-825
 URL: https://issues.apache.org/jira/browse/KAFKA-825
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Jun Rao
Priority: Blocker
 Fix For: 0.8

 Attachments: kafka-825.patch


 KafkaController.isActive() is not synchronized right now. This means that it 
 could read an outdated controllerContext.controllerChannelManager.



[jira] [Closed] (KAFKA-825) KafkaController.isActive() needs to be synchronized

2013-03-25 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede closed KAFKA-825.
---


 KafkaController.isActive() needs to be synchronized
 ---

 Key: KAFKA-825
 URL: https://issues.apache.org/jira/browse/KAFKA-825
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Jun Rao
Priority: Blocker
 Fix For: 0.8

 Attachments: kafka-825.patch


 KafkaController.isActive() is not synchronized right now. This means that it 
 could read an outdated controllerContext.controllerChannelManager.



[jira] [Updated] (KAFKA-791) Fix validation bugs in System Test

2013-03-25 Thread John Fung (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Fung updated KAFKA-791:


Attachment: kafka-791-v4.patch

Uploaded kafka-791-v4.patch with the following changes:

1. Added system_test_utils.diff_list to check whether 2 lists are identical.
2. validate_simple_consumer_data_matched_across_replicas will use diff_list to 
make sure all messages received are identical and in the same order.
3. Removed the function validate_simple_consumer_data_matched from 
kafka_system_test_utils.py as cleanup.
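A hypothetical sketch of what a diff_list helper like the one described above could look like (the actual implementation in system_test_utils may differ): it must be order-sensitive and must not dedup, so that replicas with the same messages in a different order still fail validation.

```python
def diff_list(list_a, list_b):
    """Return positional differences between two ordered lists.

    An empty result means the lists are identical (same items, same order).
    No dedup is performed: repeated MessageIDs are compared position by position.
    """
    diffs = []
    for i in range(max(len(list_a), len(list_b))):
        a = list_a[i] if i < len(list_a) else None
        b = list_b[i] if i < len(list_b) else None
        if a != b:
            diffs.append((i, a, b))
    return diffs

# Identical ordered lists -> no diff.
assert diff_list([1, 2, 2, 3], [1, 2, 2, 3]) == []
# Same items, different order -> flagged, since order matters across replicas.
assert diff_list([1, 2, 3], [1, 3, 2]) != []
```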

 Fix validation bugs in System Test
 --

 Key: KAFKA-791
 URL: https://issues.apache.org/jira/browse/KAFKA-791
 Project: Kafka
  Issue Type: Task
Reporter: John Fung
Assignee: John Fung
  Labels: replication-testing
 Attachments: kafka-791-v1.patch, kafka-791-v2.patch, 
 kafka-791-v3.patch, kafka-791-v4.patch


 The following issues were found in data / log checksum matching in System Test:
 1. kafka_system_test_utils.validate_simple_consumer_data_matched
 It reports PASSED even when some log segments don't match.
 2. kafka_system_test_utils.validate_data_matched (this has been fixed and 
 patched in local Hudson for some time)
 It reports PASSED in the Ack=1 cases even when data loss is greater than the 
 tolerance (1%).
 3. kafka_system_test_utils.validate_simple_consumer_data_matched
 It gets a unique set of MessageIDs to validate. It should leave all MessageIDs 
 as-is (no dedup needed), and the test case should fail if the sorted MessageIDs 
 don't match across the replicas.
 4. There is a data loss tolerance of 1% in the Ack=1 test cases. Currently 1% 
 is too strict, and we are seeing some random failures due to 2-3% data loss. 
 It will be increased to 5% so that the System Test gets a more consistent 
 passing rate in those test cases. The following will be updated to 5% 
 tolerance in kafka_system_test_utils:
 validate_data_matched
 validate_simple_consumer_data_matched
 validate_data_matched_in_multi_topics_from_single_consumer_producer



Any update on the distributed commit problem?

2013-03-25 Thread Darren Sargent
This is where you are reading messages from a broker, doing something with the 
messages, then committing them to some permanent storage such as HBase. There is 
a race condition in committing the offsets to Zookeeper: if the DB write 
succeeds but the ZK commit fails for any reason, you'll get a duplicate batch 
the next time you query the broker. If you commit to ZK first and the commit to 
the DB then fails, you lose data.

The Kafka white paper mentions that Kafka stays agnostic about the distributed 
commit problem. There has been some prior discussion about this, but I haven't 
seen any solid solutions. If you're using something like PostgreSQL that 
supports two-phase commit, you can roll the offset into the DB transaction, 
assuming you're okay with storing offsets in the DB rather than in ZK, but 
that's not a general solution.

Is there anything in Kafka 0.8.x that helps address this issue?

--Darren Sargent
RichRelevance (www.richrelevance.com)


[jira] Subscription: outstanding kafka patches

2013-03-25 Thread jira
Issue Subscription
Filter: outstanding kafka patches (61 issues)
The list of outstanding kafka patches
Subscriber: kafka-mailing-list

Key Summary
KAFKA-823   merge 0.8 (51421fcc0111031bb77f779a6f6c00520d526a34) to trunk
https://issues.apache.org/jira/browse/KAFKA-823
KAFKA-815   Improve SimpleConsumerShell to take in a max messages config option
https://issues.apache.org/jira/browse/KAFKA-815
KAFKA-808   Migration tool internal queue between consumer and producer threads 
should be configurable
https://issues.apache.org/jira/browse/KAFKA-808
KAFKA-791   Fix validation bugs in System Test
https://issues.apache.org/jira/browse/KAFKA-791
KAFKA-745   Remove getShutdownReceive() and other kafka specific code from the 
RequestChannel
https://issues.apache.org/jira/browse/KAFKA-745
KAFKA-739   Handle null values in Message payload
https://issues.apache.org/jira/browse/KAFKA-739
KAFKA-735   Add looping and JSON output for ConsumerOffsetChecker
https://issues.apache.org/jira/browse/KAFKA-735
KAFKA-733   Fat jar option for build, or override for ivy cache location 
https://issues.apache.org/jira/browse/KAFKA-733
KAFKA-717   scala 2.10 build support
https://issues.apache.org/jira/browse/KAFKA-717
KAFKA-705   Controlled shutdown doesn't seem to work on more than one broker in 
a cluster
https://issues.apache.org/jira/browse/KAFKA-705
KAFKA-686   0.8 Kafka broker should give a better error message when running 
against 0.7 zookeeper
https://issues.apache.org/jira/browse/KAFKA-686
KAFKA-682   java.lang.OutOfMemoryError: Java heap space
https://issues.apache.org/jira/browse/KAFKA-682
KAFKA-677   Retention process gives exception if an empty segment is chosen for 
collection
https://issues.apache.org/jira/browse/KAFKA-677
KAFKA-674   Clean Shutdown Testing - Log segments checksums mismatch
https://issues.apache.org/jira/browse/KAFKA-674
KAFKA-652   Create testcases for clean shut-down
https://issues.apache.org/jira/browse/KAFKA-652
KAFKA-645   Create a shell script to run System Test with DEBUG details and 
tee console output to a file
https://issues.apache.org/jira/browse/KAFKA-645
KAFKA-637   Separate log4j environment variable from KAFKA_OPTS in 
kafka-run-class.sh
https://issues.apache.org/jira/browse/KAFKA-637
KAFKA-621   System Test 9051 : ConsoleConsumer doesn't receives any data for 20 
topics but works for 10
https://issues.apache.org/jira/browse/KAFKA-621
KAFKA-607   System Test Transient Failure (case 4011 Log Retention) - 
ConsoleConsumer receives less data
https://issues.apache.org/jira/browse/KAFKA-607
KAFKA-606   System Test Transient Failure (case 0302 GC Pause) - Log segments 
mismatched across replicas
https://issues.apache.org/jira/browse/KAFKA-606
KAFKA-598   decouple fetch size from max message size
https://issues.apache.org/jira/browse/KAFKA-598
KAFKA-583   SimpleConsumerShell may receive less data inconsistently
https://issues.apache.org/jira/browse/KAFKA-583
KAFKA-552   No error messages logged for those failing-to-send messages from 
Producer
https://issues.apache.org/jira/browse/KAFKA-552
KAFKA-547   The ConsumerStats MBean name should include the groupid
https://issues.apache.org/jira/browse/KAFKA-547
KAFKA-530   kafka.server.KafkaApis: kafka.common.OffsetOutOfRangeException
https://issues.apache.org/jira/browse/KAFKA-530
KAFKA-493   High CPU usage on inactive server
https://issues.apache.org/jira/browse/KAFKA-493
KAFKA-479   ZK EPoll taking 100% CPU usage with Kafka Client
https://issues.apache.org/jira/browse/KAFKA-479
KAFKA-465   Performance test scripts - refactoring leftovers from tools to perf 
package
https://issues.apache.org/jira/browse/KAFKA-465
KAFKA-438   Code cleanup in MessageTest
https://issues.apache.org/jira/browse/KAFKA-438
KAFKA-419   Updated PHP client library to support kafka 0.7+
https://issues.apache.org/jira/browse/KAFKA-419
KAFKA-414   Evaluate mmap-based writes for Log implementation
https://issues.apache.org/jira/browse/KAFKA-414
KAFKA-411   Message Error in high cocurrent environment
https://issues.apache.org/jira/browse/KAFKA-411
KAFKA-404   When using chroot path, create chroot on startup if it doesn't exist
https://issues.apache.org/jira/browse/KAFKA-404
KAFKA-399   0.7.1 seems to show less performance than 0.7.0
https://issues.apache.org/jira/browse/KAFKA-399
KAFKA-398   Enhance SocketServer to Enable Sending Requests
https://issues.apache.org/jira/browse/KAFKA-398
KAFKA-397   kafka.common.InvalidMessageSizeException: null
https://issues.apache.org/jira/browse/KAFKA-397
KAFKA-388   Add a highly available consumer 

Re: Any update on the distributed commit problem?

2013-03-25 Thread Neha Narkhede
Today, the only safe way of controlling consumer state management is
by using the SimpleConsumer. The application is responsible for
checkpointing offsets. So, in your example, when you commit a database
transaction, you can store your consumer's offset as part of the txn.
That way, either your txn succeeds and the offset moves ahead, or your
txn fails and the offset stays where it is.
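The pattern Neha describes can be sketched as follows, using SQLite for brevity (table and column names here are invented; any transactional store works the same way): the consumed records and the new offset are written in one transaction, so either both commit or neither does, eliminating the duplicate/loss window between the DB write and a separate ZK commit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (msg_offset INTEGER PRIMARY KEY, payload TEXT)")
conn.execute(
    "CREATE TABLE consumer_offsets (topic_partition TEXT PRIMARY KEY, next_offset INTEGER)"
)
conn.execute("INSERT INTO consumer_offsets VALUES ('topic-0', 0)")
conn.commit()

def process_batch(conn, topic_partition, batch):
    """batch: list of (offset, payload) pairs as fetched from the broker."""
    with conn:  # one transaction: the data and the offset move together
        conn.executemany("INSERT INTO events VALUES (?, ?)", batch)
        next_offset = batch[-1][0] + 1
        conn.execute(
            "UPDATE consumer_offsets SET next_offset = ? WHERE topic_partition = ?",
            (next_offset, topic_partition),
        )

process_batch(conn, "topic-0", [(0, "a"), (1, "b")])
row = conn.execute(
    "SELECT next_offset FROM consumer_offsets WHERE topic_partition = 'topic-0'"
).fetchone()
assert row[0] == 2
```

On restart, the consumer reads next_offset from the same store and resumes fetching from there; if the transaction failed, the offset never advanced, so the batch is simply refetched.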

Kafka 0.9 is when we will attempt to merge the high level and low
level consumer APIs, move the offset management to the broker and
offer stronger offset checkpointing guarantees.

Thanks,
Neha

On Mon, Mar 25, 2013 at 11:36 AM, Darren Sargent
dsarg...@richrelevance.com wrote:
 This is where you are reading messages from a broker, doing something with 
 the messages, then committing them to some permanent storage such as HBase. 
 There is a race condition in committing the offsets to Zookeeper: if the DB 
 write succeeds but the ZK commit fails for any reason, you'll get a duplicate 
 batch the next time you query the broker. If you commit to ZK first and the 
 commit to the DB then fails, you lose data.

 The Kafka white paper mentions that Kafka stays agnostic about the 
 distributed commit problem. There has been some prior discussion about this, 
 but I haven't seen any solid solutions. If you're using something like 
 PostgreSQL that supports two-phase commit, you can roll the offset into the 
 DB transaction, assuming you're okay with storing offsets in the DB rather 
 than in ZK, but that's not a general solution.

 Is there anything in Kafka 0.8.x that helps address this issue?

 --Darren Sargent
 RichRelevance (www.richrelevance.com)


[jira] [Commented] (KAFKA-133) Publish kafka jar to a public maven repository

2013-03-25 Thread Darren Sargent (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613096#comment-13613096
 ] 

Darren Sargent commented on KAFKA-133:
--

This ticket is marked as fixed, but the latest on the 0.8 branch still depends 
on an unpublished version of Yammer - this is from Build.scala:

  <dependency>
    <groupId>com.yammer.metrics</groupId>
    <artifactId>metrics-core</artifactId>
    <version>3.0.0-c0c8be71</version>
    <scope>compile</scope>
  </dependency>
  <dependency>
    <groupId>com.yammer.metrics</groupId>
    <artifactId>metrics-annotations</artifactId>
    <version>3.0.0-c0c8be71</version>
    <scope>compile</scope>
  </dependency>

Maven Central doesn't have any version like the one mentioned above: 
http://repo2.maven.org/maven2/com/yammer/metrics/metrics-core/

 Publish kafka jar to a public maven repository
 --

 Key: KAFKA-133
 URL: https://issues.apache.org/jira/browse/KAFKA-133
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.6, 0.8
Reporter: Neha Narkhede
  Labels: patch
 Fix For: 0.8

 Attachments: KAFKA-133.patch, pom.xml


 The released kafka jar must be downloaded manually and then deployed to a 
 private repository before it can be used by a developer using maven2.
 Similar to other Apache projects, it would be nice to have a way to publish 
 Kafka releases to a public maven repo. 
 In the past, we gave it a try using sbt publish to the Sonatype Nexus maven 
 repo, but ran into some authentication problems. It would be good to revisit 
 this and get it resolved.



[jira] [Created] (KAFKA-826) Make Kafka 0.8 depend on metrics 2.2.0 instead of 3.x

2013-03-25 Thread Neha Narkhede (JIRA)
Neha Narkhede created KAFKA-826:
---

 Summary: Make Kafka 0.8 depend on metrics 2.2.0 instead of 3.x
 Key: KAFKA-826
 URL: https://issues.apache.org/jira/browse/KAFKA-826
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8
Reporter: Neha Narkhede
Priority: Blocker


In order to mavenize Kafka 0.8, we have to depend on metrics 2.2.0, since 
metrics 3.x is a huge change and is not an officially supported release.



[jira] [Commented] (KAFKA-826) Make Kafka 0.8 depend on metrics 2.2.0 instead of 3.x

2013-03-25 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613192#comment-13613192
 ] 

Scott Carey commented on KAFKA-826:
---

Thank you!  We will be able to test and validate this quickly once there is a 
patch.

Metrics 3.0.x has hit its first snapshot recently:
https://groups.google.com/forum/#!topic/metrics-user/c4sPUhLjHEQ

However, it looks like it won't be done in time for Kafka 0.8. At least it no 
longer conflicts with a copy of 2.2.x as badly as it did a couple of months ago.

 Make Kafka 0.8 depend on metrics 2.2.0 instead of 3.x
 -

 Key: KAFKA-826
 URL: https://issues.apache.org/jira/browse/KAFKA-826
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8
Reporter: Neha Narkhede
Priority: Blocker
  Labels: kafka-0.8

 In order to mavenize Kafka 0.8, we have to depend on metrics 2.2.0, since 
 metrics 3.x is a huge change and is not an officially supported release.



Re: Any update on the distributed commit problem?

2013-03-25 Thread Scott Carey
To succeed with two-phase commit in a client, one needs the coordinates of the
current batch of records from the client's perspective (which may span many
smaller SimpleConsumer batches). I believe this is the range of offsets plus any
partition information that describes the batch uniquely. One risk I worry about
is someone restarting a consumer with a different configured batch size after
such a race condition is triggered, causing the batch to appear to be 'after'
what has been committed but actually overlap with it -- but perhaps this fear
is only due to the limitations and opaqueness of the SimpleConsumer in 0.7.x
and an incomplete understanding of what can be done with the 0.8.x version.

Is the offset alone from the SimpleConsumer in 0.8 sufficient to build
two-phase commit in a consumer application with multiple partitions? Are
there any additional requirements, such as committing on certain offset
boundaries that align with boundaries elsewhere?

I'm not afraid of building my own two-phase commit, but I am afraid of not
having all the information I need from Kafka to succeed in the attempt.



On 3/25/13 12:33 PM, Neha Narkhede neha.narkh...@gmail.com wrote:

Today, the only safe way of controlling consumer state management is
by using the SimpleConsumer. The application is responsible for
checkpointing offsets. So, in your example, when you commit a database
transaction, you can store your consumer's offset as part of the txn.
That way, either your txn succeeds and the offset moves ahead, or your
txn fails and the offset stays where it is.

Kafka 0.9 is when we will attempt to merge the high level and low
level consumer APIs, move the offset management to the broker and
offer stronger offset checkpointing guarantees.

Thanks,
Neha

On Mon, Mar 25, 2013 at 11:36 AM, Darren Sargent
dsarg...@richrelevance.com wrote:
 This is where you are reading messages from a broker, doing something
with the messages, then committing them to some permanent storage such as
HBase. There is a race condition in committing the offsets to Zookeeper:
if the DB write succeeds but the ZK commit fails for any reason, you'll
get a duplicate batch the next time you query the broker. If you commit to
ZK first and the commit to the DB then fails, you lose data.

 The Kafka white paper mentions that Kafka stays agnostic about the
distributed commit problem. There has been some prior discussion about
this, but I haven't seen any solid solutions. If you're using something
like PostgreSQL that supports two-phase commit, you can roll the offset
into the DB transaction, assuming you're okay with storing offsets in
the DB rather than in ZK, but that's not a general solution.

 Is there anything in Kafka 0.8.x that helps address this issue?

 --Darren Sargent
 RichRelevance (www.richrelevance.com)



kafka pull request: zkclient and scalatest library updates

2013-03-25 Thread polymorphic
GitHub user polymorphic opened a pull request:

https://github.com/apache/kafka/pull/3

zkclient and scalatest library updates

Following https://issues.apache.org/jira/browse/KAFKA-826 I forked the code 
and included fixes for 2 bugs I reported, 
https://issues.apache.org/jira/browse/KAFKA-807 and 
https://issues.apache.org/jira/browse/KAFKA-809

All tests pass except kafka.log.LogTest, which fails on my Mac--I don't 
think it is related to the zkclient fix, but I could be wrong.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/polymorphic/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3.patch


commit 9699033b90f911dc6c9d7b2b8a733114b84b080a
Author: Dragos Manolescu dragos.manole...@service-now.com
Date:   2013-03-25T22:22:28Z

Initial commit after fork

commit 7183a5cd046f25c2eb77efb99a5a2da4d1baadef
Author: Dragos Manolescu dragos.manole...@service-now.com
Date:   2013-03-25T23:04:13Z

Updated zkclient dependency to 0.2

commit 1bf91504f4d16f2343e7003f27f316d33888c931
Author: Dragos Manolescu dragos.manole...@service-now.com
Date:   2013-03-25T23:34:10Z

Fixed typo in ConsoleProducer; updated scalatest version for 2.9.2 (if 
required)





Re: kafka pull request: zkclient and scalatest library updates

2013-03-25 Thread Neha Narkhede
Thanks for the pull request. Do you mind attaching the patches to the
respective JIRAs? That is how we review and accept patches in Kafka.

-Neha

On Mon, Mar 25, 2013 at 4:37 PM, polymorphic g...@git.apache.org wrote:
 GitHub user polymorphic opened a pull request:

 https://github.com/apache/kafka/pull/3

 zkclient and scalatest library updates

 Following https://issues.apache.org/jira/browse/KAFKA-826 I forked the 
 code and included fixes for 2 bugs I reported, 
 https://issues.apache.org/jira/browse/KAFKA-807 and 
 https://issues.apache.org/jira/browse/KAFKA-809

 All tests pass except kafka.log.LogTest, which fails on my Mac--I don't 
 think it is related to the zkclient fix, but I could be wrong.

 You can merge this pull request into a Git repository by running:

 $ git pull https://github.com/polymorphic/kafka trunk

 Alternatively you can review and apply these changes as the patch at:

 https://github.com/apache/kafka/pull/3.patch

 
 commit 9699033b90f911dc6c9d7b2b8a733114b84b080a
 Author: Dragos Manolescu dragos.manole...@service-now.com
 Date:   2013-03-25T22:22:28Z

 Initial commit after fork

 commit 7183a5cd046f25c2eb77efb99a5a2da4d1baadef
 Author: Dragos Manolescu dragos.manole...@service-now.com
 Date:   2013-03-25T23:04:13Z

 Updated zkclient dependency to 0.2

 commit 1bf91504f4d16f2343e7003f27f316d33888c931
 Author: Dragos Manolescu dragos.manole...@service-now.com
 Date:   2013-03-25T23:34:10Z

 Fixed typo in ConsoleProducer; updated scalatest version for 2.9.2 (if 
 required)