Consumer Events Problem
I am working on a small project and discovered that our consumer hasn't been executed for over a month now. How can I check the unprocessed events? From which date are the events available, and what is the retention policy on the producer side? Thanks, Sanket Maru
Re: Consumer Events Problem
What version are you running? Philip On Mon, Dec 9, 2013 at 4:30 AM, Sanket Maru san...@ecinity.com wrote: ...
Re: Consumer Events Problem
I am using Kafka 0.8.0. On Mon, Dec 9, 2013 at 6:09 PM, Philip O'Toole phi...@loggly.com wrote: What version are you running? Philip ...
Re: Consumer Events Problem
OK, I am only familiar with 0.7.2. Philip On Mon, Dec 9, 2013 at 4:54 AM, Sanket Maru san...@ecinity.com wrote: I am using Kafka 0.8.0. ...
Re: How do you keep track of offset in a partition
Hi, How do I know where to start consuming from if we have already consumed a few messages, somewhere in between the latest and the earliest? How can I identify the timestamp or offset from the API? Thanks, Bhargav --
Re: Consumer Events Problem
By default, each topic is kept on the broker for 7 days. Older data, whether consumed or not, will be deleted. To check the number of unconsumed messages, you can either use the ConsumerOffsetChecker tool (see https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Myconsumerseemstohavestopped%2Cwhy%3F) or use the MaxLag JMX metric in the consumer application (see http://kafka.apache.org/documentation.html#consumerconfigs). Thanks, Jun On Mon, Dec 9, 2013 at 4:30 AM, Sanket Maru san...@ecinity.com wrote: ...
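[Editor's sketch, not from the thread.] The ConsumerOffsetChecker tool ships with 0.8 and is run through kafka-run-class.sh. The group name, topic, and ZooKeeper address below are placeholders; substitute your own:

```shell
# Placeholders: my-consumer-group, my-topic, and localhost:2181 are
# examples only. For each partition of the topic, the tool prints the
# consumer's committed offset, the broker's log-end offset, and the
# lag between them (i.e. the unconsumed message count).
bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
  --group my-consumer-group \
  --topic my-topic \
  --zkconnect localhost:2181
```

This requires a running broker and ZooKeeper, so it is a command fragment rather than a self-contained script.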
Re: How do you keep track of offset in a partition
Consumer offsets are typically checkpointed into ZK. To check the number of unconsumed messages, you can either use the ConsumerOffsetChecker tool (see https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Myconsumerseemstohavestopped%2Cwhy%3F) or use the MaxLag JMX metric in the consumer application (see http://kafka.apache.org/documentation.html#consumerconfigs). Currently, we don't have an API to translate from an offset to a timestamp. Do you want the timestamp when the message is added to the broker? Thanks, Jun On Sun, Dec 8, 2013 at 11:57 PM, Bhargav bharga...@cleartrip.com wrote: ...
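[Editor's sketch, not from the thread.] The "where to start" decision Bhargav asks about amounts to clamping the last committed offset into the range the broker still retains: if the committed offset has aged out of retention, you must skip forward to the earliest available offset. The earliest/latest values would come from an offset request; the numbers below are made up:

```python
def resume_offset(committed, earliest, latest):
    """Pick the offset a consumer should resume from.

    If there is no committed offset, or the committed offset fell
    below `earliest` because the data was deleted by retention,
    restart at `earliest`; otherwise resume where we left off
    (never beyond `latest`, the broker's log-end offset).
    """
    if committed is None or committed < earliest:
        return earliest
    return min(committed, latest)

# A consumer that committed offset 50 while the broker now retains
# offsets 100..900 must skip forward to 100.
print(resume_offset(50, 100, 900))    # -> 100
print(resume_offset(400, 100, 900))   # -> 400
print(resume_offset(None, 100, 900))  # -> 100
```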
Re: Partition reassignment stuck
Hi, We used the code checked into this branch a few hours before the official 0.8.0 final release: https://github.com/apache/kafka/tree/0.8 So hopefully it is exactly the same code as the official release. The controller logs are empty. In a previous exchange you advised us not to use trunk unless we felt adventurous. So if you think 0.8.1 is production ready and shouldn't crash unexpectedly, we can probably give it a try. Otherwise, since our data pipeline heavily relies on Kafka, we might have to use another solution... (bring up a brand new cluster each time we want to resize the cluster?) Thanks Maxime 2013/12/8 Neha Narkhede neha.narkh...@gmail.com Unfortunately quite a few bugs with reassignment are fixed only in 0.8.1. I wonder if you can run trunk and see how that goes? Thanks, Neha On Dec 6, 2013 9:46 PM, Jun Rao jun...@gmail.com wrote: Are you using the 0.8.0 final release? Any error in the controller log? Thanks, Jun On Fri, Dec 6, 2013 at 4:38 PM, Maxime Nay maxime...@gmail.com wrote: Hi, We are trying to add a broker to a 10-node cluster. We have 7 different topics, each divided into 10 partitions, with a replication factor of 3. To send traffic to this new node, we tried the kafka-reassign-partitions.sh tool, but for some reason it doesn't work, and now we seem to be stuck in some kind of partition reassignment process. We can't see any data on our new node, and when we try to execute the partition reassignment tool for another topic it says:
Partitions reassignment failed due to Partition reassignment currently in progress for Map([AD_EVENTS,5] -> ReassignedPartitionsContext(List(2),null), [AD_EVENTS,3] -> ReassignedPartitionsContext(List(1),null), [AD_EVENTS,4] -> ReassignedPartitionsContext(List(3),null)). Aborting operation
kafka.common.AdminCommandFailedException: Partition reassignment currently in progress for Map([AD_EVENTS,5] -> ReassignedPartitionsContext(List(2),null), [AD_EVENTS,3] -> ReassignedPartitionsContext(List(1),null), [AD_EVENTS,4] -> ReassignedPartitionsContext(List(3),null)). Aborting operation
at kafka.admin.ReassignPartitionsCommand.reassignPartitions(ReassignPartitionsCommand.scala:201)
at kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:140)
at kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
Any help would be greatly appreciated! Thanks Maxime
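[Editor's sketch, not mentioned in the thread.] For diagnosis: in 0.8 the "reassignment in progress" state lives in a ZooKeeper znode, which can be inspected read-only with the ZooKeeper shell bundled with Kafka. The host and port are placeholders:

```shell
# Placeholder ZooKeeper address. The /admin/reassign_partitions znode
# holds the JSON describing whatever reassignment the controller still
# considers in progress; it is removed when the reassignment completes.
# This command only reads state; it changes nothing.
bin/zookeeper-shell.sh localhost:2181 get /admin/reassign_partitions
```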
Re: Partition reassignment stuck
I wouldn't call 0.8.1 production ready just yet. We are still in the process of deploying it at LinkedIn. Until it is ready, there isn't a good cluster expansion solution other than spinning up a new cluster. This is probably a little easier if you have a VIP in front of your kafka cluster. Thanks, Neha On Mon, Dec 9, 2013 at 9:40 AM, Maxime Nay maxime...@gmail.com wrote: ...
Re: Partition reassignment stuck
Ok, thanks for your help. When 0.8.1 is production ready, will you announce it somewhere (will you release it right away)? Thanks, Maxime 2013/12/9 Neha Narkhede neha.narkh...@gmail.com I wouldn't call 0.8.1 production ready just yet. We are still in the process of deploying it at LinkedIn. Until it is ready, there isn't a good cluster expansion solution other than spinning up a new cluster. This is probably a little easier if you have a VIP in front of your Kafka cluster. Thanks, Neha On Mon, Dec 9, 2013 at 9:40 AM, Maxime Nay maxime...@gmail.com wrote: ...
Re: storing last processed offset, recovery of failed message processing etc.
Suppose I am doing this; here's a scenario I just came up with that demonstrates #2. Someone signs up on a website, and you have to: 1. create the user profile 2. send the confirmation email 3. resize the avatar. Once a person registers on the website, I write a message to Kafka. Now I have three different things to process (1, 2, 3). If I get to #2 and then the server loses power, on replay I will send the confirmation email twice. Sure, in this case it's not that big of a deal, but just pretend it is. What should be done? I guess I then have to keep track of state per step in ZK, right? I mean, that's the only way, so I guess I am answering my own question, but I was hoping people with real-life experience would chime in. I could write 3 messages to Kafka, but maybe order is important :) On Mon, Dec 9, 2013 at 3:31 PM, Philip O'Toole phi...@loggly.com wrote: We use Zookeeper, as is standard with Kafka. Our systems are idempotent, so we only store offsets when the message is fully processed. If this means we occasionally replay a message due to some corner case, or simply a restart, it doesn't matter. Philip On Mon, Dec 9, 2013 at 12:28 PM, S Ahmed sahmed1...@gmail.com wrote: I was hoping people could comment on how they handle the following scenarios: 1. Storing the last successfully processed messageId/offset. Are people using mysql, redis, etc.? What are the tradeoffs here? 2. How do you handle recovering from an error while processing a given event? There are various scenarios for #2, like: 1. Do you mark the start of processing a message somewhere, then update the status to complete, and THEN update the last message processed for #1? 2. Do you only mark the status as complete, and not the start of processing it? I guess this depends on whether there are intermediate steps and whether processing the entire message again would result in some duplicated work, right?
Re: storing last processed offset, recovery of failed message processing etc.
You might look at Curator http://curator.apache.org/ On Mon, Dec 9, 2013 at 12:36 PM, S Ahmed sahmed1...@gmail.com wrote: ...
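[Editor's sketch, not from the thread.] The per-step bookkeeping discussed above can be sketched with a durable set of "step done" markers (here a plain Python set standing in for ZooKeeper nodes or a database table). On replay after a crash, steps already marked done are skipped, so the confirmation email is not sent twice:

```python
def process_signup(user_id, steps, completed):
    """Run each signup step at most once per user.

    `completed` is a durable set of (user_id, step_name) markers,
    a stand-in for ZooKeeper znodes or a database table. On replay,
    steps already marked done are skipped.
    """
    for name, action in steps:
        if (user_id, name) in completed:
            continue                     # already done in a previous run
        action(user_id)                  # do the work
        completed.add((user_id, name))   # then mark it done

sent = []  # records every confirmation email "sent"
steps = [
    ("create_profile", lambda u: None),
    ("send_confirmation", lambda u: sent.append(u)),
    ("resize_avatar", lambda u: None),
]
completed = set()
process_signup("u1", steps, completed)
process_signup("u1", steps, completed)  # replay: every step skipped
print(sent)  # -> ['u1']
```

Note the write order: marking a step done only after it runs gives at-least-once semantics. If the process dies between the action and the marker, a replay repeats that one step, which is the same tradeoff Philip describes and why idempotent step handlers still matter.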
Migration Process from Kafka 0.7 to 0.8
Hi, I am working on a migration plan for moving from Kafka 0.7 to Kafka 0.8, and came across the webpage below, which was last edited on April 26, 2013. Is this the most up-to-date information regarding the migration process? And are there other useful webpages? https://cwiki.apache.org/confluence/display/KAFKA/Migrating+from+0.7+to+0.8#Migratingfrom0.7to0.8-IsshallowiterationsupportedfortheMigrationtool%27sconsumerconfig%3F -- Arnaud Lawson, Systems Operations Engineer, Sociocast
Re: Partition reassignment stuck
We will announce it on this mailing list. It is probably a month away from a release. Thanks, Neha On Mon, Dec 9, 2013 at 12:02 PM, Maxime Nay maxime...@gmail.com wrote: ...
Presentation on our use of Kafka at Flurry
Hello fellow Kafka users, I just wanted to share a presentation one of my colleagues gave at SK Planet's Tech Planet conference. It's about how Flurry transitioned from a batch MapReduce data processing system to using Kafka for continuous processing in our data ingestion pipeline. I thought some people on the list might enjoy hearing about how Kafka improved our lives. :) The video can be found here on YouTube: http://www.youtube.com/watch?v=hB1l0wFKNBA -- Ian Friedman
Re: Partition reassignment stuck
That's good to know. Thanks for your help! 2013/12/9 Neha Narkhede neha.narkh...@gmail.com We will announce it on this mailing list. It is probably a month away from a release. Thanks, Neha On Mon, Dec 9, 2013 at 12:02 PM, Maxime Nay maxime...@gmail.com wrote: ...
Re: Consumer Events Problem
For my topic the offset isn't increasing, which means the consumer has stopped. I wanted to get the count of events that still remain to be processed. Is that possible? On Mon, Dec 9, 2013 at 9:44 PM, Jun Rao jun...@gmail.com wrote: ...
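[Editor's sketch, not from the thread.] The remaining-event count being asked for is exactly what the ConsumerOffsetChecker reports as lag: per partition, the gap between the broker's log-end offset and the consumer's committed offset, summed over partitions. A minimal sketch with made-up offset values:

```python
def total_lag(log_end_offsets, consumer_offsets):
    """Sum, over partitions, how far the consumer is behind.

    Both arguments map partition id -> offset: the broker's latest
    (log-end) offset and the consumer's committed offset. A partition
    with no committed offset counts from offset 0.
    """
    lag = 0
    for partition, end in log_end_offsets.items():
        committed = consumer_offsets.get(partition, 0)
        lag += max(0, end - committed)
    return lag

# Hypothetical numbers: three partitions, consumer behind on two.
end = {0: 1000, 1: 500, 2: 750}
committed = {0: 900, 1: 500}
print(total_lag(end, committed))  # -> 850
```

Note this only counts messages still on the broker; anything past the 7-day retention window was deleted unconsumed and no longer shows up in the lag.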