Re: large amount of disk space freed on restart
We are also seeing this problem with version 0.7.1 and logs on an XFS partition. At our largest scale we can frequently free over 600GB of disk usage by simply restarting Kafka. We've examined the `lsof` output from the Kafka process and while it does appear to have FDs open for all log files on disk (even those long past read from), it does not have any files open that were previously deleted from disk. du output agrees that the on-disk size is much larger than the apparent size:

root@kafkanode-1:/raid0/kafka-logs/measures-0# du -h 242666442619.kafka
1.1G 242666442619.kafka
root@kafkanode-1:/raid0/kafka-logs/measures-0# du -h --apparent-size 242666442619.kafka
513M 242666442619.kafka

Our log size/retention policy is:

log.file.size=536870912
log.retention.hours=96

We tried dropping the caches per the Stack Overflow suggestion (sync; echo 3 > /proc/sys/vm/drop_caches) but that didn't seem to clear up the extra space. Haven't had the chance to try remounting with the allocsize option. In summary, it would be great if Kafka would close FDs to log files that haven't been read from for some period of time, if that addresses this issue. Cheers, Mike On Fri, Jul 26, 2013 at 5:03 PM, Jay Kreps jay.kr...@gmail.com wrote: Cool, good to know. On Fri, Jul 26, 2013 at 2:00 PM, Jason Rosenberg j...@squareup.com wrote: Jay, My only experience so far with this is using XFS. It appears the XFS behavior is evolving, and in fact, we see somewhat different behavior from 2 of our CentOS kernel versions in use. I've been trying to ask questions about all this on the XFS.org mailing list, but so far, having not much luck understanding how the xfs versioning correlates to CentOS versions. Anyway, yes, I think it would definitely be worth trying the solution you suggest, which would be to close the file on rotation, and re-open read-only. Or to close files after a few hours of not being accessed. If a patch for one of these approaches can be cobbled together, I'd love to test it out on our staging environment. I'd be willing to experiment with such a patch myself, although I'm not 100% sure of all the places to look (but might dive in). XFS appears to have the option of using dynamic, speculative preallocation, in which case it progressively doubles the amount of space reserved for a file as the file grows. It does do this for all open files. If the file is closed, it will then release the preallocated space not in use. It's not clear whether this releasing of space happens immediately on close, and whether re-opening the file read-only immediately will keep it from releasing space (still trying to gather more info on that). I haven't looked too much at the index files, but those too appear to have this behavior (e.g. preallocated size is always on the order of double the actual size, until the app is restarted). Jason On Fri, Jul 26, 2013 at 12:46 PM, Jay Kreps jay.kr...@gmail.com wrote: Interesting. Yes, Kafka keeps all log files open indefinitely. There is no inherent reason this needs to be the case, though; it would be possible to LRU out old file descriptors and close them if they are not accessed for a few hours and then reopen on the first access. We just haven't implemented anything like that. It would be good to understand this a little better. Does xfs pre-allocate space for all open files? Perhaps just closing the file on log roll and opening it read-only would solve the issue? Is this at all related to the use of sparse files for the indexes (i.e. RandomAccessFile.setLength(10MB) when we create the index)?
Does this affect other filesystems or just xfs? -Jay On Fri, Jul 26, 2013 at 12:42 AM, Jason Rosenberg j...@squareup.com wrote: It looks like xfs will reclaim the preallocated space for a file after it is closed. Does Kafka close a file after it has reached its max size and started writing to the next log file in sequence? Or does it keep all open until they are deleted, or the server quits (that's what it seems like)? I could imagine that it might need to keep log files open, in order to allow consumers access to them. But does it keep them open indefinitely, after there is no longer any data to be written to them, and no consumers are currently attempting to read from them? Jason On Tue, Jul 16, 2013 at 4:32 PM, Jay Kreps jay.kr...@gmail.com wrote: Interesting. Yes, it will respect whatever setting it is given for new segments created from that point on. -Jay On Tue, Jul 16, 2013 at 11:23 AM, Jason Rosenberg j...@squareup.com wrote: Ok, an update on this. It seems we are using XFS, which is available in newer versions of CentOS. It definitely does pre-allocate space as a
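For readers following the index-file discussion in this thread: the sparse-file behaviour Jay refers to comes from preallocating the offset index to a fixed length and then memory-mapping it. Below is a minimal Java sketch of that pattern; the file name, sizes, and entry layout are illustrative rather than Kafka's actual code. On most filesystems the file reports its full preallocated length while blocks are only allocated as pages are written, which is why the apparent size and the on-disk size can disagree until the file is trimmed or closed.

    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    // Illustrative "preallocate then mmap" pattern; not Kafka's real index code.
    public class SparseIndexExample {
        public static void main(String[] args) throws IOException {
            File indexFile = new File("00000000000000000000.index"); // hypothetical name
            RandomAccessFile raf = new RandomAccessFile(indexFile, "rw");
            try {
                // Reserve the full index size up front; on most filesystems this
                // creates a sparse file whose logical length is 10 MB even though
                // few (or no) blocks are actually allocated yet.
                raf.setLength(10 * 1024 * 1024);

                // Memory-map the whole region so index entries can be appended
                // without further explicit write() calls.
                MappedByteBuffer index = raf.getChannel()
                        .map(FileChannel.MapMode.READ_WRITE, 0, raf.length());
                index.putInt(0); // illustrative entry: a relative offset...
                index.putInt(0); // ...and a file position
            } finally {
                raf.close();
            }
            // ls -l now shows ~10 MB; du shows only the blocks actually written.
        }
    }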
Re: Patch for mmap + windows
Btw, I've been running this patch in our cloud env and it's been working fine so far. I actually filed another bug as I saw another problem on windows locally ( https://issues.apache.org/jira/browse/KAFKA-1036). Tim On Wed, Aug 21, 2013 at 4:29 PM, Jay Kreps jay.kr...@gmail.com wrote: That would be great! -Jay On Wed, Aug 21, 2013 at 3:13 PM, Timothy Chen tnac...@gmail.com wrote: Hi Jay, I'm planning to test run Kafka on Windows in our test environments evaluating if it's suitable for production usage. I can provide feedback with the patch how well it works and if we encounter any functional or perf problems. Tim On Wed, Aug 21, 2013 at 2:54 PM, Jay Kreps jay.kr...@gmail.com wrote: Elizabeth and I have a patch to support our memory mapped offset index files properly on Windows: https://issues.apache.org/jira/browse/KAFKA-1008 Question: Do we want this on 0.8 or trunk? I would feel more comfortable with it in trunk, but that means windows support in 0.8 is known to be broken (as opposed to not known to be broken but not known to be working either since we are not doing aggressive system testing on windows). I would feel more comfortable doing the patch on 0.8 if there was someone who would be willing to take on real load testing and/or production operation on Windows so we could have some confidence that Kafka on Windows actually works, otherwise this could just be the tip of the iceberg. Also it would be great to get review on that patch regardless of the destination. -Jay
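For context on why memory-mapped index files need special handling on Windows: KAFKA-1008 itself should be consulted for the real change, but the underlying constraint is that Windows will not let a file be renamed, resized, or deleted while a mapping on it is still outstanding, so code that manages mapped offset indexes has to release the mapping explicitly. The JVM has no supported API for that before Java 9, hence the reflective "cleaner" trick in the sketch below. File names and sizes are made up for illustration, and this is not the KAFKA-1008 patch itself.

    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.lang.reflect.Method;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    // Sketch of explicitly releasing a mapping before touching the file on Windows.
    public class UnmapBeforeRename {

        // Pre-Java-9 trick: invoke the sun.misc cleaner reflectively so the code
        // compiles without an explicit dependency on internal JDK classes.
        static void forceUnmap(MappedByteBuffer buffer) {
            try {
                Method cleanerMethod = buffer.getClass().getMethod("cleaner");
                cleanerMethod.setAccessible(true);
                Object cleaner = cleanerMethod.invoke(buffer);
                cleaner.getClass().getMethod("clean").invoke(cleaner);
            } catch (Exception e) {
                throw new RuntimeException("could not unmap buffer", e);
            }
        }

        public static void main(String[] args) throws IOException {
            File index = new File("example.index"); // hypothetical file name
            RandomAccessFile raf = new RandomAccessFile(index, "rw");
            raf.setLength(1024);
            MappedByteBuffer mmap =
                    raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, raf.length());
            mmap.putInt(42);
            mmap.force();

            forceUnmap(mmap); // without this, the rename below fails on Windows
            raf.close();
            index.renameTo(new File("example.index.renamed"));
        }
    }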
Re: large amount of disk space freed on restart
This could certainly be done. It would be slightly involved since you would need to implement some kind of file-handle cache for both indexes and log files and re-open them on demand when a read occurs. If someone wants to take a shot at this, the first step would be to get a design wiki in place on how this would work. This is potentially nice to reduce the open file count (though open files are pretty cheap). That said, this issue only impacts XFS and it seems to be fixed by that setting Jonathan found. I wonder if you could give that a try and see if it works for you too? I feel dealing with closed files does add a lot of complexity, so if there is an easy fix I would probably rather avoid it. -Jay On Mon, Sep 9, 2013 at 8:17 AM, Mike Heffner m...@librato.com wrote: We are also seeing this problem with version 0.7.1 and logs on an XFS partition. At our largest scale we can frequently free over 600GB of disk usage by simply restarting Kafka. We've examined the `lsof` output from the Kafka process and while it does appear to have FDs open for all log files on disk (even those long past read from), it does not have any files open that were previously deleted from disk. du output agrees that the on-disk size is much larger than the apparent size:

root@kafkanode-1:/raid0/kafka-logs/measures-0# du -h 242666442619.kafka
1.1G 242666442619.kafka
root@kafkanode-1:/raid0/kafka-logs/measures-0# du -h --apparent-size 242666442619.kafka
513M 242666442619.kafka

Our log size/retention policy is:

log.file.size=536870912
log.retention.hours=96

We tried dropping the caches per the Stack Overflow suggestion (sync; echo 3 > /proc/sys/vm/drop_caches) but that didn't seem to clear up the extra space. Haven't had the chance to try remounting with the allocsize option. In summary, it would be great if Kafka would close FDs to log files that haven't been read from for some period of time, if that addresses this issue. Cheers, Mike On Fri, Jul 26, 2013 at 5:03 PM, Jay Kreps jay.kr...@gmail.com wrote: Cool, good to know. On Fri, Jul 26, 2013 at 2:00 PM, Jason Rosenberg j...@squareup.com wrote: Jay, My only experience so far with this is using XFS. It appears the XFS behavior is evolving, and in fact, we see somewhat different behavior from 2 of our CentOS kernel versions in use. I've been trying to ask questions about all this on the XFS.org mailing list, but so far, having not much luck understanding how the xfs versioning correlates to CentOS versions. Anyway, yes, I think it would definitely be worth trying the solution you suggest, which would be to close the file on rotation, and re-open read-only. Or to close files after a few hours of not being accessed. If a patch for one of these approaches can be cobbled together, I'd love to test it out on our staging environment. I'd be willing to experiment with such a patch myself, although I'm not 100% sure of all the places to look (but might dive in). XFS appears to have the option of using dynamic, speculative preallocation, in which case it progressively doubles the amount of space reserved for a file as the file grows. It does do this for all open files. If the file is closed, it will then release the preallocated space not in use. It's not clear whether this releasing of space happens immediately on close, and whether re-opening the file read-only immediately will keep it from releasing space (still trying to gather more info on that). I haven't looked too much at the index files, but those too appear to have this behavior (e.g.
preallocated size is always on the order of double the actual size, until the app is restarted). Jason On Fri, Jul 26, 2013 at 12:46 PM, Jay Kreps jay.kr...@gmail.com wrote: Interesting. Yes, Kafka keeps all log files open indefinitely. There is no inherent reason this needs to be the case, though; it would be possible to LRU out old file descriptors and close them if they are not accessed for a few hours and then reopen on the first access. We just haven't implemented anything like that. It would be good to understand this a little better. Does xfs pre-allocate space for all open files? Perhaps just closing the file on log roll and opening it read-only would solve the issue? Is this at all related to the use of sparse files for the indexes (i.e. RandomAccessFile.setLength(10MB) when we create the index)? Does this affect other filesystems or just xfs? -Jay On Fri, Jul 26, 2013 at 12:42 AM, Jason Rosenberg j...@squareup.com wrote: It looks like xfs will reclaim the preallocated space for a file after it is closed. Does Kafka close a file after it has reached its max size and started writing to the next log file in sequence? Or does
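The file-handle cache Jay sketches in words above could look something like the following. This is only a rough sketch, not the design-wiki proposal he asks for: the class and method names are made up for illustration, and the eviction policy here is simply "close the least recently used handle once a size limit is exceeded" rather than the time-based ("not accessed for a few hours") variant also discussed in the thread.

    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Hypothetical LRU cache of read-only file handles for old log segments.
    // Handles are closed when the cache exceeds maxOpen and reopened on demand.
    public class SegmentHandleCache {
        private final int maxOpen;
        private final LinkedHashMap<String, RandomAccessFile> handles;

        public SegmentHandleCache(int maxOpen) {
            this.maxOpen = maxOpen;
            // accessOrder = true makes iteration order least-recently-used first
            this.handles = new LinkedHashMap<String, RandomAccessFile>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, RandomAccessFile> eldest) {
                    if (size() > SegmentHandleCache.this.maxOpen) {
                        try {
                            // Release the FD (and, on XFS, any speculative preallocation)
                            eldest.getValue().close();
                        } catch (IOException e) {
                            // ignored in the sketch; a real implementation would log this
                        }
                        return true;
                    }
                    return false;
                }
            };
        }

        // Fetch an open handle, reopening the file read-only if it was evicted.
        public synchronized RandomAccessFile get(String path) throws FileNotFoundException {
            RandomAccessFile raf = handles.get(path);
            if (raf == null) {
                raf = new RandomAccessFile(path, "r");
                handles.put(path, raf);
            }
            return raf;
        }
    }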
Re: Patch for mmap + windows
So guys, do we want to do these in 0.8? The first patch was a little involved but I think it would be good to have windows support in 0.8 and it sounds like Tim is able to get things working after these changes. -Jay On Mon, Sep 9, 2013 at 10:19 AM, Timothy Chen tnac...@gmail.com wrote: Btw, I've been running this patch in our cloud env and it's been working fine so far. I actually filed another bug as I saw another problem on windows locally ( https://issues.apache.org/jira/browse/KAFKA-1036). Tim On Wed, Aug 21, 2013 at 4:29 PM, Jay Kreps jay.kr...@gmail.com wrote: That would be great! -Jay On Wed, Aug 21, 2013 at 3:13 PM, Timothy Chen tnac...@gmail.com wrote: Hi Jay, I'm planning to test run Kafka on Windows in our test environments evaluating if it's suitable for production usage. I can provide feedback with the patch how well it works and if we encounter any functional or perf problems. Tim On Wed, Aug 21, 2013 at 2:54 PM, Jay Kreps jay.kr...@gmail.com wrote: Elizabeth and I have a patch to support our memory mapped offset index files properly on Windows: https://issues.apache.org/jira/browse/KAFKA-1008 Question: Do we want this on 0.8 or trunk? I would feel more comfortable with it in trunk, but that means windows support in 0.8 is known to be broken (as opposed to not known to be broken but not known to be working either since we are not doing aggressive system testing on windows). I would feel more comfortable doing the patch on 0.8 if there was someone who would be willing to take on real load testing and/or production operation on Windows so we could have some confidence that Kafka on Windows actually works, otherwise this could just be the tip of the iceberg. Also it would be great to get review on that patch regardless of the destination. -Jay
RE: is it possible to commit offsets on a per stream basis?
Thanks, Neha. That number of connections formula is very helpful. Regards, Libo -Original Message- From: Neha Narkhede [mailto:neha.narkh...@gmail.com] Sent: Monday, September 09, 2013 12:17 PM To: users@kafka.apache.org Subject: Re: is it possible to commit offsets on a per stream basis? Memory might become an issue if all the connectors are part of the same process. But this is easily solvable by distributing the connectors over several machines. Number of connections would be (# of connectors) * (# of brokers) and will proportionately increase with the # of connectors. Thanks, Neha On Mon, Sep 9, 2013 at 9:08 AM, Yu, Libo libo...@citi.com wrote: If one connector is used for a single stream, when there are many topics/streams, will that cause any performance issue, e.g. too many connections or too much memory or big latency? Regards, Libo -Original Message- From: Neha Narkhede [mailto:neha.narkh...@gmail.com] Sent: Sunday, September 08, 2013 12:46 PM To: users@kafka.apache.org Subject: Re: is it possible to commit offsets on a per stream basis? That should be fine too. On Sat, Sep 7, 2013 at 8:33 PM, Jason Rosenberg j...@squareup.com wrote: To be clear, it looks like I forgot to add to my question, that I am asking about creating multiple connectors, within the same consumer process (as I realize I can obviously have multiple connectors running on multiple hosts, etc.). But I'm guessing that should be fine too? Jason On Sat, Sep 7, 2013 at 3:09 PM, Neha Narkhede neha.narkh...@gmail.com wrote: Can I create multiple connectors, and have each use the same Regex for the TopicFilter? Will each connector share the set of available topics? Is this safe to do? Or is it necessary to create mutually non-intersecting regex's for each connector? As long as each of those consumer connectors share the same group id, Kafka consumer rebalancing should automatically re-distribute the topic/partitions amongst the consumer connectors/streams evenly. Thanks, Neha On Mon, Sep 2, 2013 at 1:35 PM, Jason Rosenberg j...@squareup.com wrote: Will this work if we are using a TopicFilter that can map to multiple topics? Can I create multiple connectors, and have each use the same Regex for the TopicFilter? Will each connector share the set of available topics? Is this safe to do? Or is it necessary to create mutually non-intersecting regex's for each connector? It seems I have a similar issue. I have been using auto commit mode, but it doesn't guarantee that all messages committed have been successfully processed (seems a change to the connector itself might expose a way to use auto offset commit, and have it never commit a message until it is processed). But that would be a change to the ZookeeperConsumerConnector. Essentially, it would be great if after processing each message, we could mark the message as 'processed', and thus use that status as the max offset to commit when the auto offset commit background thread wakes up each time. Jason On Thu, Aug 29, 2013 at 11:58 AM, Yu, Libo libo...@citi.com wrote: Thanks, Neha. That is a great answer. Regards, Libo -Original Message- From: Neha Narkhede [mailto:neha.narkh...@gmail.com] Sent: Thursday, August 29, 2013 1:55 PM To: users@kafka.apache.org Subject: Re: is it possible to commit offsets on a per stream basis? 1 We can create multiple connectors. From each connector create only one stream. 2 Use a single thread for a stream. In this case, the connector in each thread can commit freely without any dependence on the other threads. Is this the right way to go?
Will it introduce any deadlock when multiple connectors commit at the same time? This is a better approach as there is no complex locking involved. Thanks, Neha On Thu, Aug 29, 2013 at 10:28 AM, Yu, Libo libo...@citi.com wrote: Hi team, This is our current use case: Assume there is a topic with multiple partitions. 1 Create a connector first and create multiple streams from the connector for a topic. 2 Create multiple threads, one for each stream. You can assume the thread's job is to save the message into the database. 3 When it is time to commit offsets, all threads have to synchronize on a barrier before committing the offsets. This is to ensure no message loss in case of a process crash. As all threads need to synchronize before committing, it is not efficient. This is a workaround: 1 We can create multiple connectors. From each connector
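To make Neha's "one connector per stream" suggestion in this thread concrete, here is a rough sketch against the high-level consumer API. The property values, group id, and topic are placeholders, and the class and property names assume the 0.8 Java API (earlier releases used slightly different names), so treat this as an outline rather than tested code. Because each connector owns exactly one stream, its thread can call commitOffsets() after finishing its own messages without coordinating with the other threads.

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.message.MessageAndMetadata;

    // One connector + one stream per thread, so each thread commits independently.
    public class SingleStreamWorker implements Runnable {
        private final ConsumerConnector connector;
        private final KafkaStream<byte[], byte[]> stream;

        public SingleStreamWorker(String zkConnect, String groupId, String topic) {
            Properties props = new Properties();
            props.put("zookeeper.connect", zkConnect); // placeholder host list
            props.put("group.id", groupId);            // all workers share the group id
            props.put("auto.commit.enable", "false");  // we commit explicitly below
            connector = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                    connector.createMessageStreams(Collections.singletonMap(topic, 1));
            stream = streams.get(topic).get(0);
        }

        @Override
        public void run() {
            ConsumerIterator<byte[], byte[]> it = stream.iterator();
            while (it.hasNext()) {
                MessageAndMetadata<byte[], byte[]> record = it.next();
                process(record.message());
                // Safe to commit here: this connector only tracks this one stream,
                // so no barrier with the other threads is needed.
                connector.commitOffsets();
            }
        }

        private void process(byte[] message) {
            // e.g. save the message into the database, as in Libo's use case
        }
    }

The trade-off discussed earlier in the thread still applies: more connectors means roughly (# of connectors) * (# of brokers) connections, so this pattern is best spread over several processes or machines when the topic count is large.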
Re: Mirror maker doesn't replicate new topics
Hi Raja, So just to summarize the scenario: 1) The consumer of mirror maker is successfully consuming all partitions of the newly created topic. 2) The producer of mirror maker is not producing the new messages immediately when the topic is created (observed from ProducerSendThread's log). 3) The producer of mirror maker will start producing the new messages when more messages are sent to the source cluster. If 1) is true then KAFKA-1030 is excluded, since the consumer successfully recognizes all the partitions and starts consuming. If both 2) and 3) are true, I would wonder if the batch size of the mirror maker producer is large and hence it will not send until enough messages are accumulated in the producer queue. Guozhang On Mon, Sep 9, 2013 at 2:36 PM, Rajasekar Elango rela...@salesforce.com wrote: Yes, the data exists in the source cluster, but not in the target cluster. I can't replicate this problem in the dev environment and it happens only in the prod environment. I turned on debug logging, but was not able to identify the problem. Basically, whenever I send data to a new topic, I don't see any log messages from ProducerSendThread in the mirrormaker log, so they are not produced to the target cluster. If I send more messages to the same topic, the producer send thread kicks off and replicates the messages. But whatever messages were sent the first time get lost. How can I troubleshoot this problem further? Even if this could be due to the known issue https://issues.apache.org/jira/browse/KAFKA-1030, how can I confirm that? Is there any config tweaking I can make to work around this? ConsumerOffsetChecker helps to track consumers. Is there any other tool we can use to track producers in MirrorMaker? Thanks in advance for help. Thanks, Raja. On Fri, Sep 6, 2013 at 3:50 AM, Swapnil Ghike sgh...@linkedin.com wrote: Hi Rajasekar, You said that ConsumerOffsetChecker shows that new topics are successfully consumed and the lag is 0. If that's the case, can you verify that there is data on the source cluster for these new topics? If there is no data at the source, MirrorMaker will only assign consumer streams to the new topic, but the lag will be 0. This could otherwise be related to https://issues.apache.org/jira/browse/KAFKA-1030. Swapnil On 9/5/13 8:38 PM, Guozhang Wang wangg...@gmail.com wrote: Could you let me know the process of reproducing this issue? Guozhang On Thu, Sep 5, 2013 at 5:04 PM, Rajasekar Elango rela...@salesforce.com wrote: Yes guozhang Sent from my iPhone On Sep 5, 2013, at 7:53 PM, Guozhang Wang wangg...@gmail.com wrote: Hi Rajasekar, Is auto.create.topics.enable set to true in your target cluster? Guozhang On Thu, Sep 5, 2013 at 4:39 PM, Rajasekar Elango rela...@salesforce.com wrote: We are having issues where MirrorMaker no longer replicates newly created topics. It continues to replicate data for existing topics, but new topics don't get created on the target cluster. ConsumerOffsetChecker shows that new topics are successfully consumed and the lag is 0. But those topics don't get created in the target cluster. I also don't see mbeans for the new topic under the kafka.producer.ProducerTopicMetrics.<topic name> metric. In the logs I see warnings for NotLeaderForPartition, but don't see a major error. What else can we look at to troubleshoot this further? -- Thanks, Raja. -- -- Guozhang -- -- Guozhang -- Thanks, Raja. -- -- Guozhang
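If Guozhang's batching theory above is right, the knobs to look at are the async producer's batching settings in the mirror maker's producer config. The sketch below shows the idea using the 0.8-era property names (batch.num.messages and queue.buffering.max.ms; 0.7 used batch.size and queue.time); the values are examples only, not recommendations. With a small batch size or a short buffering time, the first messages for a new topic should be flushed promptly instead of waiting for more traffic.

    import java.util.Properties;

    // Illustrative mirror-maker producer settings (0.8-era names); values are examples only.
    public class MirrorMakerProducerProps {
        public static Properties batchingProps() {
            Properties props = new Properties();
            props.put("producer.type", "async");
            // Flush a batch after this many messages instead of waiting for a large batch...
            props.put("batch.num.messages", "200");
            // ...or after this many milliseconds, whichever comes first.
            props.put("queue.buffering.max.ms", "1000");
            return props;
        }
    }

Whether this actually explains the lost first messages would still need to be confirmed from the ProducerSendThread logging Raja mentions.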
Re: Patch for mmap + windows
+1 for windows support on 0.8 Thanks, Neha On Mon, Sep 9, 2013 at 10:48 AM, Jay Kreps jay.kr...@gmail.com wrote: So guys, do we want to do these in 0.8? The first patch was a little involved but I think it would be good to have windows support in 0.8 and it sounds like Tim is able to get things working after these changes. -Jay On Mon, Sep 9, 2013 at 10:19 AM, Timothy Chen tnac...@gmail.com wrote: Btw, I've been running this patch in our cloud env and it's been working fine so far. I actually filed another bug as I saw another problem on windows locally ( https://issues.apache.org/jira/browse/KAFKA-1036). Tim On Wed, Aug 21, 2013 at 4:29 PM, Jay Kreps jay.kr...@gmail.com wrote: That would be great! -Jay On Wed, Aug 21, 2013 at 3:13 PM, Timothy Chen tnac...@gmail.com wrote: Hi Jay, I'm planning to test run Kafka on Windows in our test environments evaluating if it's suitable for production usage. I can provide feedback with the patch how well it works and if we encounter any functional or perf problems. Tim On Wed, Aug 21, 2013 at 2:54 PM, Jay Kreps jay.kr...@gmail.com wrote: Elizabeth and I have a patch to support our memory mapped offset index files properly on Windows: https://issues.apache.org/jira/browse/KAFKA-1008 Question: Do we want this on 0.8 or trunk? I would feel more comfortable with it in trunk, but that means windows support in 0.8 is known to be broken (as opposed to not known to be broken but not known to be working either since we are not doing aggressive system testing on windows). I would feel more comfortable doing the patch on 0.8 if there was someone who would be willing to take on real load testing and/or production operation on Windows so we could have some confidence that Kafka on Windows actually works, otherwise this could just be the tip of the iceberg. Also it would be great to get review on that patch regardless of the destination. -Jay
Re: Thanks for Kafka
That's awesome! Thanks for taking the time to let us know...the nature of infrastructure is that we usually only hear about things when they don't work. :-) It would be great for the project if you guys could do a blog post on your setup. Also, any objections if I add you to our powered by page? Cheers, -Jay On Mon, Sep 9, 2013 at 12:49 PM, Philip O'Toole phi...@loggly.com wrote: Hello Kafka users and developers, We at Loggly launched our new system last week, and Kafka is a critical part. I just wanted to say a sincere thank-you to the Kafka team at LinkedIn who put this software together. It's really, really great, and has allowed us to build a solid, performant system. I also want to thank the team for their presence on this list -- their answers have always been really helpful. I hope to write a more detailed blog post in the future about how we use Kafka, but you can find a high-level view of our new stack at the blog post below. http://www.loggly.com/behind-the-screens/ Thanks again, Philip
Re: Thanks for Kafka
No problem at all -- please feel free to add our name to that page. The marketing blurb is: Loggly is the world's most popular cloud-based log management service. Our cloud-based log management service helps DevOps and technical teams make sense of the massive quantity of logs that are being produced by a growing number of cloud-centric applications – in order to solve operational problems faster. When we write more about our use of Kafka I will let the list know. Philip On Mon, Sep 9, 2013 at 1:31 PM, Jay Kreps jay.kr...@gmail.com wrote: That's awesome! Thanks for taking the time to let us know...the nature of infrastructure is that we usually only hear about things when they don't work. :-) It would be great for the project if you guys could do a blog post on your setup. Also, any objections if I add you to our powered by page? Cheers, -Jay On Mon, Sep 9, 2013 at 12:49 PM, Philip O'Toole phi...@loggly.com wrote: Hello Kafka users and developers, We at Loggly launched our new system last week, and Kafka is a critical part. I just wanted to say a sincere thank-you to the Kafka team at LinkedIn who put this software together. It's really, really great, and has allowed us to build a solid, performant system. I also want to thank the team for their presence on this list -- their answers have always been really helpful. I hope to write a more detailed blog post in the future about how we use Kafka, but you can find a high-level view of our new stack at the blog post below. http://www.loggly.com/behind-the-screens/ Thanks again, Philip
Re: Mirror maker doesn't replicate new topics
Yes, the data exists in the source cluster, but not in the target cluster. I can't replicate this problem in the dev environment and it happens only in the prod environment. I turned on debug logging, but was not able to identify the problem. Basically, whenever I send data to a new topic, I don't see any log messages from ProducerSendThread in the mirrormaker log, so they are not produced to the target cluster. If I send more messages to the same topic, the producer send thread kicks off and replicates the messages. But whatever messages were sent the first time get lost. How can I troubleshoot this problem further? Even if this could be due to the known issue https://issues.apache.org/jira/browse/KAFKA-1030, how can I confirm that? Is there any config tweaking I can make to work around this? ConsumerOffsetChecker helps to track consumers. Is there any other tool we can use to track producers in MirrorMaker? Thanks in advance for help. Thanks, Raja. On Fri, Sep 6, 2013 at 3:50 AM, Swapnil Ghike sgh...@linkedin.com wrote: Hi Rajasekar, You said that ConsumerOffsetChecker shows that new topics are successfully consumed and the lag is 0. If that's the case, can you verify that there is data on the source cluster for these new topics? If there is no data at the source, MirrorMaker will only assign consumer streams to the new topic, but the lag will be 0. This could otherwise be related to https://issues.apache.org/jira/browse/KAFKA-1030. Swapnil On 9/5/13 8:38 PM, Guozhang Wang wangg...@gmail.com wrote: Could you let me know the process of reproducing this issue? Guozhang On Thu, Sep 5, 2013 at 5:04 PM, Rajasekar Elango rela...@salesforce.com wrote: Yes guozhang Sent from my iPhone On Sep 5, 2013, at 7:53 PM, Guozhang Wang wangg...@gmail.com wrote: Hi Rajasekar, Is auto.create.topics.enable set to true in your target cluster? Guozhang On Thu, Sep 5, 2013 at 4:39 PM, Rajasekar Elango rela...@salesforce.com wrote: We are having issues where MirrorMaker no longer replicates newly created topics. It continues to replicate data for existing topics, but new topics don't get created on the target cluster. ConsumerOffsetChecker shows that new topics are successfully consumed and the lag is 0. But those topics don't get created in the target cluster. I also don't see mbeans for the new topic under the kafka.producer.ProducerTopicMetrics.<topic name> metric. In the logs I see warnings for NotLeaderForPartition, but don't see a major error. What else can we look at to troubleshoot this further? -- Thanks, Raja. -- -- Guozhang -- -- Guozhang -- Thanks, Raja.
Re: Patch for mmap + windows
Cool can we get a reviewer for KAFKA-1008 then? I can take on the other issue for the checkpoint files. -Jay On Mon, Sep 9, 2013 at 3:16 PM, Neha Narkhede neha.narkh...@gmail.comwrote: +1 for windows support on 0.8 Thanks, Neha On Mon, Sep 9, 2013 at 10:48 AM, Jay Kreps jay.kr...@gmail.com wrote: So guys, do we want to do these in 0.8? The first patch was a little involved but I think it would be good to have windows support in 0.8 and it sounds like Tim is able to get things working after these changes. -Jay On Mon, Sep 9, 2013 at 10:19 AM, Timothy Chen tnac...@gmail.com wrote: Btw, I've been running this patch in our cloud env and it's been working fine so far. I actually filed another bug as I saw another problem on windows locally ( https://issues.apache.org/jira/browse/KAFKA-1036). Tim On Wed, Aug 21, 2013 at 4:29 PM, Jay Kreps jay.kr...@gmail.com wrote: That would be great! -Jay On Wed, Aug 21, 2013 at 3:13 PM, Timothy Chen tnac...@gmail.com wrote: Hi Jay, I'm planning to test run Kafka on Windows in our test environments evaluating if it's suitable for production usage. I can provide feedback with the patch how well it works and if we encounter any functional or perf problems. Tim On Wed, Aug 21, 2013 at 2:54 PM, Jay Kreps jay.kr...@gmail.com wrote: Elizabeth and I have a patch to support our memory mapped offset index files properly on Windows: https://issues.apache.org/jira/browse/KAFKA-1008 Question: Do we want this on 0.8 or trunk? I would feel more comfortable with it in trunk, but that means windows support in 0.8 is known to be broken (as opposed to not known to be broken but not known to be working either since we are not doing aggressive system testing on windows). I would feel more comfortable doing the patch on 0.8 if there was someone who would be willing to take on real load testing and/or production operation on Windows so we could have some confidence that Kafka on Windows actually works, otherwise this could just be the tip of the iceberg. Also it would be great to get review on that patch regardless of the destination. -Jay
Re: Patch for mmap + windows
I did take a look at KAFKA-1008 a while back and added some comments. On 9/9/13 3:52 PM, Jay Kreps jay.kr...@gmail.com wrote: Cool can we get a reviewer for KAFKA-1008 then? I can take on the other issue for the checkpoint files. -Jay On Mon, Sep 9, 2013 at 3:16 PM, Neha Narkhede neha.narkh...@gmail.comwrote: +1 for windows support on 0.8 Thanks, Neha On Mon, Sep 9, 2013 at 10:48 AM, Jay Kreps jay.kr...@gmail.com wrote: So guys, do we want to do these in 0.8? The first patch was a little involved but I think it would be good to have windows support in 0.8 and it sounds like Tim is able to get things working after these changes. -Jay On Mon, Sep 9, 2013 at 10:19 AM, Timothy Chen tnac...@gmail.com wrote: Btw, I've been running this patch in our cloud env and it's been working fine so far. I actually filed another bug as I saw another problem on windows locally ( https://issues.apache.org/jira/browse/KAFKA-1036). Tim On Wed, Aug 21, 2013 at 4:29 PM, Jay Kreps jay.kr...@gmail.com wrote: That would be great! -Jay On Wed, Aug 21, 2013 at 3:13 PM, Timothy Chen tnac...@gmail.com wrote: Hi Jay, I'm planning to test run Kafka on Windows in our test environments evaluating if it's suitable for production usage. I can provide feedback with the patch how well it works and if we encounter any functional or perf problems. Tim On Wed, Aug 21, 2013 at 2:54 PM, Jay Kreps jay.kr...@gmail.com wrote: Elizabeth and I have a patch to support our memory mapped offset index files properly on Windows: https://issues.apache.org/jira/browse/KAFKA-1008 Question: Do we want this on 0.8 or trunk? I would feel more comfortable with it in trunk, but that means windows support in 0.8 is known to be broken (as opposed to not known to be broken but not known to be working either since we are not doing aggressive system testing on windows). I would feel more comfortable doing the patch on 0.8 if there was someone who would be willing to take on real load testing and/or production operation on Windows so we could have some confidence that Kafka on Windows actually works, otherwise this could just be the tip of the iceberg. Also it would be great to get review on that patch regardless of the destination. -Jay
Failover for Zookeeper and Kafka
Hi everyone, I am trying to setup a Kafka cluster and have a couple of questions about failover. Has anyone deployed more than one zookeeper for a single Kafka cluster and have high availability so if one zookeeper node goes down, the cluster automatically fails over to a backup zookeeper node? If so, how is this done? My second question is how can I set up for automatic failover if I have a mirror secondary Kafka cluster. So, if the main Kafka cluster goes down, what do I need to do in order for my producers and consumers to automatically fail over to the backup mirror Kafka cluster. Do I need to code this into my producers and consumers, should I setup a DNS redirect in case the main Kafka cluster goes down to point to the mirror cluster, or is there some other configuration that I can do? Thanks, Xuyen
Re: Failover for Zookeeper and Kafka
You want to set up a Zookeeper ensemble (always an odd number of servers; three is often acceptable) http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html#sc_zkMulitServerSetup and use Kafka 0.8 replication http://kafka.apache.org/documentation.html#replication. In addition, if you want to mirror the cluster, e.g. into another data center, you could also use https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring+(MirrorMaker)

/*** Joe Stein, Founder, Principal Consultant, Big Data Open Source Security LLC, http://www.stealth.ly, Twitter: @allthingshadoop ***/

On Sep 9, 2013, at 8:39 PM, Xuyen On x...@ancestry.com wrote: Hi everyone, I am trying to setup a Kafka cluster and have a couple of questions about failover. Has anyone deployed more than one zookeeper for a single Kafka cluster and have high availability so if one zookeeper node goes down, the cluster automatically fails over to a backup zookeeper node? If so, how is this done? My second question is how can I set up for automatic failover if I have a mirror secondary Kafka cluster. So, if the main Kafka cluster goes down, what do I need to do in order for my producers and consumers to automatically fail over to the backup mirror Kafka cluster. Do I need to code this into my producers and consumers, should I setup a DNS redirect in case the main Kafka cluster goes down to point to the mirror cluster, or is there some other configuration that I can do? Thanks, Xuyen
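One practical consequence of Joe's advice, shown here as a rough sketch: once the ZooKeeper ensemble exists, the brokers and (high-level) consumers should list every ensemble member in their connection string so that the loss of a single ZooKeeper node is handled automatically, with no application-level failover logic. The host names below are placeholders, and the property name assumes 0.8 (zookeeper.connect; 0.7 used zk.connect).

    import java.util.Properties;

    // Point brokers and consumers at the whole ZooKeeper ensemble, not one node.
    // Host names are placeholders; the property name assumes Kafka 0.8.
    public class ZkEnsembleConfig {
        public static Properties ensembleProps() {
            Properties props = new Properties();
            // If zk1 dies, clients transparently use zk2/zk3; the ensemble stays
            // available as long as a majority (2 of 3) of its servers are up.
            props.put("zookeeper.connect", "zk1:2181,zk2:2181,zk3:2181");
            return props;
        }
    }

Failing producers and consumers over to a mirrored cluster in another data center is a different matter and is not automatic: as the thread notes, MirrorMaker copies the data, but redirecting clients (via DNS or client-side configuration) is left to the deployment.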
Re: Patch for mmap + windows
I think Srirams complaint is that I haven't yet addressed his concerns :-) Sent from my iPhone On Sep 9, 2013, at 3:56 PM, Sriram Subramanian srsubraman...@linkedin.com wrote: I did take a look at KAFKA-1008 a while back and added some comments. On 9/9/13 3:52 PM, Jay Kreps jay.kr...@gmail.com wrote: Cool can we get a reviewer for KAFKA-1008 then? I can take on the other issue for the checkpoint files. -Jay On Mon, Sep 9, 2013 at 3:16 PM, Neha Narkhede neha.narkh...@gmail.comwrote: +1 for windows support on 0.8 Thanks, Neha On Mon, Sep 9, 2013 at 10:48 AM, Jay Kreps jay.kr...@gmail.com wrote: So guys, do we want to do these in 0.8? The first patch was a little involved but I think it would be good to have windows support in 0.8 and it sounds like Tim is able to get things working after these changes. -Jay On Mon, Sep 9, 2013 at 10:19 AM, Timothy Chen tnac...@gmail.com wrote: Btw, I've been running this patch in our cloud env and it's been working fine so far. I actually filed another bug as I saw another problem on windows locally ( https://issues.apache.org/jira/browse/KAFKA-1036). Tim On Wed, Aug 21, 2013 at 4:29 PM, Jay Kreps jay.kr...@gmail.com wrote: That would be great! -Jay On Wed, Aug 21, 2013 at 3:13 PM, Timothy Chen tnac...@gmail.com wrote: Hi Jay, I'm planning to test run Kafka on Windows in our test environments evaluating if it's suitable for production usage. I can provide feedback with the patch how well it works and if we encounter any functional or perf problems. Tim On Wed, Aug 21, 2013 at 2:54 PM, Jay Kreps jay.kr...@gmail.com wrote: Elizabeth and I have a patch to support our memory mapped offset index files properly on Windows: https://issues.apache.org/jira/browse/KAFKA-1008 Question: Do we want this on 0.8 or trunk? I would feel more comfortable with it in trunk, but that means windows support in 0.8 is known to be broken (as opposed to not known to be broken but not known to be working either since we are not doing aggressive system testing on windows). I would feel more comfortable doing the patch on 0.8 if there was someone who would be willing to take on real load testing and/or production operation on Windows so we could have some confidence that Kafka on Windows actually works, otherwise this could just be the tip of the iceberg. Also it would be great to get review on that patch regardless of the destination. -Jay
Re: Thanks for Kafka
Philip, Thanks for posting this. Are you guys using 0.7 or 0.8? Jun On Mon, Sep 9, 2013 at 12:49 PM, Philip O'Toole phi...@loggly.com wrote: Hello Kafka users and developers, We at Loggly launched our new system last week, and Kafka is a critical part. I just wanted to say a sincere thank-you to the Kafka team at LinkedIn who put this software together. It's really, really great, and has allowed us to build a solid, performant system. I also want to thank the team for their presence on this list -- their answers have always been really helpful. I hope to write a more detailed blog post in the future about how we use Kafka, but you can find a high-level view of our new stack at the blog post below. http://www.loggly.com/behind-the-screens/ Thanks again, Philip
Re: Thanks for Kafka
We are currently using 0.7.2. I will provide more details in a future post, including partition configuration, what kind of producers and consumers we use, how we use it, etc. Philip On Mon, Sep 9, 2013 at 8:48 PM, Jun Rao jun...@gmail.com wrote: Philip, Thanks for posting this. Are you guys using 0.7 or 0.8? Jun On Mon, Sep 9, 2013 at 12:49 PM, Philip O'Toole phi...@loggly.com wrote: Hello Kafka users and developers, We at Loggly launched our new system last week, and Kafka is a critical part. I just wanted to say a sincere thank-you to the Kafka team at LinkedIn who put this software together. It's really, really great, and has allowed us to build a solid, performant system. I also want to thank the team for their presence on this list -- their answers have always been really helpful. I hope to write a more detailed blog post in the future about how we use Kafka, but you can find a high-level view of our new stack at the blog post below. http://www.loggly.com/behind-the-screens/ Thanks again, Philip
Re: Patch for mmap + windows
Gotcha :) Seems like this will be taken care of then. Tim On Mon, Sep 9, 2013 at 6:22 PM, Jay Kreps jay.kr...@gmail.com wrote: I think Srirams complaint is that I haven't yet addressed his concerns :-) Sent from my iPhone On Sep 9, 2013, at 3:56 PM, Sriram Subramanian srsubraman...@linkedin.com wrote: I did take a look at KAFKA-1008 a while back and added some comments. On 9/9/13 3:52 PM, Jay Kreps jay.kr...@gmail.com wrote: Cool can we get a reviewer for KAFKA-1008 then? I can take on the other issue for the checkpoint files. -Jay On Mon, Sep 9, 2013 at 3:16 PM, Neha Narkhede neha.narkh...@gmail.comwrote: +1 for windows support on 0.8 Thanks, Neha On Mon, Sep 9, 2013 at 10:48 AM, Jay Kreps jay.kr...@gmail.com wrote: So guys, do we want to do these in 0.8? The first patch was a little involved but I think it would be good to have windows support in 0.8 and it sounds like Tim is able to get things working after these changes. -Jay On Mon, Sep 9, 2013 at 10:19 AM, Timothy Chen tnac...@gmail.com wrote: Btw, I've been running this patch in our cloud env and it's been working fine so far. I actually filed another bug as I saw another problem on windows locally ( https://issues.apache.org/jira/browse/KAFKA-1036). Tim On Wed, Aug 21, 2013 at 4:29 PM, Jay Kreps jay.kr...@gmail.com wrote: That would be great! -Jay On Wed, Aug 21, 2013 at 3:13 PM, Timothy Chen tnac...@gmail.com wrote: Hi Jay, I'm planning to test run Kafka on Windows in our test environments evaluating if it's suitable for production usage. I can provide feedback with the patch how well it works and if we encounter any functional or perf problems. Tim On Wed, Aug 21, 2013 at 2:54 PM, Jay Kreps jay.kr...@gmail.com wrote: Elizabeth and I have a patch to support our memory mapped offset index files properly on Windows: https://issues.apache.org/jira/browse/KAFKA-1008 Question: Do we want this on 0.8 or trunk? I would feel more comfortable with it in trunk, but that means windows support in 0.8 is known to be broken (as opposed to not known to be broken but not known to be working either since we are not doing aggressive system testing on windows). I would feel more comfortable doing the patch on 0.8 if there was someone who would be willing to take on real load testing and/or production operation on Windows so we could have some confidence that Kafka on Windows actually works, otherwise this could just be the tip of the iceberg. Also it would be great to get review on that patch regardless of the destination. -Jay