Re: Retrieve most-recent-n messages from kafka topic
Thanks Johan, I converted your code to vanilla Java with a few small modifications (included below in case anyone wants to use it) and ran it a few times. It seems to work OK for the quick-peek use case, but I wouldn't recommend anyone rely on its accuracy, since I find, at least in our case, that anywhere between 1-10% of the result lines are corrupt on each call. It looks like in those cases there are a few special chars at the beginning, probably just a function of the header regex being imprecise, as you mentioned before.

private List<String> getCurrentLinesFromKafka(String topicName, int linesToFetch) throws UnsupportedEncodingException {
    int bytesToFetch = linesToFetch * AVG_LINE_SIZE_IN_BYTES;
    SimpleConsumer sConsumer = new SimpleConsumer(BROKER_NAME, 9092, 1000, 1024000);
    long[] currentOffset = sConsumer.getOffsetsBefore(topicName, PARTITION_ID, -1, 3);
    Long offset = Math.max((currentOffset[0] - bytesToFetch), (currentOffset[currentOffset.length - 1]));
    FetchRequest fetchRequest = new FetchRequest(topicName, 0, offset, bytesToFetch);
    ByteBufferMessageSet msgBuffer = sConsumer.fetch(fetchRequest);
    sConsumer.close();
    String decStr = decodeBuffer(msgBuffer.getBuffer(), "UTF-8");
    String header = "\u0000\u0000.?.?.?.?.?.?.?.?";
    String[] strLst = decStr.split(header);
    if (strLst.length > linesToFetch + 2) {
        // take only the last linesToFetch of them, also ignore the first and last since they may be corrupted
        int end = strLst.length - 1; // end is excluded in copyOfRange
        int start = end - linesToFetch;
        return Lists.newArrayList(Arrays.copyOfRange(strLst, start, end));
    } else if (strLst.length > 2) {
        // we can at least return something since we have more than the corrupt first and last values
        int end = strLst.length - 1; // end is excluded in copyOfRange
        int start = 1; // ignore the probably corrupt first value
        return Lists.newArrayList(Arrays.copyOfRange(strLst, start, end));
    } else {
        return Lists.newArrayList();
    }
}

private String decodeBuffer(ByteBuffer buffer, String encoding) throws UnsupportedEncodingException {
    Integer size;
    try {
        size = buffer.getInt();
    } catch (Exception e) {
        size = -1;
    }
    if (size < 0) {
        return "No recent messages in topic";
    }
    byte[] bytes = buffer.array();
    return new String(bytes, encoding);
}

On Fri, Jul 19, 2013 at 1:26 PM, Johan Lundahl johan.lund...@gmail.com wrote:

Here is my current (very hacky) piece of code handling this part:

def getLastMessages(fetchSize: Int = 1): List[String] = {
  val sConsumer = new SimpleConsumer(clusterip, 9092, 1000, 1024000)
  val currentOffset = sConsumer.getOffsetsBefore(topic, 0, -1, 3)
  val fetchRequest = new FetchRequest(topic, 0, (currentOffset(0) - fetchSize).max(currentOffset(currentOffset.length - 1)), fetchSize)
  val msgBuffer = sConsumer.fetch(fetchRequest)
  sConsumer.close()

  def decodeBuffer(buffer: ByteBuffer, encoding: String, arrSize: Int = msgBuffer.sizeInBytes.toInt - 6): String = {
    val size: Int = Option(try { buffer.getInt } catch { case e: Throwable => -1 }).getOrElse(-1)
    if (size < 0) return s"No recent messages in topic $topic"
    val bytes = new Array[Byte](arrSize)
    buffer.get(bytes)
    new String(bytes, encoding)
  }

  val decStr = decodeBuffer(msgBuffer.getBuffer, "UTF-8")
  val header = "\u0000\u0000.?.?.?.?.?.?.?.?"
  val strLst = decStr.split(header).toList
  if (strLst.size > 1) strLst.tail else strLst
}

On Fri, Jul 19, 2013 at 10:02 PM, Shane Moriah shanemor...@gmail.com wrote: I have a similar use-case to Johan.
We do stream processing off the topics in the backend but I'd like to expose a recent sample of a topic's data to a front-end web-app (just in a synchronous, click-a-button-and-see-results fashion). If I can only start from the last file offset 500MB behind current and not (current - n bytes) then the data might be very stale depending on how fast that topic is being filled. I could iterate from the last offset and keep only the final n, but that might mean processing 500MB each time just to grab 10 messages. Johan, are you using just the simple FetchRequest? Did you get around the InvalidMessageSizeError when you try to force a fetch offset different from those returned by getOffsetsBefore? Or are you also starting from that last known offset and iterating forwards by the desired amount? On Fri, Jul 19, 2013 at 11:33 AM, Johan Lundahl johan.lund...@gmail.com wrote: I've had a similar use case where we want to browse and display the latest few messages in different topics in a webapp. This kind of works by doing as you
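For anyone who wants to try the Java snippet at the top of this thread: it assumes a few imports and constants that weren't included in the post. Here is a minimal sketch of those declarations; the concrete values are placeholders I picked, and the kafka.javaapi paths are what I'd expect for the 0.7-style SimpleConsumer API the snippet appears to use, so adjust them for your setup and Kafka version.

import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

import com.google.common.collect.Lists;            // Guava, for Lists.newArrayList

import kafka.api.FetchRequest;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.javaapi.message.ByteBufferMessageSet;

// Placeholder values -- point these at your own cluster and topic layout.
private static final String BROKER_NAME = "broker01.example.com"; // any broker host
private static final int PARTITION_ID = 0;                        // partition to peek at
private static final int AVG_LINE_SIZE_IN_BYTES = 500;            // rough average message size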
Replacing brokers in a cluster (0.8)
I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones (better hardware). I'm using a replication factor of 2. I'm thinking the plan should be to spin up the 3 new nodes, and operate as a 5 node cluster for a while. Then first remove 1 of the old nodes, and wait for the partitions on the removed node to get replicated to the other nodes. Then, do the same for the other old node. Does this sound sensible? How does the cluster decide when to re-replicate partitions that are on a node that is no longer available? Does it only happen if/when new messages arrive for that partition? Is it on a partition by partition basis? Or is it a cluster-level decision that a broker is no longer valid, in which case all affected partitions would immediately get replicated to new brokers as needed? I'm just wondering how I will know when it will be safe to take down my second old node, after the first one is removed, etc. Thanks, Jason
Apache Kafka Question
Hi, I am planning to use Apache Kafka 0.8 to handle millions of messages per day. Now I need to plan the environment: (i) how many topics to create? (ii) how many partitions/replicas to create? (iii) how many brokers to create? (iv) how many consumer instances in a consumer group? (v) topic or queue? If topic, do we need to create multiple group ids as opposed to a single one? How can we go about it? Please clarify. Thanks and Regards, Anantha
Re: Apache Kafka Question
Millions of messages per day (with each message being a few bytes) is not really 'Big Data'. Kafka has been tested at a million messages per second. The answer to all your questions, IMO, is: it depends. You can start with a single instance (single-machine installation). Let your producer send messages. Keep one broker. Increase to N brokers. When you touch the upper limit, add a server and repeat. Benchmarking and scalability are things you should try on your own by playing with Kafka. Every use case is different, so the performance numbers of one are not a global answer. For your question on topic or queue, please read up on distributed computing patterns such as pub/sub and message queues, which are generic concepts and have nothing to do with Kafka. It again depends on your use case. Also read up on what topics in Kafka are; if you just go through the definition of topics you will answer your own question within a minute. Replication and the rest are next steps once you have a single running instance of Kafka. So go ahead and get your hands dirty. You will love Kafka :) And yes, the most important thing: please read the documentation first (a bit of theory) and then dive in. There is no silver bullet. Cheers, Yavar http://lnkd.in/GRrrDJ On Mon, Jul 22, 2013 at 4:27 PM, anantha.muru...@wipro.com wrote: Hi, I am planning to use Apache Kafka 0.8 to handle millions of messages per day. Now I need to plan the environment: (i) how many topics to create? (ii) how many partitions/replicas to create? (iii) how many brokers to create? (iv) how many consumer instances in a consumer group? (v) topic or queue? If topic, do we need to create multiple group ids as opposed to a single one? How can we go about it? Please clarify. Thanks and Regards, Anantha
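If it helps to get started, the single-instance setup boils down to a handful of commands. The paths and flags below follow the 0.8.0 quickstart as I remember it, so double-check them against the docs in your download:

# start zookeeper and a single broker using the bundled configs
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties

# create a test topic (1 partition, 1 replica) and try it end to end
bin/kafka-create-topic.sh --zookeeper localhost:2181 --replica 1 --partition 1 --topic test
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning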
Re: Replacing brokers in a cluster (0.8)
This seems like the type of behavior I'd ultimately want from the controlled shutdown tool https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-1.ControlledShutdown. Currently, I believe the ShutdownBroker causes new leaders to be selected for any partition the dying node was leading, but I don't think it explicitly forces a rebalance for topics in which the dying node was just an ISR (in-sync replica set) member. Ostensibly, leadership elections are what we want to avoid, due to the Zookeeper chattiness that would ensue for ensembles with lots of partitions, but I'd wager we'd benefit from a reduction in rebalances too. The preferred replica election tool also seems to have some similar level of control (manual selection of the preferred replicas), but still doesn't let you add/remove brokers from the ISR directly. I know the kafka-reassign-partitions tool lets you specify a full list of partitions and replica assignment, but I don't know how easily integrated that will be with the lifecycle you described. Anyone know if controlled shutdown is the right tool for this? Our devops team will certainly be interested in the canonical answer as well. --glenn On 07/22/2013 05:14 AM, Jason Rosenberg wrote: I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones (better hardware). I'm using a replication factor of 2. I'm thinking the plan should be to spin up the 3 new nodes, and operate as a 5 node cluster for a while. Then first remove 1 of the old nodes, and wait for the partitions on the removed node to get replicated to the other nodes. Then, do the same for the other old node. Does this sound sensible? How does the cluster decide when to re-replicate partitions that are on a node that is no longer available? Does it only happen if/when new messages arrive for that partition? Is it on a partition by partition basis? Or is it a cluster-level decision that a broker is no longer valid, in which case all affected partitions would immediately get replicated to new brokers as needed? I'm just wondering how I will know when it will be safe to take down my second old node, after the first one is removed, etc. Thanks, Jason
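For anyone reading along, the controlled shutdown tool on that wiki page is invoked along these lines; the class name and flags are from the 0.8 replication tools page as I recall it, so verify against your build before relying on it:

# ask broker 2 to move its leadership elsewhere before it is taken down
bin/kafka-run-class.sh kafka.admin.ShutdownBroker --zookeeper zk1:2181 --broker 2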
Re: Logo
Yeah, good point. I hadn't seen that before. -Jay On Mon, Jul 22, 2013 at 10:20 AM, Radek Gruchalski radek.gruchal...@portico.io wrote: 296 looks familiar: https://www.nodejitsu.com/ Kind regards, Radek Gruchalski radek.gruchal...@technicolor.com | radek.gruchal...@portico.io | ra...@gruchalski.com 00447889948663 On Monday, 22 July 2013 at 18:51, Jay Kreps wrote: Hey guys, We need a logo! I got a few designs from a 99 designs contest that I would like to put forward: https://issues.apache.org/jira/browse/KAFKA-982 If anyone else would like to submit a design that would be great. Let's do a vote to choose one. -Jay
Re: Logo
296 looks familiar: https://www.nodejitsu.com/ Kind regards, Radek Gruchalski radek.gruchal...@technicolor.com | radek.gruchal...@portico.io | ra...@gruchalski.com 00447889948663 On Monday, 22 July 2013 at 18:51, Jay Kreps wrote: Hey guys, We need a logo! I got a few designs from a 99 designs contest that I would like to put forward: https://issues.apache.org/jira/browse/KAFKA-982 If anyone else would like to submit a design that would be great. Let's do a vote to choose one. -Jay
Re: Replacing brokers in a cluster (0.8)
Is the kafka-reassign-partitions tool something I can experiment with now (this will only be staging data, in the first go-round)? How does it work? Do I manually have to specify each replica I want to move? This would be cumbersome, as I have on the order of 100's of topics... Or does the tool have the ability to specify all replicas on a particular broker? How can I easily check whether a partition has all its replicas in the ISR? For some reason, I had thought there would be a default behavior, whereby a replica could automatically be declared dead after a configurable timeout period. Re-assigning broker id's would not be ideal, since I have a scheme currently whereby broker id's are auto-generated, from a hostname/ip, etc. I could make it work, but it's not my preference to override that! Jason On Mon, Jul 22, 2013 at 11:50 AM, Jun Rao jun...@gmail.com wrote: A replica's data won't be automatically moved to another broker when there are failures. This is because we don't know if the failure is transient or permanent. The right tool to use is the kafka-reassign-partitions tool. It hasn't been thoroughly tested though. We hope to harden it in the final 0.8.0 release. You can also replace a broker with a new server by keeping the same broker id. When the new server starts up, it will replicate data from the leader. You know the data is fully replicated when both replicas are in ISR. Thanks, Jun On Mon, Jul 22, 2013 at 2:14 AM, Jason Rosenberg j...@squareup.com wrote: I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones (better hardware). I'm using a replication factor of 2. I'm thinking the plan should be to spin up the 3 new nodes, and operate as a 5 node cluster for a while. Then first remove 1 of the old nodes, and wait for the partitions on the removed node to get replicated to the other nodes. Then, do the same for the other old node. Does this sound sensible? How does the cluster decide when to re-replicate partitions that are on a node that is no longer available? Does it only happen if/when new messages arrive for that partition? Is it on a partition by partition basis? Or is it a cluster-level decision that a broker is no longer valid, in which case all affected partitions would immediately get replicated to new brokers as needed? I'm just wondering how I will know when it will be safe to take down my second old node, after the first one is removed, etc. Thanks, Jason
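On the "how do I check the ISR" part of the question, one way that should work on 0.8 (assuming the list-topic tool shipped with your build; check the script name in your bin directory) is to dump the per-partition state from zookeeper and look at the isr column:

# prints leader, replicas and isr for each partition of the topic
bin/kafka-list-topic.sh --zookeeper zk1:2181 --topic my-topic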
Re: Replacing brokers in a cluster (0.8)
Here's a ruby CLI that you can use to replace brokers... it shells out to the kafka-reassign-partitions.sh tool after figuring out broker lists from zk. Hope it's useful.

#!/usr/bin/env ruby
require 'excon'
require 'json'
require 'zookeeper'

def replace(arr, o, n)
  arr.map { |v| v == o ? n : v }
end

if ARGV.length != 4
  puts "Usage: bundle exec bin/replace-instance zkstr topic-name old-broker-id new-broker-id"
else
  zkstr = ARGV[0]
  zk = Zookeeper.new(zkstr)
  topic = ARGV[1]
  old = ARGV[2].to_i
  new = ARGV[3].to_i
  puts "Replacing broker #{old} with #{new} on all partitions of topic #{topic}"
  current = JSON.parse(zk.get(:path => "/brokers/topics/#{topic}")[:data])
  replacements_array = []
  replacements = { "partitions" => replacements_array }
  current["partitions"].each { |partition, brokers|
    replacements_array.push({ "topic" => topic, "partition" => partition.to_i, "replicas" => replace(brokers, old, new) })
  }
  replacement_json = JSON.generate(replacements)
  file = "/tmp/replace-#{topic}-#{old}-#{new}"
  if File.exist?(file)
    File.delete file
  end
  File.open(file, 'w') { |f| f.write(replacement_json) }
  puts "./bin/kafka-reassign-partitions.sh --zookeeper #{zkstr} --path-to-json-file #{file}"
  system "./bin/kafka-reassign-partitions.sh --zookeeper #{zkstr} --path-to-json-file #{file}"
end

On Mon, Jul 22, 2013 at 10:40 AM, Jason Rosenberg j...@squareup.com wrote: Is the kafka-reassign-partitions tool something I can experiment with now (this will only be staging data, in the first go-round)? How does it work? Do I manually have to specify each replica I want to move? This would be cumbersome, as I have on the order of 100's of topics... Or does the tool have the ability to specify all replicas on a particular broker? How can I easily check whether a partition has all its replicas in the ISR? For some reason, I had thought there would be a default behavior, whereby a replica could automatically be declared dead after a configurable timeout period. Re-assigning broker id's would not be ideal, since I have a scheme currently whereby broker id's are auto-generated, from a hostname/ip, etc. I could make it work, but it's not my preference to override that! Jason On Mon, Jul 22, 2013 at 11:50 AM, Jun Rao jun...@gmail.com wrote: A replica's data won't be automatically moved to another broker when there are failures. This is because we don't know if the failure is transient or permanent. The right tool to use is the kafka-reassign-partitions tool. It hasn't been thoroughly tested though. We hope to harden it in the final 0.8.0 release. You can also replace a broker with a new server by keeping the same broker id. When the new server starts up, it will replicate data from the leader. You know the data is fully replicated when both replicas are in ISR. Thanks, Jun On Mon, Jul 22, 2013 at 2:14 AM, Jason Rosenberg j...@squareup.com wrote: I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones (better hardware). I'm using a replication factor of 2. I'm thinking the plan should be to spin up the 3 new nodes, and operate as a 5 node cluster for a while. Then first remove 1 of the old nodes, and wait for the partitions on the removed node to get replicated to the other nodes. Then, do the same for the other old node. Does this sound sensible? How does the cluster decide when to re-replicate partitions that are on a node that is no longer available? Does it only happen if/when new messages arrive for that partition? Is it on a partition by partition basis?
Or is it a cluster-level decision that a broker is no longer valid, in which case all affected partitions would immediately get replicated to new brokers as needed? I'm just wondering how I will know when it will be safe to take down my second old node, after the first one is removed, etc. Thanks, Jason
Re: Logo
Similar, yet different. I like it! On Mon, Jul 22, 2013 at 1:25 PM, Jay Kreps jay.kr...@gmail.com wrote: Yeah, good point. I hadn't seen that before. -Jay On Mon, Jul 22, 2013 at 10:20 AM, Radek Gruchalski radek.gruchal...@portico.io wrote: 296 looks familiar: https://www.nodejitsu.com/ Kind regards, Radek Gruchalski radek.gruchal...@technicolor.com | radek.gruchal...@portico.io | ra...@gruchalski.com 00447889948663 On Monday, 22 July 2013 at 18:51, Jay Kreps wrote: Hey guys, We need a logo! I got a few designs from a 99 designs contest that I would like to put forward: https://issues.apache.org/jira/browse/KAFKA-982 If anyone else would like to submit a design that would be great. Let's do a vote to choose one. -Jay
Re: Logo
It should be a roach in honor of Franz Kafka's Metamorphosis. On 7/22/2013 2:55 PM, S Ahmed wrote: Similar, yet different. I like it! On Mon, Jul 22, 2013 at 1:25 PM, Jay Kreps jay.kr...@gmail.com wrote: Yeah, good point. I hadn't seen that before. -Jay On Mon, Jul 22, 2013 at 10:20 AM, Radek Gruchalski radek.gruchal...@portico.io wrote: 296 looks familiar: https://www.nodejitsu.com/ Kind regards, Radek Gruchalski radek.gruchal...@technicolor.com | radek.gruchal...@portico.io | ra...@gruchalski.com 00447889948663 On Monday, 22 July 2013 at 18:51, Jay Kreps wrote: Hey guys, We need a logo! I got a few designs from a 99 designs contest that I would like to put forward: https://issues.apache.org/jira/browse/KAFKA-982 If anyone else would like to submit a design that would be great. Let's do a vote to choose one. -Jay -- David Harris Bridge Interactive Group email: dhar...@big-llc.com cell: 404-831-7015 office: 888-901-0150 Bridge Software Products: www.big-llc.com www.realvaluator.com www.rvleadgen.com
Re: Logo
I actually did this the last time a logo was discussed :) https://docs.google.com/drawings/d/11WHfjkRGbSiZK6rRkedCrgmgFoP_vQ-QuWNENd4u7UY/edit As it turns out, it was a dung beetle in the book (I thought it was a roach as well). -David On 7/22/13 2:59 PM, David Harris wrote: It should be a roach in honor of Franz Kafka's Metamorphosis. On 7/22/2013 2:55 PM, S Ahmed wrote: Similar, yet different. I like it! On Mon, Jul 22, 2013 at 1:25 PM, Jay Kreps jay.kr...@gmail.com wrote: Yeah, good point. I hadn't seen that before. -Jay On Mon, Jul 22, 2013 at 10:20 AM, Radek Gruchalski radek.gruchal...@portico.io wrote: 296 looks familiar: https://www.nodejitsu.com/ Kind regards, Radek Gruchalski radek.gruchal...@technicolor.com | radek.gruchal...@portico.io | ra...@gruchalski.com 00447889948663 On Monday, 22 July 2013 at 18:51, Jay Kreps wrote: Hey guys, We need a logo! I got a few designs from a 99 designs contest that I would like to put forward: https://issues.apache.org/jira/browse/KAFKA-982 If anyone else would like to submit a design that would be great. Let's do a vote to choose one. -Jay -- David Harris Bridge Interactive Group email: dhar...@big-llc.com cell: 404-831-7015 office: 888-901-0150 Bridge Software Products: www.big-llc.com www.realvaluator.com www.rvleadgen.com
Messages TTL setting
Hi, We have a 3 node Kafka cluster. We want to increase the maximum amount of time for which messages are saved in Kafka data logs. Can we change the configuration on one node, stop it and start it and then change the configuration of the next node? Or should we stop all 3 nodes at a time, make configuration changes and then restart all 3? Please suggest. Thanks, Arathi
Re: Messages TTL setting
Yes, all configuration changes should be possible to do one node at a time. -Jay On Mon, Jul 22, 2013 at 2:03 PM, arathi maddula arathimadd...@gmail.com wrote: Hi, We have a 3 node Kafka cluster. We want to increase the maximum amount of time for which messages are saved in Kafka data logs. Can we change the configuration on one node, stop it and start it and then change the configuration of the next node? Or should we stop all 3 nodes at a time, make configuration changes and then restart all 3? Please suggest. Thanks, Arathi
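For completeness, the retention window itself lives in each broker's server.properties; a minimal sketch (168 hours is the usual default, and the two-week value below is just an example to adjust):

# config/server.properties -- keep log segments longer before they are deleted
log.retention.hours=336
# optionally also cap retention by size per partition (leave unset for no size limit)
#log.retention.bytes=1073741824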
Recommended log level in prod environment.
The beta release comes with mostly trace-level logging. Is this recommended? I notice our cluster produces way too many logs. I have set all the levels to info currently.
Re: Recommended log level in prod environment.
Nah. We just changed it to INFO and will monitor the logs. We had GBs of logs when it was at trace level; the kafka-request log was going crazy. On Jul 22, 2013, at 10:54 PM, Jay Kreps jay.kr...@gmail.com wrote: We run at info too, except when debugging stuff. Are you saying that info is too verbose? -Jay On Mon, Jul 22, 2013 at 6:43 PM, Calvin Lei ckp...@gmail.com wrote: The beta release comes with mostly trace-level logging. Is this recommended? I notice our cluster produces way too many logs. I have set all the levels to info currently.
Re: Replacing brokers in a cluster (0.8)
You can try kafka-reassign-partitions now. You do have to specify the new replica assignment manually. We are improving that tool to make it more automatic. Thanks, Jun On Mon, Jul 22, 2013 at 10:40 AM, Jason Rosenberg j...@squareup.com wrote: Is the kafka-reassign-partitions tool something I can experiment with now (this will only be staging data, in the first go-round)? How does it work? Do I manually have to specify each replica I want to move? This would be cumbersome, as I have on the order of 100's of topics... Or does the tool have the ability to specify all replicas on a particular broker? How can I easily check whether a partition has all its replicas in the ISR? For some reason, I had thought there would be a default behavior, whereby a replica could automatically be declared dead after a configurable timeout period. Re-assigning broker id's would not be ideal, since I have a scheme currently whereby broker id's are auto-generated, from a hostname/ip, etc. I could make it work, but it's not my preference to override that! Jason On Mon, Jul 22, 2013 at 11:50 AM, Jun Rao jun...@gmail.com wrote: A replica's data won't be automatically moved to another broker when there are failures. This is because we don't know if the failure is transient or permanent. The right tool to use is the kafka-reassign-partitions tool. It hasn't been thoroughly tested though. We hope to harden it in the final 0.8.0 release. You can also replace a broker with a new server by keeping the same broker id. When the new server starts up, it will replicate data from the leader. You know the data is fully replicated when both replicas are in ISR. Thanks, Jun On Mon, Jul 22, 2013 at 2:14 AM, Jason Rosenberg j...@squareup.com wrote: I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones (better hardware). I'm using a replication factor of 2. I'm thinking the plan should be to spin up the 3 new nodes, and operate as a 5 node cluster for a while. Then first remove 1 of the old nodes, and wait for the partitions on the removed node to get replicated to the other nodes. Then, do the same for the other old node. Does this sound sensible? How does the cluster decide when to re-replicate partitions that are on a node that is no longer available? Does it only happen if/when new messages arrive for that partition? Is it on a partition by partition basis? Or is it a cluster-level decision that a broker is no longer valid, in which case all affected partitions would immediately get replicated to new brokers as needed? I'm just wondering how I will know when it will be safe to take down my second old node, after the first one is removed, etc. Thanks, Jason
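To make "specify the new replica assignment manually" concrete: the JSON file the tool consumes in this 0.8 build looks roughly like the below (format inferred from the ruby script posted earlier in the thread; the topic name and broker ids are placeholders):

{"partitions":
  [{"topic": "my-topic", "partition": 0, "replicas": [3, 4]},
   {"topic": "my-topic", "partition": 1, "replicas": [4, 5]}]}

Then point the tool at it, e.g. bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 --path-to-json-file /tmp/reassign.json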
Re: Recommended log level in prod environment.
Yes, the kafka-request log logs every request (at TRACE). It's mostly for debugging purposes. Other than that, there is no harm in turning it off. Thanks, Jun On Mon, Jul 22, 2013 at 7:59 PM, Calvin Lei ckp...@gmail.com wrote: Nah. We just changed it to INFO and will monitor the logs. We had GBs of logs when it was at trace level; the kafka-request log was going crazy. On Jul 22, 2013, at 10:54 PM, Jay Kreps jay.kr...@gmail.com wrote: We run at info too, except when debugging stuff. Are you saying that info is too verbose? -Jay On Mon, Jul 22, 2013 at 6:43 PM, Calvin Lei ckp...@gmail.com wrote: The beta release comes with mostly trace-level logging. Is this recommended? I notice our cluster produces way too many logs. I have set all the levels to info currently.
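In case anyone wants to turn it down rather than off entirely, the request log is controlled from config/log4j.properties; a sketch, assuming the logger names shipped with 0.8 (check your copy of the file before applying):

# config/log4j.properties -- drop the per-request logging from TRACE to WARN
log4j.logger.kafka.request.logger=WARN, requestAppender
log4j.additivity.kafka.request.logger=false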