Re: topic's partition have no leader and isr
@Jun Rao, Kafka version: 0.8.1.1

@Guozhang Wang, I cannot find the original controller log, but I can give the controller log after executing ./bin/kafka-reassign-partitions.sh and ./bin/kafka-preferred-replica-election.sh.

Now I do not know how to recover the leader for partitions 25 and 31. Any ideas?

- controller log for ./bin/kafka-reassign-partitions.sh ---
[2014-07-09 15:01:31,552] DEBUG [PartitionsReassignedListener on 5]: Partitions reassigned listener fired for path /admin/reassign_partitions. Record partitions to be reassigned {"version":1,"partitions":[{"topic":"org.mobile_nginx","partition":25,"replicas":[6]},{"topic":"org.mobile_nginx","partition":31,"replicas":[3]}]} (kafka.controller.PartitionsReassignedListener)
[2014-07-09 15:01:31,579] INFO [AddPartitionsListener on 5]: Add Partition triggered {"version":1,"partitions":{"12":[1],"8":[3],"19":[5],"23":[5],"4":[3],"15":[2],"11":[2],"9":[4],"22":[4],"26":[2],"13":[2],"24":[6],"16":[4],"5":[4],"10":[1],"21":[3],"6":[5],"1":[4],"17":[5],"25":[6,1],"14":[4],"31":[3,1],"0":[3],"20":[2],"27":[3],"2":[5],"18":[6],"30":[6],"7":[6],"29":[5],"3":[6],"28":[4]}} for path /brokers/topics/org.mobile_nginx (kafka.controller.PartitionStateMachine$AddPartitionsListener)
[2014-07-09 15:01:31,587] DEBUG [PartitionsReassignedListener on 5]: Partitions reassigned listener fired for path /admin/reassign_partitions. Record partitions to be reassigned {"version":1,"partitions":[{"topic":"org.mobile_nginx","partition":31,"replicas":[3]}]} (kafka.controller.PartitionsReassignedListener)
[2014-07-09 15:01:31,590] INFO [AddPartitionsListener on 5]: Add Partition triggered {"version":1,"partitions":{"12":[1],"8":[3],"19":[5],"23":[5],"4":[3],"15":[2],"11":[2],"9":[4],"22":[4],"26":[2],"13":[2],"24":[6],"16":[4],"5":[4],"10":[1],"21":[3],"6":[5],"1":[4],"17":[5],"25":[6,1],"14":[4],"31":[3,1],"0":[3],"20":[2],"27":[3],"2":[5],"18":[6],"30":[6],"7":[6],"29":[5],"3":[6],"28":[4]}} for path /brokers/topics/org.mobile_nginx (kafka.controller.PartitionStateMachine$AddPartitionsListener)

- controller log for ./bin/kafka-preferred-replica-election.sh ---
[2014-07-09 15:07:02,968] DEBUG [PreferredReplicaElectionListener on 5]: Preferred replica election listener fired for path /admin/preferred_replica_election. Record partitions to undergo preferred replica election {"version":1,"partitions":[{"topic":"org.mobile_nginx","partition":25},{"topic":"org.mobile_nginx","partition":31}]} (kafka.controller.PreferredReplicaElectionListener)
[2014-07-09 15:07:02,969] INFO [Controller 5]: Starting preferred replica leader election for partitions [org.mobile_nginx,25],[org.mobile_nginx,31] (kafka.controller.KafkaController)
[2014-07-09 15:07:02,969] INFO [Partition state machine on Controller 5]: Invoking state change to OnlinePartition for partitions [org.mobile_nginx,25],[org.mobile_nginx,31] (kafka.controller.PartitionStateMachine)
[2014-07-09 15:07:02,972] INFO [PreferredReplicaPartitionLeaderSelector]: Current leader -1 for partition [org.mobile_nginx,25] is not the preferred replica. Trigerring preferred replica leader election (kafka.controller.PreferredReplicaPartitionLeaderSelector)
[2014-07-09 15:07:02,973] INFO [PreferredReplicaPartitionLeaderSelector]: Current leader -1 for partition [org.mobile_nginx,31] is not the preferred replica. Trigerring preferred replica leader election (kafka.controller.PreferredReplicaPartitionLeaderSelector)
[2014-07-09 15:07:02,973] WARN [Controller 5]: Partition [org.mobile_nginx,25] failed to complete preferred replica leader election. Leader is -1 (kafka.controller.KafkaController)
[2014-07-09 15:07:02,973] WARN [Controller 5]: Partition [org.mobile_nginx,31] failed to complete preferred replica leader election. Leader is -1 (kafka.controller.KafkaController)

On Sun, Jul 6, 2014 at 11:47 PM, Jun Rao wrote:
> Also, which version of Kafka are you using?
>
> Thanks,
>
> Jun
>
> On Thu, Jul 3, 2014 at 2:26 AM, 鞠大升 wrote:
> > hi, all
> >
> > I have a topic with 32 partitions; after some reassign operations, 2 partitions were left with no leader and ISR.
> >
> > ---
> > Topic:org.mobile_nginx  PartitionCount:32  ReplicationFactor:1  Configs:
> > Topic: org.mobile_nginx  Partition: 0  Leader: 3  Replicas: 3  Isr: 3
> > Topic: org.mobile_nginx  Partition: 1  Leader: 4  Replicas: 4  Isr: 4
> > Topic: org.mobile_nginx  Partition: 2  Leader: 5  Replicas: 5  Isr: 5
> > Topic: org.mobile_nginx  Partition: 3  Leader: 6  Replicas: 6  Isr: 6
> > Topic: org.mobile_nginx  Partition: 4  Leader: 3  Replicas: 3  Isr: 3
> > Topic: org.mobile_nginx  Partition: 5  Leader: 4  Replicas: 4  Isr: 4
> > Topic: org.mobile_nginx  Partition
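A note for readers hitting the same state: in the AddPartitionsListener output above, partitions 25 and 31 carry two replicas ([6,1] and [3,1]) even though the topic has replication factor 1, which is the signature of a reassignment that never completed. A minimal sketch (Python, purely illustrative, not part of Kafka) of spotting such partitions from that JSON map:

```python
import json

# Abridged from the AddPartitionsListener log entry above; the full
# map covers partitions 0-31.
assignment_json = '{"24":[6],"25":[6,1],"30":[6],"31":[3,1]}'

def oversized_replica_lists(assignment, replication_factor):
    """Partitions whose replica list exceeds the topic's replication
    factor, which here indicates a reassignment that never finished."""
    return sorted(int(p) for p, replicas in assignment.items()
                  if len(replicas) > replication_factor)

print(oversized_replica_lists(json.loads(assignment_json), 1))  # [25, 31]
```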
Re: topic's partition have no leader and isr
I think @chenlax has encountered the same problem as me on us...@kafka.apache.org, in the thread titled "How recover leader when broker restart". Ccing us...@kafka.apache.org.

On Wed, Jul 9, 2014 at 3:10 PM, 鞠大升 wrote:
> @Jun Rao, Kafka version: 0.8.1.1
>
> @Guozhang Wang, I cannot find the original controller log, but I can
> give the controller log after executing ./bin/kafka-reassign-partitions.sh
> and ./bin/kafka-preferred-replica-election.sh.
>
> Now I do not know how to recover the leader for partitions 25 and 31. Any ideas?
[jira] [Created] (KAFKA-1531) zookeeper.connection.timeout.ms is set to 10000000 in configuration file in Kafka tarball
Michał Michalski created KAFKA-1531:
---
Summary: zookeeper.connection.timeout.ms is set to 10000000 in configuration file in Kafka tarball
Key: KAFKA-1531
URL: https://issues.apache.org/jira/browse/KAFKA-1531
Project: Kafka
Issue Type: Bug
Components: config
Affects Versions: 0.8.1.1, 0.8.0
Reporter: Michał Michalski

I've noticed that the Kafka tarball comes with zookeeper.connection.timeout.ms=10000000 in server.properties, while https://kafka.apache.org/08/documentation.html says the default is 6000. This setting was introduced in the configuration file in 46b6144a, quite a long time ago (3 years), so it looks intentional; but as per Jun Rao's comment on IRC, 6000 sounds more reasonable, so that entry should probably be changed or removed from the config altogether.

-- This message was sent by Atlassian JIRA (v6.2#6252)
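The fix the reporter suggests amounts to a one-line change in the shipped server.properties. A sketch of what the corrected entry might look like (illustrative only; 6000 ms is the documented default cited in the report):

```properties
# server.properties (Kafka tarball) -- align the shipped value with the
# documented 6000 ms default instead of the oversized timeout
zookeeper.connection.timeout.ms=6000
```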
[jira] [Updated] (KAFKA-1531) zookeeper.connection.timeout.ms is set to 10000000 in configuration file in Kafka tarball
[ https://issues.apache.org/jira/browse/KAFKA-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michał Michalski updated KAFKA-1531:
---
Description: I've noticed that the Kafka tarball comes with zookeeper.connection.timeout.ms=10000000 in server.properties, while https://kafka.apache.org/08/documentation.html says the default is 6000. This setting was introduced in the configuration file in 46b6144a, quite a long time ago (3 years), which makes it look intentional; but as per Jun Rao's comment on IRC, 6000 sounds more reasonable, so that entry should probably be changed or removed from the config altogether.

(was: I've noticed that Kafka tarball comes with zookeeper.connection.timeout.ms=10000000 in server.properties while https://kafka.apache.org/08/documentation.html says the default is 6000. This setting was introduced in configuration file in 46b6144a, so quite a long time ago (3 years), so it looks intentional, but as per Jun Rao's comment on IRC, 6000 sounds more reasonable, so that entry should probably be changed or removed from config at all.)

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1530) howto update continuously
[ https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056049#comment-14056049 ]

Stanislav Gilmulin commented on KAFKA-1530:
---
Thank you, I'd like to ask some questions. If a cluster has a lot of nodes and a few of them are lagging or down, we can't guarantee we would stop and restart nodes properly and in the right order. Is there any recommended way to manage this? Or an already existing script or tool for it? Replication factor = 3, version 0.8.1.1.

> howto update continuously
> -
>
> Key: KAFKA-1530
> URL: https://issues.apache.org/jira/browse/KAFKA-1530
> Project: Kafka
> Issue Type: Wish
> Reporter: Stanislav Gilmulin
> Priority: Minor
> Labels: operating_manual, performance
>
> Hi,
>
> Could I ask you a question about the Kafka update procedure? Is there a way to update the software that doesn't require service interruption or lead to data loss? We can't stop message brokering during the update, as we have a strict SLA.
>
> Best regards
> Stanislav Gilmulin

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Ozeritskiy updated KAFKA-1414:
---
Attachment: parallel-dir-loading-trunk-threadpool.patch

> Speedup broker startup after hard reset
> ---
>
> Key: KAFKA-1414
> URL: https://issues.apache.org/jira/browse/KAFKA-1414
> Project: Kafka
> Issue Type: Improvement
> Components: log
> Affects Versions: 0.8.2, 0.9.0, 0.8.1.1
> Reporter: Dmitry Bugaychenko
> Assignee: Jay Kreps
> Attachments: parallel-dir-loading-0.8.patch, parallel-dir-loading-trunk-threadpool.patch, parallel-dir-loading-trunk.patch
>
> After a hard reset due to power failure, the broker takes way too much time recovering unflushed segments in a single thread. This could be easily improved by launching multiple threads (one per data directory, assuming that typically each data directory is on a dedicated drive). Locally we tried this simple patch to LogManager.loadLogs and it seems to work; however, I'm too new to Scala, so do not take it literally:
>
> {code}
> /**
>  * Recover and load all logs in the given data directories
>  */
> private def loadLogs(dirs: Seq[File]) {
>   val threads: Array[Thread] = new Array[Thread](dirs.size)
>   var i: Int = 0
>   val me = this
>   for (dir <- dirs) {
>     val thread = new Thread(new Runnable {
>       def run() {
>         val recoveryPoints = me.recoveryPointCheckpoints(dir).read
>         /* load the logs */
>         val subDirs = dir.listFiles()
>         if (subDirs != null) {
>           val cleanShutDownFile = new File(dir, Log.CleanShutdownFile)
>           if (cleanShutDownFile.exists())
>             info("Found clean shutdown file. Skipping recovery for all logs in data directory '%s'".format(dir.getAbsolutePath))
>           for (dir <- subDirs) {
>             if (dir.isDirectory) {
>               info("Loading log '" + dir.getName + "'")
>               val topicPartition = Log.parseTopicPartitionName(dir.getName)
>               val config = topicConfigs.getOrElse(topicPartition.topic, defaultConfig)
>               val log = new Log(dir,
>                                 config,
>                                 recoveryPoints.getOrElse(topicPartition, 0L),
>                                 scheduler,
>                                 time)
>               val previous = addLogWithLock(topicPartition, log)
>               if (previous != null)
>                 throw new IllegalArgumentException("Duplicate log directories found: %s, %s!".format(log.dir.getAbsolutePath, previous.dir.getAbsolutePath))
>             }
>           }
>           cleanShutDownFile.delete()
>         }
>       }
>     })
>     thread.start()
>     threads(i) = thread
>     i = i + 1
>   }
>   for (thread <- threads) {
>     thread.join()
>   }
> }
>
> def addLogWithLock(topicPartition: TopicAndPartition, log: Log): Log = {
>   logCreationOrDeletionLock synchronized {
>     this.logs.put(topicPartition, log)
>   }
> }
> {code}

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056065#comment-14056065 ]

Alexey Ozeritskiy commented on KAFKA-1414:
---
You are right. I have made a new version with a threadpool. It works with all Scala versions (since 2.8.0).

-- This message was sent by Atlassian JIRA (v6.2#6252)
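The thread-per-directory idea in the original patch, and the thread-pool variant mentioned in this comment, can be sketched language-neutrally. Below is an illustrative Python version (names such as recover_dir are hypothetical stand-ins, not Kafka's API), showing one recovery task per data directory on a bounded pool:

```python
from concurrent.futures import ThreadPoolExecutor

def recover_dir(data_dir):
    # Stand-in for recovering all unflushed log segments under one
    # data directory (the expensive part after a hard reset).
    return (data_dir, "recovered")

def load_logs(data_dirs, num_threads=4):
    # One task per data directory; a bounded pool caps concurrency
    # instead of spawning one unbounded thread per directory.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return dict(pool.map(recover_dir, data_dirs))

print(load_logs(["/data1", "/data2"]))
```

The pool-based shape is also what makes the approach portable across Scala versions: it avoids per-version collection APIs by keeping the per-directory work in a plain task.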
[jira] [Commented] (KAFKA-1406) Fix scaladoc/javadoc warnings
[ https://issues.apache.org/jira/browse/KAFKA-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056147#comment-14056147 ] Alan Lee commented on KAFKA-1406: - Would it be okay if I work on this? > Fix scaladoc/javadoc warnings > - > > Key: KAFKA-1406 > URL: https://issues.apache.org/jira/browse/KAFKA-1406 > Project: Kafka > Issue Type: Bug > Components: packaging >Reporter: Joel Koshy > Labels: build > Fix For: 0.8.2 > > > ./gradlew docsJarAll > You will see a bunch of warnings mainly due to typos/incorrect use of > javadoc/scaladoc -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (KAFKA-1532) Move CRC32 to AFTER the payload
Julian Morrison created KAFKA-1532: -- Summary: Move CRC32 to AFTER the payload Key: KAFKA-1532 URL: https://issues.apache.org/jira/browse/KAFKA-1532 Project: Kafka Issue Type: Improvement Components: core, producer Reporter: Julian Morrison Assignee: Jun Rao Priority: Minor To support streaming a message of known length but unknown content, take the CRC32 out of the message header and make it a message trailer. Then client libraries can calculate it after streaming the message to Kafka, without materializing the whole message in RAM. -- This message was sent by Atlassian JIRA (v6.2#6252)
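The proposal above can be illustrated with an incremental checksum: compute CRC32 chunk by chunk while streaming the payload, then emit it as a trailer. A hedged sketch in Python (zlib's crc32 accepts a running value; the framing here is invented for illustration and is not Kafka's wire format):

```python
import zlib

def stream_with_crc_trailer(chunks):
    """Yield payload chunks unchanged, then a 4-byte big-endian CRC32
    trailer computed incrementally, so the full message never has to
    be materialized in RAM."""
    crc = 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)  # running CRC over the stream
        yield chunk
    yield (crc & 0xFFFFFFFF).to_bytes(4, "big")

frames = list(stream_with_crc_trailer([b"known length, ", b"unknown content"]))
payload, trailer = b"".join(frames[:-1]), frames[-1]
assert int.from_bytes(trailer, "big") == zlib.crc32(payload) & 0xFFFFFFFF
```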
[jira] [Commented] (KAFKA-1325) Fix inconsistent per topic log configs
[ https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056258#comment-14056258 ]

Manikumar Reddy commented on KAFKA-1325:
---
Created reviewboard against branch origin/trunk

> Fix inconsistent per topic log configs
> --
>
> Key: KAFKA-1325
> URL: https://issues.apache.org/jira/browse/KAFKA-1325
> Project: Kafka
> Issue Type: Improvement
> Components: log
> Affects Versions: 0.8.1
> Reporter: Neha Narkhede
> Labels: usability
> Attachments: KAFKA-1325.patch
>
> Related thread from the user mailing list - http://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3Cd8f6f3857b82c4ccd8725aba6fd68cb8%40cweb01.nmdf.nhnsystem.com%3E
>
> Our documentation is a little confusing on the log configs. The log property for retention.ms is in millis but the server default it maps to is in minutes. Same is true for segment.ms as well. We could either improve the docs or change the per-topic configs to be consistent with the server defaults.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (KAFKA-1325) Fix inconsistent per topic log configs
[ https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikumar Reddy updated KAFKA-1325:
---
Attachment: KAFKA-1325.patch

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (KAFKA-1325) Fix inconsistent per topic log configs
[ https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikumar Reddy updated KAFKA-1325:
---
Attachment: (was: KAFKA-1325.patch)

-- This message was sent by Atlassian JIRA (v6.2#6252)
Re: topic's partition have no leader and isr
It's weird that you have replication factor 1, but two of the partitions, 25 and 31, have 2 assigned replicas. What's the command you used for the reassignment?

Thanks,

Jun

On Wed, Jul 9, 2014 at 12:10 AM, 鞠大升 wrote:
> @Jun Rao, Kafka version: 0.8.1.1
>
> @Guozhang Wang, I cannot find the original controller log, but I can give
> the controller log after executing ./bin/kafka-reassign-partitions.sh
> and ./bin/kafka-preferred-replica-election.sh.
>
> Now I do not know how to recover the leader for partitions 25 and 31. Any ideas?
Review Request 23362: Patch for KAFKA-1325
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23362/
---

Review request for kafka.

Bugs: KAFKA-1325
    https://issues.apache.org/jira/browse/KAFKA-1325

Repository: kafka

Description
---
KAFKA-1325: Changed per-topic configs segment.ms -> segment.hours and retention.ms -> retention.minutes to be consistent with the server defaults.

Diffs
---
core/src/main/scala/kafka/log/LogConfig.scala 5746ad4767589594f904aa085131dd95e56d72bb
core/src/test/scala/unit/kafka/admin/AdminTest.scala e28979827110dfbbb92fe5b152e7f1cc973de400

Diff: https://reviews.apache.org/r/23362/diff/

Testing
---

Thanks,

Manikumar Reddy O
[jira] [Updated] (KAFKA-1325) Fix inconsistent per topic log configs
[ https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikumar Reddy updated KAFKA-1325:
---
Attachment: KAFKA-1325.patch

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1406) Fix scaladoc/javadoc warnings
[ https://issues.apache.org/jira/browse/KAFKA-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056299#comment-14056299 ]

Jun Rao commented on KAFKA-1406:
---
Alan,

That would be great. Thanks for your help.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1325) Fix inconsistent per topic log configs
[ https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056300#comment-14056300 ]

Manikumar Reddy commented on KAFKA-1325:
---
Created reviewboard https://reviews.apache.org/r/23362/diff/ against branch origin/trunk

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (KAFKA-1325) Fix inconsistent per topic log configs
[ https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikumar Reddy updated KAFKA-1325: --- Attachment: (was: KAFKA-1325.patch) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1325) Fix inconsistent per topic log configs
[ https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056319#comment-14056319 ] Jun Rao commented on KAFKA-1325: Thanks for the patch. It seems to me that ms granularity is more general. Perhaps we can add log.retention.ms and log.roll.ms in the server config and leave the per topic config unchanged. This will also make the change backward compatible. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1528) Normalize all the line endings
[ https://issues.apache.org/jira/browse/KAFKA-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056323#comment-14056323 ] Jun Rao commented on KAFKA-1528: Thanks for the patch. It doesn't apply to trunk though. Are there changes other than adding the .gitattributes file? > Normalize all the line endings > -- > > Key: KAFKA-1528 > URL: https://issues.apache.org/jira/browse/KAFKA-1528 > Project: Kafka > Issue Type: Improvement >Reporter: Evgeny Vereshchagin >Priority: Trivial > Attachments: KAFKA-1528.patch, KAFKA-1528_2014-07-06_16:20:28.patch, > KAFKA-1528_2014-07-06_16:23:21.patch > > > Hi! > I added a .gitattributes file and removed all '\r' from some .bat files. > See http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/ and > https://help.github.com/articles/dealing-with-line-endings for an explanation. > Maybe add https://help.github.com/articles/dealing-with-line-endings to the > contributing/committing guide? -- This message was sent by Atlassian JIRA (v6.2#6252)
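For readers following the links above, a .gitattributes along these lines is what the patch describes. This is an illustrative sketch only; the file in the actual attached patch may differ:

```
# Normalize line endings: let git convert all detected text files to LF in the repo
* text=auto

# Windows batch scripts must keep CRLF endings to run correctly
*.bat text eol=crlf
```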
[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056338#comment-14056338 ] Jun Rao commented on KAFKA-1414: Thanks for the patch. A few comments. 1. If there are lots of partitions in a broker, having a thread per partition may be too much. Perhaps we can add a new config like log.recovery.threads. It can default to 1 to preserve the current behavior. 2. If we hit any exception when loading the logs, the current behavior is that we will shut down the broker. We need to preserve this behavior when running log loading in separate threads. So, we need a way to propagate the exceptions from those threads to the caller in KafkaServer.
> Speedup broker startup after hard reset
> ---
>
> Key: KAFKA-1414
> URL: https://issues.apache.org/jira/browse/KAFKA-1414
> Project: Kafka
> Issue Type: Improvement
> Components: log
> Affects Versions: 0.8.2, 0.9.0, 0.8.1.1
> Reporter: Dmitry Bugaychenko
> Assignee: Jay Kreps
> Attachments: parallel-dir-loading-0.8.patch, parallel-dir-loading-trunk-threadpool.patch, parallel-dir-loading-trunk.patch
>
> After a hard reset due to power failure, the broker takes way too much time recovering unflushed segments in a single thread. This could be easily improved by launching multiple threads (one per data directory, assuming that typically each data directory is on a dedicated drive). Locally we tried this simple patch to LogManager.loadLogs and it seems to work; however, I'm too new to Scala, so do not take it literally:
> {code}
> /**
>  * Recover and load all logs in the given data directories
>  */
> private def loadLogs(dirs: Seq[File]) {
>   val threads: Array[Thread] = new Array[Thread](dirs.size)
>   var i: Int = 0
>   val me = this
>   for (dir <- dirs) {
>     val thread = new Thread(new Runnable {
>       def run() {
>         val recoveryPoints = me.recoveryPointCheckpoints(dir).read
>         /* load the logs */
>         val subDirs = dir.listFiles()
>         if (subDirs != null) {
>           val cleanShutDownFile = new File(dir, Log.CleanShutdownFile)
>           if (cleanShutDownFile.exists())
>             info("Found clean shutdown file. Skipping recovery for all logs in data directory '%s'".format(dir.getAbsolutePath))
>           for (dir <- subDirs) {
>             if (dir.isDirectory) {
>               info("Loading log '" + dir.getName + "'")
>               val topicPartition = Log.parseTopicPartitionName(dir.getName)
>               val config = topicConfigs.getOrElse(topicPartition.topic, defaultConfig)
>               val log = new Log(dir, config, recoveryPoints.getOrElse(topicPartition, 0L), scheduler, time)
>               val previous = addLogWithLock(topicPartition, log)
>               if (previous != null)
>                 throw new IllegalArgumentException("Duplicate log directories found: %s, %s!".format(log.dir.getAbsolutePath, previous.dir.getAbsolutePath))
>             }
>           }
>           cleanShutDownFile.delete()
>         }
>       }
>     })
>     thread.start()
>     threads(i) = thread
>     i = i + 1
>   }
>   for (thread <- threads) {
>     thread.join()
>   }
> }
>
> def addLogWithLock(topicPartition: TopicAndPartition, log: Log): Log = {
>   logCreationOrDeletionLock synchronized {
>     this.logs.put(topicPartition, log)
>   }
> }
> {code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
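The two review points above (a bounded pool sized by a new log.recovery.threads config, and propagating worker exceptions back to the caller so the broker still shuts down on failure) can be sketched roughly as follows. This is an illustrative Java sketch, not the actual Kafka patch; the names `ParallelLogLoader`, `DirLoader`, and `loadLogs` are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelLogLoader {
    // Hypothetical stand-in for recovering one data directory; the real
    // patch would invoke LogManager's per-directory recovery logic here.
    interface DirLoader {
        void load(String dir) throws Exception;
    }

    // Load all dirs on a bounded pool (cf. the proposed log.recovery.threads
    // config, defaulting to 1) and rethrow the first worker failure, so the
    // caller sees the original exception just as in the single-threaded path.
    static void loadLogs(List<String> dirs, int recoveryThreads, DirLoader loader)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(recoveryThreads);
        try {
            List<Future<?>> jobs = new ArrayList<>();
            for (String dir : dirs)
                jobs.add(pool.submit(() -> { loader.load(dir); return null; }));
            for (Future<?> job : jobs) {
                try {
                    job.get(); // worker failures surface wrapped in ExecutionException
                } catch (ExecutionException e) {
                    // Unwrap so the caller gets the original exception
                    throw (Exception) e.getCause();
                }
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Using `Future.get()` this way is exactly the `jobs.foreach(_.get())` pattern discussed below in this thread: `get()` wraps any worker exception in an `ExecutionException`, and `getCause()` recovers the original.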
[jira] [Updated] (KAFKA-1406) Fix scaladoc/javadoc warnings
[ https://issues.apache.org/jira/browse/KAFKA-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Lee updated KAFKA-1406: Attachment: kafka-1406-v1.patch I've attached a patch file for the fix. There are still a few (non variable type-argument ...) warnings during the scaladoc build, but they do not seem to be related to documentation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1530) howto update continuously
[ https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056339#comment-14056339 ] Guozhang Wang commented on KAFKA-1530: -- Not sure I understand your question, what do you mean by "we can't guarantee we would stop and restart nodes properly and in the right order "? > howto update continuously > - > > Key: KAFKA-1530 > URL: https://issues.apache.org/jira/browse/KAFKA-1530 > Project: Kafka > Issue Type: Wish >Reporter: Stanislav Gilmulin >Priority: Minor > Labels: operating_manual, performance > > Hi, > > Could I ask you a question about the Kafka update procedure? > Is there a way to update software, which doesn't require service interruption > or lead to data losses? > We can't stop message brokering during the update as we have a strict SLA. > > Best regards > Stanislav Gilmulin -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: topic's partition have no leader and isr
It seems your broker 1 is in a bad state, besides these two partitions you also have partition 10 whose Isr/Leader is not part of the replicas list: Topic: org.mobile_nginx Partition: 10 Leader: 2 Replicas: 1 Isr: 2 Maybe you can go to broker 1 and check its logs first. Guozhang On Wed, Jul 9, 2014 at 7:32 AM, Jun Rao wrote: > It's weird that you have replication factor 1, but two of the partitions > 25 and 31 have 2 assigned replicas. What's the command you used for > reassignment? > > Thanks, > > Jun > > > On Wed, Jul 9, 2014 at 12:10 AM, 鞠大升 wrote: > > > @Jun Rao, Kafka version: 0.8.1.1 > > > > @Guozhang Wang, I can not found the original controller log, but I can > give > > the controller log after execute ./bin/kafka-reassign-partitions.sh > > and ./bin/kafka-preferred-replica-election.sh > > > > Now I do not known how to recover leader for partition 25 and 31, any > idea? > > > > - controller log for ./bin/kafka-reassign-partitions.sh > > --- > > [2014-07-09 15:01:31,552] DEBUG [PartitionsReassignedListener on 5]: > > Partitions reassigned listener fired for path /admin/reassign_partitions. 
> > Record partitions to be reassigned > > > > > {"version":1,"partitions":[{"topic":"org.mobile_nginx","partition":25,"replicas":[6]},{"topic":"org.mobile_nginx","partition":31,"replicas":[3]}]} > > (kafka.controller.PartitionsReassignedListener) > > [2014-07-09 15:01:31,579] INFO [AddPartitionsListener on 5]: Add > Partition > > triggered > > > > > {"version":1,"partitions":{"12":[1],"8":[3],"19":[5],"23":[5],"4":[3],"15":[2],"11":[2],"9":[4],"22":[4],"26":[2],"13":[2],"24":[6],"16":[4],"5":[4],"10":[1],"21":[3],"6":[5],"1":[4],"17":[5],"25":[6,1],"14":[4],"31":[3,1],"0":[3],"20":[2],"27":[3],"2":[5],"18":[6],"30":[6],"7":[6],"29":[5],"3":[6],"28":[4]}} > > for path /brokers/topics/org.mobile_nginx > > (kafka.controller.PartitionStateMachine$AddPartitionsListener) > > [2014-07-09 15:01:31,587] DEBUG [PartitionsReassignedListener on 5]: > > Partitions reassigned listener fired for path /admin/reassign_partitions. > > Record partitions to be reassigned > > > > > {"version":1,"partitions":[{"topic":"org.mobile_nginx","partition":31,"replicas":[3]}]} > > (kafka.controller.PartitionsReassignedListener) > > [2014-07-09 15:01:31,590] INFO [AddPartitionsListener on 5]: Add > Partition > > triggered > > > > > {"version":1,"partitions":{"12":[1],"8":[3],"19":[5],"23":[5],"4":[3],"15":[2],"11":[2],"9":[4],"22":[4],"26":[2],"13":[2],"24":[6],"16":[4],"5":[4],"10":[1],"21":[3],"6":[5],"1":[4],"17":[5],"25":[6,1],"14":[4],"31":[3,1],"0":[3],"20":[2],"27":[3],"2":[5],"18":[6],"30":[6],"7":[6],"29":[5],"3":[6],"28":[4]}} > > for path /brokers/topics/org.mobile_nginx > > (kafka.controller.PartitionStateMachine$AddPartitionsListener) > > > > - controller log for ./bin/kafka-preferred-replica-election.sh > > --- > > [2014-07-09 15:07:02,968] DEBUG [PreferredReplicaElectionListener on 5]: > > Preferred replica election listener fired for path > > /admin/preferred_replica_election. 
Record partitions to undergo preferred > > replica election > > > > > {"version":1,"partitions":[{"topic":"org.mobile_nginx","partition":25},{"topic":"org.mobile_nginx","partition":31}]} > > (kafka.controller.PreferredReplicaElectionListener) > > [2014-07-09 15:07:02,969] INFO [Controller 5]: Starting preferred replica > > leader election for partitions > [org.mobile_nginx,25],[org.mobile_nginx,31] > > (kafka.controller.KafkaController) > > [2014-07-09 15:07:02,969] INFO [Partition state machine on Controller 5]: > > Invoking state change to OnlinePartition for partitions > > [org.mobile_nginx,25],[org.mobile_nginx,31] > > (kafka.controller.PartitionStateMachine) > > [2014-07-09 15:07:02,972] INFO [PreferredReplicaPartitionLeaderSelector]: > > Current leader -1 for partition [org.mobile_nginx,25] is not the > preferred > > replica. Trigerring preferred replica leader election > > (kafka.controller.PreferredReplicaPartitionLeaderSelector) > > [2014-07-09 15:07:02,973] INFO [PreferredReplicaPartitionLeaderSelector]: > > Current leader -1 for partition [org.mobile_nginx,31] is not the > preferred > > replica. Trigerring preferred replica leader election > > (kafka.controller.PreferredReplicaPartitionLeaderSelector) > > [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition > > [org.mobile_nginx,25] failed to complete preferred replica leader > election. > > Leader is -1 (kafka.controller.KafkaController) > > [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition > > [org.mobile_nginx,31] failed to complete preferred replica leader > election. > > Leader is -1 (kafka.controller.KafkaController) > > > > > > On Sun, Jul 6, 2014 at 11:47 PM, Jun Rao wrote: > > > > > Also, which version of Kafka are you using? > > > > > > Thanks, > > > > > > Jun > > > > > > > > > On Thu, Jul 3, 2014 at 2:26 AM, 鞠大升 wrote: > > > > > > > hi, all > > > >
[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056358#comment-14056358 ] Alexey Ozeritskiy commented on KAFKA-1414: -- 1. Ok. 2. If any thread fails we get an ExecutionException at {code} jobs.foreach(_.get()) {code} We can get the original exception by calling e.getCause() and rethrowing it. Is this ok? -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 23363: Patch for KAFKA-1325
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23363/ --- Review request for kafka. Bugs: KAFKA-1325 https://issues.apache.org/jira/browse/KAFKA-1325 Repository: kafka Description --- Added log.retention.ms and log.roll.ms in the server config. If both log.retention.ms and log.retention.minutes are given, then log.retention.ms will be given priority. If both log.roll.ms and log.roll.hours are given, then log.roll.ms will be given priority. Diffs - core/src/main/scala/kafka/server/KafkaConfig.scala ef75b67b67676ae5b8931902cbc8c0c2cc72c0d3 core/src/main/scala/kafka/server/KafkaServer.scala c22e51e0412843ec993721ad3230824c0aadd2ba core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 6f4809da9968de293f365307ffd9cfe1d5c34ce0 Diff: https://reviews.apache.org/r/23363/diff/ Testing --- Thanks, Manikumar Reddy O
[jira] [Updated] (KAFKA-1325) Fix inconsistent per topic log configs
[ https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikumar Reddy updated KAFKA-1325: --- Attachment: KAFKA-1325.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1325) Fix inconsistent per topic log configs
[ https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056402#comment-14056402 ] Manikumar Reddy commented on KAFKA-1325: Created reviewboard https://reviews.apache.org/r/23363/diff/ against branch origin/trunk -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1325) Fix inconsistent per topic log configs
[ https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056410#comment-14056410 ] Manikumar Reddy commented on KAFKA-1325: I added log.retention.ms and log.roll.ms to the server config. Among log.retention.ms, log.retention.minutes, and log.retention.hours, log.retention.ms takes precedence. Between log.roll.ms and log.roll.hours, log.roll.ms takes precedence. -- This message was sent by Atlassian JIRA (v6.2#6252)
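As a rough illustration of that precedence rule (this is an illustrative Java sketch, not the actual KafkaConfig code; the helper names and the null-means-unset convention are assumptions of this example):

```java
public class RetentionConfig {
    // Resolve effective retention in ms: log.retention.ms wins over
    // log.retention.minutes, which wins over log.retention.hours.
    // A null argument means the property is not set; the real
    // KafkaConfig resolution logic may differ in detail.
    static long retentionMs(Long ms, Long minutes, Long hours) {
        if (ms != null) return ms;
        if (minutes != null) return minutes * 60L * 1000L;
        return hours * 60L * 60L * 1000L;
    }

    // log.roll.ms wins over log.roll.hours.
    static long rollMs(Long ms, Long hours) {
        return ms != null ? ms : hours * 60L * 60L * 1000L;
    }
}
```

Because the finer-grained ms property only takes effect when explicitly set, existing configs using minutes/hours keep working, which is the backward compatibility Jun asked for earlier in the thread.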
[jira] [Commented] (KAFKA-1530) howto update continuously
[ https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056411#comment-14056411 ] Stanislav Gilmulin commented on KAFKA-1530: --- It means we have the risks. You're right, my question wasn't clear. Let me try to explain. First of all, according to business requirements we can't stop the service, so we can't stop all nodes before updating. And, as you've advised, our option would be updating step by step. But when we update without using the right procedure, we could lose an unknown amount of messages, as in the case presented below. Let's consider this example. We have 3 replicas of one partition, with 2 of them lagging behind. Then we restart the leader. At that very moment one of the two lagging replicas becomes the new leader. After that, when the former leader (which in fact has the newest data) starts working again, it truncates all the newest data to match the newly elected leader. This situation happens quite often when we restart a highly loaded Kafka cluster, so we lose some part of our data. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056417#comment-14056417 ] Jun Rao commented on KAFKA-1414: 2. Yes, that's probably right. Could you verify that the original exception is indeed preserved? Also, should we use newFixedThreadPool instead of newScheduledThreadPool, since we want to run those tasks immediately? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1530) howto update continuously
[ https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056422#comment-14056422 ] Oleg commented on KAFKA-1530: - Another situation which happened in our production: we have replication level 3. One of the 3 replicas started lagging behind (due to network connectivity problems, etc.). Then, while upgrading/restarting Kafka, we restart the whole cluster. After the upgrade Kafka starts electing leaders for each partition. It is quite likely it will elect the lagging replica as the leader, which then leads to truncating the two other replicas. In this case we lose data. So we are seeking a means of restarting/upgrading Kafka without data loss. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1507) Using GetOffsetShell against non-existent topic creates the topic unintentionally
[ https://issues.apache.org/jira/browse/KAFKA-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056424#comment-14056424 ] Jun Rao commented on KAFKA-1507: Thanks for the patch. I am currently working on KAFKA-1462, which I hope will standardize how we support multiple versions of a request on the server. Perhaps, you can wait until KAFKA-1462 is done, hopefully in a week or so. > Using GetOffsetShell against non-existent topic creates the topic > unintentionally > - > > Key: KAFKA-1507 > URL: https://issues.apache.org/jira/browse/KAFKA-1507 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8.1.1 > Environment: centos >Reporter: Luke Forehand >Assignee: Sriharsha Chintalapani >Priority: Minor > Labels: newbie > Attachments: KAFKA-1507.patch > > > A typo in using GetOffsetShell command can cause a > topic to be created which cannot be deleted (because deletion is still in > progress) > ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list > kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092 --topic typo --time 1 > ./kafka-topics.sh --zookeeper stormqa1/kafka-prod --describe --topic typo > Topic:typo PartitionCount:8ReplicationFactor:1 Configs: > Topic: typo Partition: 0Leader: 10 Replicas: 10 > Isr: 10 > ... -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (KAFKA-1530) howto update continuously
[ https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056422#comment-14056422 ] Oleg edited comment on KAFKA-1530 at 7/9/14 4:33 PM: - Another situation which happened in our production: we have replication level 3. One of the 3 replicas started lagging behind (due to network connectivity problems, etc.). Then, while upgrading/restarting Kafka, we restart the whole cluster. After the upgrade, the first node to start becomes the leader. It is quite likely this will be the lagging replica, which then leads to truncating the two other replicas. In this case we lose data. So we are seeking a means of restarting/upgrading Kafka without data loss. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (KAFKA-1515) Wake-up Sender upon blocked on fetching leader metadata
[ https://issues.apache.org/jira/browse/KAFKA-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Kreps resolved KAFKA-1515. -- Resolution: Fixed > Wake-up Sender upon blocked on fetching leader metadata > --- > > Key: KAFKA-1515 > URL: https://issues.apache.org/jira/browse/KAFKA-1515 > Project: Kafka > Issue Type: Bug >Reporter: Guozhang Wang >Assignee: Guozhang Wang > Fix For: 0.9.0 > > Attachments: KAFKA-1515_2014-07-03_10:19:28.patch, > KAFKA-1515_2014-07-03_16:43:05.patch, KAFKA-1515_2014-07-07_10:55:58.patch, > KAFKA-1515_2014-07-08_11:35:59.patch > > > Currently the new KafkaProducer will not wake up the sender thread upon > forcing metadata fetch, and hence if the sender is polling with a long > timeout (e.g. the metadata.age period) this wait will usually timeout and > fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1019) kafka-preferred-replica-election.sh will fail without clear error message if /brokers/topics/[topic]/partitions does not exist
[ https://issues.apache.org/jira/browse/KAFKA-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056440#comment-14056440 ] Mickael Hemri commented on KAFKA-1019: -- Hi Jun, We use ZK 3.4.6, can it cause this issue? How can we check that the topic watcher is registered by the controller? Thanks > kafka-preferred-replica-election.sh will fail without clear error message if > /brokers/topics/[topic]/partitions does not exist > -- > > Key: KAFKA-1019 > URL: https://issues.apache.org/jira/browse/KAFKA-1019 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8.1 >Reporter: Guozhang Wang > Labels: newbie > Fix For: 0.8.2 > > > From Libo Yu: > I tried to run kafka-preferred-replica-election.sh on our kafka cluster. > But I got this expection: > Failed to start preferred replica election > org.I0Itec.zkclient.exception.ZkNoNodeException: > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = > NoNode for /brokers/topics/uattoqaaa.default/partitions > I checked zookeeper and there is no > /brokers/topics/uattoqaaa.default/partitions. All I found is > /brokers/topics/uattoqaaa.default. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload
[ https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056441#comment-14056441 ]

Jay Kreps commented on KAFKA-1532:
----------------------------------
This is a good idea, but unfortunately it would break compatibility with existing clients. It would be good to save up enough of these fixes and do them all at once to minimize breakage.

> Move CRC32 to AFTER the payload
> -------------------------------
>
>                 Key: KAFKA-1532
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1532
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core, producer
>            Reporter: Julian Morrison
>            Assignee: Jun Rao
>            Priority: Minor
>
> To support streaming a message of known length but unknown content, take the CRC32 out of the message header and make it a message trailer. Then client libraries can calculate it after streaming the message to Kafka, without materializing the whole message in RAM.
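The proposal can be sketched concretely (an illustrative framing, not Kafka's actual wire format): with the CRC as a trailer, the sender computes it incrementally with `zlib.crc32` while streaming chunks, never holding the whole payload in memory.

```python
import io
import zlib

# Sketch of a trailer-CRC framing (illustrative, not Kafka's actual
# wire format): [4-byte length][payload bytes][4-byte CRC32 of payload].
def stream_message(chunks, out):
    payload_len = sum(len(c) for c in chunks)
    out.write(payload_len.to_bytes(4, "big"))
    crc = 0
    for chunk in chunks:                 # stream chunk by chunk
        crc = zlib.crc32(chunk, crc)     # incremental CRC, no full buffer
        out.write(chunk)
    out.write((crc & 0xFFFFFFFF).to_bytes(4, "big"))  # CRC as trailer

buf = io.BytesIO()
stream_message([b"hello ", b"kafka ", b"stream"], buf)
data = buf.getvalue()
payload = data[4:-4]
trailer = int.from_bytes(data[-4:], "big")
assert payload == b"hello kafka stream"
assert trailer == zlib.crc32(payload) & 0xFFFFFFFF
```

A header CRC, by contrast, forces the client to buffer the entire payload before the first byte of the message can be written.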
[jira] [Commented] (KAFKA-1530) howto update continuously
[ https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056448#comment-14056448 ]

Guozhang Wang commented on KAFKA-1530:
--------------------------------------
Hi Stanislav/Oleg,

The Kafka server has a config called "controlled.shutdown.enable"; when it is turned on, the shutdown process will first wait for all the leaders on the node being shut down to migrate to other nodes before stopping the server (http://kafka.apache.org/documentation.html#brokerconfigs).

For your first case, where the node being shut down holds the only replica in ISR, the shutdown will block until other nodes are back in ISR and can take over its partitions; for your second case, where there is more than one node in ISR, it is guaranteed that the leaders on the node being shut down will be moved to another ISR node.

> howto update continuously
> -------------------------
>
>                 Key: KAFKA-1530
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1530
>             Project: Kafka
>          Issue Type: Wish
>            Reporter: Stanislav Gilmulin
>            Priority: Minor
>              Labels: operating_manual, performance
>
> Hi,
>
> Could I ask you a question about the Kafka update procedure? Is there a way to update the software that doesn't require a service interruption or lead to data loss? We can't stop message brokering during the update, as we have a strict SLA.
>
> Best regards
> Stanislav Gilmulin
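For reference, controlled shutdown is enabled in each broker's server.properties; a sketch of the relevant settings (values shown are the documented 0.8.x defaults, so check your version's broker config docs):

```properties
# server.properties (broker config)
# Migrate leadership off this broker before it stops.
controlled.shutdown.enable=true
# Retry the controlled shutdown a few times before giving up.
controlled.shutdown.max.retries=3
controlled.shutdown.retry.backoff.ms=5000
```

With this in place, a rolling restart (one broker at a time, waiting for ISR to recover between brokers) upgrades the cluster without stopping message brokering.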
[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload
[ https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056444#comment-14056444 ]

Jay Kreps commented on KAFKA-1532:
----------------------------------
Also, since Kafka itself fully materializes messages in memory, fixing this won't actually let clients send arbitrarily large messages, as the broker will still choke.
[jira] [Commented] (KAFKA-1019) kafka-preferred-replica-election.sh will fail without clear error message if /brokers/topics/[topic]/partitions does not exist
[ https://issues.apache.org/jira/browse/KAFKA-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056456#comment-14056456 ]

Jun Rao commented on KAFKA-1019:
--------------------------------
I am not sure how reliable ZK 3.4.6 is. If you check the ZK admin page, it shows you the command for listing all registered watchers. You want to check whether there is a child watcher on /brokers/topics. Another thing is to avoid ZK session expiration, since it may expose some corner-case bugs.
[jira] [Commented] (KAFKA-1528) Normalize all the line endings
[ https://issues.apache.org/jira/browse/KAFKA-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056521#comment-14056521 ]

Evgeny Vereshchagin commented on KAFKA-1528:
--------------------------------------------
[~junrao], what command do you run to apply the patch? `git apply --stat xyz-v1.patch`? I want to reproduce it. It looks like I sent an invalid patch.

> Normalize all the line endings
> ------------------------------
>
>                 Key: KAFKA-1528
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1528
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Evgeny Vereshchagin
>            Priority: Trivial
>         Attachments: KAFKA-1528.patch, KAFKA-1528_2014-07-06_16:20:28.patch, KAFKA-1528_2014-07-06_16:23:21.patch
>
> Hi!
> I added a .gitattributes file and removed all '\r' from some .bat files.
> See http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/ and https://help.github.com/articles/dealing-with-line-endings for an explanation.
> Maybe add https://help.github.com/articles/dealing-with-line-endings to the contributing/committing guide?
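A .gitattributes along the lines described would look something like the sketch below (the exact rules in the attached patch may differ):

```
# .gitattributes: normalize line endings across the repository
* text=auto
# Windows batch scripts must keep CRLF endings when checked out
*.bat text eol=crlf
# shell scripts must keep LF endings on every platform
*.sh text eol=lf
```

`text=auto` makes git store text files with LF in the repository while the `eol=` rules pin the working-tree line endings for files whose format is platform-sensitive.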
Re: topic's partition have no leader and isr
That's because the topic had replication factor 2 at first; then I wanted to reassign all partitions from 2 replicas down to 1 replica, and all partitions changed successfully except 25 and 31. I think partitions 25 and 31 had already lost their leader before the reassign operation, which caused the reassignment to fail.

On Jul 9, 2014, at 10:33 PM, "Jun Rao" wrote:

> It's weird that you have replication factor 1, but two of the partitions,
> 25 and 31, have 2 assigned replicas. What's the command you used for
> reassignment?
>
> Thanks,
>
> Jun
>
>
> On Wed, Jul 9, 2014 at 12:10 AM, 鞠大升 wrote:
>
> > @Jun Rao, Kafka version: 0.8.1.1
> >
> > @Guozhang Wang, I can not find the original controller log, but I can give
> > the controller log after executing ./bin/kafka-reassign-partitions.sh
> > and ./bin/kafka-preferred-replica-election.sh
> >
> > Now I do not know how to recover the leader for partitions 25 and 31, any idea?
> >
> > - controller log for ./bin/kafka-reassign-partitions.sh ---
> > [2014-07-09 15:01:31,552] DEBUG [PartitionsReassignedListener on 5]:
> > Partitions reassigned listener fired for path /admin/reassign_partitions.
> > Record partitions to be reassigned > > > > > {"version":1,"partitions":[{"topic":"org.mobile_nginx","partition":25,"replicas":[6]},{"topic":"org.mobile_nginx","partition":31,"replicas":[3]}]} > > (kafka.controller.PartitionsReassignedListener) > > [2014-07-09 15:01:31,579] INFO [AddPartitionsListener on 5]: Add > Partition > > triggered > > > > > {"version":1,"partitions":{"12":[1],"8":[3],"19":[5],"23":[5],"4":[3],"15":[2],"11":[2],"9":[4],"22":[4],"26":[2],"13":[2],"24":[6],"16":[4],"5":[4],"10":[1],"21":[3],"6":[5],"1":[4],"17":[5],"25":[6,1],"14":[4],"31":[3,1],"0":[3],"20":[2],"27":[3],"2":[5],"18":[6],"30":[6],"7":[6],"29":[5],"3":[6],"28":[4]}} > > for path /brokers/topics/org.mobile_nginx > > (kafka.controller.PartitionStateMachine$AddPartitionsListener) > > [2014-07-09 15:01:31,587] DEBUG [PartitionsReassignedListener on 5]: > > Partitions reassigned listener fired for path /admin/reassign_partitions. > > Record partitions to be reassigned > > > > > {"version":1,"partitions":[{"topic":"org.mobile_nginx","partition":31,"replicas":[3]}]} > > (kafka.controller.PartitionsReassignedListener) > > [2014-07-09 15:01:31,590] INFO [AddPartitionsListener on 5]: Add > Partition > > triggered > > > > > {"version":1,"partitions":{"12":[1],"8":[3],"19":[5],"23":[5],"4":[3],"15":[2],"11":[2],"9":[4],"22":[4],"26":[2],"13":[2],"24":[6],"16":[4],"5":[4],"10":[1],"21":[3],"6":[5],"1":[4],"17":[5],"25":[6,1],"14":[4],"31":[3,1],"0":[3],"20":[2],"27":[3],"2":[5],"18":[6],"30":[6],"7":[6],"29":[5],"3":[6],"28":[4]}} > > for path /brokers/topics/org.mobile_nginx > > (kafka.controller.PartitionStateMachine$AddPartitionsListener) > > > > - controller log for ./bin/kafka-reassign-partitions.sh > > --- > > [2014-07-09 15:07:02,968] DEBUG [PreferredReplicaElectionListener on 5]: > > Preferred replica election listener fired for path > > /admin/preferred_replica_election. 
Record partitions to undergo preferred > > replica election > > > > > {"version":1,"partitions":[{"topic":"org.mobile_nginx","partition":25},{"topic":"org.mobile_nginx","partition":31}]} > > (kafka.controller.PreferredReplicaElectionListener) > > [2014-07-09 15:07:02,969] INFO [Controller 5]: Starting preferred replica > > leader election for partitions > [org.mobile_nginx,25],[org.mobile_nginx,31] > > (kafka.controller.KafkaController) > > [2014-07-09 15:07:02,969] INFO [Partition state machine on Controller 5]: > > Invoking state change to OnlinePartition for partitions > > [org.mobile_nginx,25],[org.mobile_nginx,31] > > (kafka.controller.PartitionStateMachine) > > [2014-07-09 15:07:02,972] INFO [PreferredReplicaPartitionLeaderSelector]: > > Current leader -1 for partition [org.mobile_nginx,25] is not the > preferred > > replica. Trigerring preferred replica leader election > > (kafka.controller.PreferredReplicaPartitionLeaderSelector) > > [2014-07-09 15:07:02,973] INFO [PreferredReplicaPartitionLeaderSelector]: > > Current leader -1 for partition [org.mobile_nginx,31] is not the > preferred > > replica. Trigerring preferred replica leader election > > (kafka.controller.PreferredReplicaPartitionLeaderSelector) > > [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition > > [org.mobile_nginx,25] failed to complete preferred replica leader > election. > > Leader is -1 (kafka.controller.KafkaController) > > [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition > > [org.mobile_nginx,31] failed to complete preferred replica leader > election. > > Leader is -1 (kafka.controller.KafkaController) > > > > > > On Sun, Jul 6, 2014 at 11:47 PM, Jun Rao wrote: > > > > > Also, which version of Kafka are you using? > > > > > > Thanks, > > > > > > Jun > > > > > > > > > On Thu, Jul 3, 2014 at 2:26 AM, 鞠大升 wrote: > > > > > > > hi, all > > > > > > > >
Re: topic's partition have no leader and isr
Thanks for your finding; it is weird.

I checked my broker 1: it is working fine, no error or warn logs.

I checked the data folder and found that partition 10's replica, ISR, and leader are actually on broker 2, working fine. Just now I executed a reassignment operation to move partition 10's replicas to broker 2, and the controller log says "Partition [org.mobile_nginx,10] to be reassigned is already assigned to replicas 2. Ignoring request for partition reassignment". After that, describing the topic shows partition 10's replicas changed to 2.

On Jul 9, 2014, at 11:31 PM, "Guozhang Wang" wrote:

> It seems your broker 1 is in a bad state; besides these two partitions you
> also have partition 10, whose Isr/Leader is not part of the replicas list:
>
> Topic: org.mobile_nginx Partition: 10 Leader: 2 Replicas: 1 Isr: 2
>
> Maybe you can go to broker 1 and check its logs first.
>
> Guozhang
>
>
> On Wed, Jul 9, 2014 at 7:32 AM, Jun Rao wrote:
>
> > It's weird that you have replication factor 1, but two of the partitions,
> > 25 and 31, have 2 assigned replicas. What's the command you used for
> > reassignment?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Wed, Jul 9, 2014 at 12:10 AM, 鞠大升 wrote:
> >
> > > @Jun Rao, Kafka version: 0.8.1.1
> > >
> > > @Guozhang Wang, I can not find the original controller log, but I can give
> > > the controller log after executing ./bin/kafka-reassign-partitions.sh
> > > and ./bin/kafka-preferred-replica-election.sh
> > >
> > > Now I do not know how to recover the leader for partitions 25 and 31, any idea?
> > >
> > > - controller log for ./bin/kafka-reassign-partitions.sh ---
> > > [2014-07-09 15:01:31,552] DEBUG [PartitionsReassignedListener on 5]:
> > > Partitions reassigned listener fired for path /admin/reassign_partitions.
[jira] [Commented] (KAFKA-1528) Normalize all the line endings
[ https://issues.apache.org/jira/browse/KAFKA-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056720#comment-14056720 ]

Evgeny Vereshchagin commented on KAFKA-1528:
--------------------------------------------
kafka-patch-review.py squashes two commits into one giant diff, which is wrong for these changes. Maybe format-patch and apply-patch would be better?
Re: topic's partition have no leader and isr
Dasheng,

This is indeed weird. Could you try to reproduce the problem, i.e. start the cluster with a topic whose replication factor is > 1, then change it to 1, and see if this issue shows up again?

Guozhang

On Wed, Jul 9, 2014 at 12:09 PM, DashengJu wrote:

> Thanks for your finding; it is weird.
>
> I checked my broker 1: it is working fine, no error or warn logs.
>
> I checked the data folder and found that partition 10's replica, ISR, and
> leader are actually on broker 2, working fine.
> Just now I executed a reassignment operation to move partition 10's replicas
> to broker 2, and the controller log says "Partition [org.mobile_nginx,10] to be
> reassigned is already assigned to replicas 2. Ignoring request for
> partition reassignment". After that, describing the topic shows partition 10's
> replicas changed to 2.
> On Jul 9, 2014, at 11:31 PM, "Guozhang Wang" wrote:
>
> > It seems your broker 1 is in a bad state; besides these two partitions you
> > also have partition 10, whose Isr/Leader is not part of the replicas list:
> >
> > Topic: org.mobile_nginx Partition: 10 Leader: 2 Replicas: 1 Isr: 2
> >
> > Maybe you can go to broker 1 and check its logs first.
> >
> > Guozhang
> >
> >
> > On Wed, Jul 9, 2014 at 7:32 AM, Jun Rao wrote:
> >
> > > It's weird that you have replication factor 1, but two of the partitions,
> > > 25 and 31, have 2 assigned replicas. What's the command you used for
> > > reassignment?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Wed, Jul 9, 2014 at 12:10 AM, 鞠大升 wrote:
> > >
> > > > @Jun Rao, Kafka version: 0.8.1.1
> > > >
> > > > @Guozhang Wang, I can not find the original controller log, but I can give
> > > > the controller log after executing ./bin/kafka-reassign-partitions.sh
> > > > and ./bin/kafka-preferred-replica-election.sh
> > > >
> > > > Now I do not know how to recover the leader for partitions 25 and 31, any idea?
[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload
[ https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056792#comment-14056792 ]

Julian Morrison commented on KAFKA-1532:
----------------------------------------
Is "Kafka itself fully materializes messages in memory" unavoidable? (I am assuming writing it to file buffers is obvious.)
[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload
[ https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056802#comment-14056802 ]

Jay Kreps commented on KAFKA-1532:
----------------------------------
It's not unavoidable, but it is pretty hard. To make this really work we would have to be able to stream partial requests from the network layer to the api/processing layer and on to the log layer. This is even more complicated because additional appends to the log have to be blocked until the message is completely written.
[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload
[ https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056806#comment-14056806 ]

Jay Kreps commented on KAFKA-1532:
----------------------------------
Your approach, where you buffer the message in a backing file and then do the full write to the log, would also work and might be easier. But it is still a pretty big change.
[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload
[ https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056816#comment-14056816 ]

Julian Morrison commented on KAFKA-1532:
----------------------------------------
Streaming it to a scratch file (never sync'd, unlinked after use) avoids blocking the log: the blocking operation becomes "it's verified in the scratch file; copy it to the log."
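The scratch-file idea can be sketched as follows (an illustrative Python sketch, not broker code; `receive_to_log` and its parameters are hypothetical names): stream chunks to an unsynced temp file while computing the CRC, verify the trailer, and only then copy into the log, so appends are blocked only for the final copy.

```python
import io
import shutil
import tempfile
import zlib

# Sketch (hypothetical, not broker code) of the scratch-file approach:
# buffer the streamed payload in a temp file, verify the trailing CRC,
# and only then copy it into the log in one blocking step.
def receive_to_log(chunks, claimed_crc, log_file):
    crc = 0
    with tempfile.TemporaryFile() as scratch:   # never fsync'd, auto-unlinked
        for chunk in chunks:                    # stream without buffering in RAM
            crc = zlib.crc32(chunk, crc)
            scratch.write(chunk)
        if (crc & 0xFFFFFFFF) != claimed_crc:
            return False                        # reject before touching the log
        scratch.seek(0)
        # Only this copy needs to block other appends to the log.
        shutil.copyfileobj(scratch, log_file)
        return True

log = io.BytesIO()
good_crc = zlib.crc32(b"abcdef") & 0xFFFFFFFF
assert receive_to_log([b"abc", b"def"], good_crc, log) is True
assert log.getvalue() == b"abcdef"
assert receive_to_log([b"abc"], good_crc, io.BytesIO()) is False
```

A corrupt or truncated payload is rejected before any byte reaches the log, which is what keeps the log lock out of the slow streaming path.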
[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload
[ https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056833#comment-14056833 ]

Jay Kreps commented on KAFKA-1532:
----------------------------------
Yeah, agreed. I think that is doable.
Re: topic's partition have no leader and isr
Guozhang,

I will try to reproduce the problem in our test environment. The problem has now existed in our production environment for 10 days; is there any log, memory dump, or zookeeper information that would be useful for you to analyze it? And any ideas on how to recover from this situation: restart the cluster? delete the topic manually?

thx

On Thu, Jul 10, 2014 at 5:27 AM, Guozhang Wang wrote:

> Dasheng,
>
> This is indeed weird. Could you try to reproduce the problem, i.e. start
> the cluster with a topic whose replication factor is > 1, then change it
> to 1, and see if this issue shows up again?
>
> Guozhang
>
>
> On Wed, Jul 9, 2014 at 12:09 PM, DashengJu wrote:
>
> > Thanks for your finding; it is weird.
> >
> > I checked my broker 1: it is working fine, no error or warn logs.
> >
> > I checked the data folder and found that partition 10's replica, ISR, and
> > leader are actually on broker 2, working fine.
> > Just now I executed a reassignment operation to move partition 10's replicas
> > to broker 2, and the controller log says "Partition [org.mobile_nginx,10] to be
> > reassigned is already assigned to replicas 2. Ignoring request for
> > partition reassignment". After that, describing the topic shows partition 10's
> > replicas changed to 2.
> > On Jul 9, 2014, at 11:31 PM, "Guozhang Wang" wrote:
> >
> > > It seems your broker 1 is in a bad state; besides these two partitions you
> > > also have partition 10, whose Isr/Leader is not part of the replicas list:
> > >
> > > Topic: org.mobile_nginx Partition: 10 Leader: 2 Replicas: 1 Isr: 2
> > >
> > > Maybe you can go to broker 1 and check its logs first.
> > >
> > > Guozhang
> > >
> > >
> > > On Wed, Jul 9, 2014 at 7:32 AM, Jun Rao wrote:
> > >
> > > > It's weird that you have replication factor 1, but two of the partitions,
> > > > 25 and 31, have 2 assigned replicas. What's the command you used for
> > > > reassignment?