[ https://issues.apache.org/jira/browse/KAFKA-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17427097#comment-17427097 ]
Viktor Somogyi-Vass commented on KAFKA-6668:
--------------------------------------------

[~little brother ma], [~alchimie], [~borisvu], have any of you figured out the problem here? Was it related to a disk issue or something in Kafka?

> Broker crashes on restart, got a CorruptRecordException: Record size is smaller than minimum record overhead (14)
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6668
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6668
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.11.0.1
>         Environment: Linux version: 3.10.0-514.26.2.el7.x86_64 (mockbuild@cgslv5.buildsys213) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11))
>                      docker version:
>                      Client:
>                        Version:     1.12.6
>                        API version: 1.24
>                        Go version:  go1.7.5
>                        Git commit:  ad3fef854be2172454902950240a9a9778d24345
>                        Built:       Mon Jan 15 22:01:14 2018
>                        OS/Arch:     linux/amd64
>            Reporter: little brother ma
>            Priority: Major
>
> There is a Kafka cluster with a single broker, running in a Docker container. Because the disk filled up, the container crashed, was restarted, and crashed again, over and over.
>
> Log when the disk was full:
>
> {code:java}
> [2018-03-14 00:11:40,764] INFO Rolled new log segment for 'oem-debug-log-1' in 1 ms. (kafka.log.Log)
> [2018-03-14 00:11:40,765] ERROR Uncaught exception in scheduled task 'flush-log' (kafka.utils.KafkaScheduler)
> java.io.IOException: I/O error
>         at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>         at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>         at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
>         at org.apache.kafka.common.record.FileRecords.flush(FileRecords.java:162)
>         at kafka.log.LogSegment$$anonfun$flush$1.apply$mcV$sp(LogSegment.scala:377)
>         at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:376)
>         at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:376)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
>         at kafka.log.LogSegment.flush(LogSegment.scala:376)
>         at kafka.log.Log$$anonfun$flush$2.apply(Log.scala:1312)
>         at kafka.log.Log$$anonfun$flush$2.apply(Log.scala:1311)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.log.Log.flush(Log.scala:1311)
>         at kafka.log.Log$$anonfun$roll$1.apply$mcV$sp(Log.scala:1283)
>         at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
>         at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748)
> [2018-03-14 00:11:56,514] ERROR [KafkaApi-7] Error when handling request {replica_id=-1,max_wait_time=100,min_bytes=1,topics=[{topic=oem-debug-log,partitions=[{partition=0,fetch_offset=0,max_bytes=1048576},{partition=1,fetch_offset=131382630,max_bytes=1048576}]}]} (kafka.server.KafkaApis)
> org.apache.kafka.common.errors.CorruptRecordException: Record size is smaller than minimum record overhead (14).
> {code}
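For context on where this error comes from: every entry in a Kafka .log segment is framed by an 8-byte offset plus a 4-byte size field, and the smallest record the pre-v2 format can encode is 14 bytes (CRC, magic, attributes, key length, value length), so a size field below 14 can never be valid. A zero-filled or truncated segment tail left behind by a full disk reads back exactly that way. The sketch below paraphrases that validation; it is not the actual code in org.apache.kafka.common.record.FileLogInputStream, and it assumes kafka-clients is on the classpath for CorruptRecordException:

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

import org.apache.kafka.common.errors.CorruptRecordException;

public class BatchHeaderCheck {
    // 8-byte base offset + 4-byte size prefix framing every on-disk entry
    private static final int LOG_OVERHEAD = 12;
    // smallest legal pre-v2 record: crc(4) + magic(1) + attributes(1) + key len(4) + value len(4)
    private static final int RECORD_OVERHEAD_V0 = 14;

    /** Reads one entry header at {@code position} and validates its size field. */
    static int readAndValidateSize(FileChannel channel, long position) throws IOException {
        ByteBuffer header = ByteBuffer.allocate(LOG_OVERHEAD);
        channel.read(header, position);
        header.flip();
        header.getLong();            // base offset, not needed for this check
        int size = header.getInt();  // size of the record data that follows
        if (size < RECORD_OVERHEAD_V0)
            throw new CorruptRecordException(
                    "Record size is smaller than minimum record overhead (" + RECORD_OVERHEAD_V0 + ").");
        return size;                 // caller advances by LOG_OVERHEAD + size
    }
}
{code}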
> Log after the disk-full issue was resolved and Kafka was restarted:
>
> {code:java}
> [2018-03-15 23:00:08,998] WARN Found a corrupted index file due to requirement failed: Corrupt index found, index file (/kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.index) has non-zero size but the last offset is 3396188 which is no larger than the base offset 3396188.}. deleting /kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.timeindex, /kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.index, and /kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.txnindex and rebuilding index... (kafka.log.Log)
> [2018-03-15 23:00:08,999] INFO Loading producer state from snapshot file '/kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.snapshot' for partition __consumer_offsets-19 (kafka.log.ProducerStateManager)
> [2018-03-15 23:00:09,242] INFO Recovering unflushed segment 3396188 in log __consumer_offsets-19. (kafka.log.Log)
> [2018-03-15 23:00:09,243] INFO Loading producer state from snapshot file '/kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.snapshot' for partition __consumer_offsets-19 (kafka.log.ProducerStateManager)
> [2018-03-15 23:00:09,497] INFO Loading producer state from offset 3576788 for partition __consumer_offsets-19 with message format version 2 (kafka.log.Log)
> [2018-03-15 23:00:09,497] INFO Loading producer state from snapshot file '/kafka/kafka-logs/__consumer_offsets-19/00000000000003576788.snapshot' for partition __consumer_offsets-19 (kafka.log.ProducerStateManager)
> [2018-03-15 23:00:09,498] INFO Completed load of log __consumer_offsets-19 with 3 log segments, log start offset 0 and log end offset 3576788 in 501 ms (kafka.log.Log)
> [2018-03-15 23:00:09,503] ERROR Could not find offset index file corresponding to log file /kafka/kafka-logs/__consumer_offsets-28/00000000000004649658.log, rebuilding index... (kafka.log.Log)
> [2018-03-15 23:00:09,503] INFO Loading producer state from snapshot file '/kafka/kafka-logs/__consumer_offsets-28/00000000000004649658.snapshot' for partition __consumer_offsets-28 (kafka.log.ProducerStateManager)
> [2018-03-15 23:00:09,505] WARN Found a corrupted index file due to requirement failed: Corrupt index found, index file (/kafka/kafka-logs/__consumer_offsets-38/00000000000001866889.index) has non-zero size but the last offset is 1866889 which is no larger than the base offset 1866889.}. deleting /kafka/kafka-logs/__consumer_offsets-38/00000000000001866889.timeindex, /kafka/kafka-logs/__consumer_offsets-38/00000000000001866889.index, and /kafka/kafka-logs/__consumer_offsets-38/00000000000001866889.txnindex and rebuilding index... (kafka.log.Log)
> [2018-03-15 23:00:09,506] INFO Loading producer state from snapshot file '/kafka/kafka-logs/__consumer_offsets-38/00000000000001244589.snapshot' for partition __consumer_offsets-38 (kafka.log.ProducerStateManager)
> [2018-03-15 23:00:09,507] ERROR There was an error in one of the threads during logs loading: org.apache.kafka.common.errors.CorruptRecordException: Record size is smaller than minimum record overhead (14). (kafka.log.LogManager)
> [2018-03-15 23:00:09,509] FATAL [Kafka Server 7], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
> org.apache.kafka.common.errors.CorruptRecordException: Record size is smaller than minimum record overhead (14).
> [2018-03-15 23:00:09,514] INFO [Kafka Server 7], shutting down (kafka.server.KafkaServer)
> [2018-03-15 23:00:09,519] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
> [2018-03-15 23:00:09,522] INFO Session: 0x162286e30d40711 closed (org.apache.zookeeper.ZooKeeper)
> [2018-03-15 23:00:09,525] INFO EventThread shut down for session: 0x162286e30d40711 (org.apache.zookeeper.ClientCnxn)
> [2018-03-15 23:00:09,532] INFO [Kafka Server 7], shut down completed (kafka.server.KafkaServer)
> [2018-03-15 23:00:09,532] FATAL Exiting Kafka. (kafka.server.KafkaServerStartable)
> [2018-03-15 23:00:09,550] INFO [Kafka Server 7], shutting down (kafka.server.KafkaServer)
> {code}
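When startup fails like this, the usual first step is to find which segment trips the check, for example with the supported tool kafka.tools.DumpLogSegments (pointing --files at the suspect .log segments). As a rough standalone alternative, the hypothetical scanner below (not a Kafka tool) walks a segment using the same 12-byte framing and reports the first entry whose size field is under 14 or overruns the file, which is what a zero-filled tail from a disk-full crash would produce:

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SegmentScan {
    public static void main(String[] args) throws IOException {
        Path segment = Paths.get(args[0]);   // e.g. .../00000000000021450679.log
        try (FileChannel ch = FileChannel.open(segment, StandardOpenOption.READ)) {
            long pos = 0, end = ch.size();
            ByteBuffer header = ByteBuffer.allocate(12);   // offset(8) + size(4)
            while (pos + 12 <= end) {
                header.clear();
                ch.read(header, pos);
                header.flip();
                long offset = header.getLong();
                int size = header.getInt();
                // size below the 14-byte minimum, or a body running past EOF,
                // both indicate a truncated/zeroed region
                if (size < 14 || pos + 12 + size > end) {
                    System.out.printf("suspect entry at byte %d: offset=%d size=%d%n",
                            pos, offset, size);
                    return;
                }
                pos += 12 + size;   // advance to the next framed entry
            }
            System.out.println("no malformed entry headers found");
        }
    }
}
{code}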
>
> And the .snapshot files are empty:
>
> {code:java}
> -rw-r--r-- 1 root root        0 Mar 13 09:50 00000000000000000000.index
> -rw-r--r-- 1 root root      653 Mar 13 09:50 00000000000000000000.log
> -rw-r--r-- 1 root root       12 Mar 13 09:50 00000000000000000000.timeindex
> -rw-r--r-- 1 root root        0 Mar 13 22:08 00000000000020480273.index
> -rw-r--r-- 1 root root      821 Mar 13 22:08 00000000000020480273.log
> -rw-r--r-- 1 root root       10 Mar 13 09:50 00000000000020480273.snapshot
> -rw-r--r-- 1 root root       12 Mar 13 22:08 00000000000020480273.timeindex
> -rw-r--r-- 1 root root 10485760 Mar 16 09:28 00000000000021450679.index
> -rw-r--r-- 1 root root 86736662 Mar 15 15:23 00000000000021450679.log
> -rw-r--r-- 1 root root       10 Mar 13 22:08 00000000000021450679.snapshot
> -rw-r--r-- 1 root root 10485756 Mar 16 09:28 00000000000021450679.timeindex
> -rw-r--r-- 1 root root       10 Mar 16 09:28 00000000000022253427.snapshot
> -rw-r--r-- 1 root root        8 Mar  1 09:17 leader-epoch-checkpoint
> {code}
>
> I resolved this by executing the command:
>
> {code:java}
> find /paasdata/commsrvkafka/data/broker/kafka-logs -name '*.snapshot' | xargs rm
> {code}
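Deleting the producer-state snapshots is recoverable because the broker can rebuild producer state by rescanning the log when no snapshot file is present (the "Loading producer state from offset ..." lines above are that rebuild path), but it should only be done while the broker is stopped. For completeness, a minimal Java equivalent of the find command above; the default path is illustrative and should point at the broker's actual log.dirs:

{code:java}
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class SnapshotCleanup {
    public static void main(String[] args) throws IOException {
        // Illustrative default; pass the broker's actual log directory as args[0].
        Path logDir = Paths.get(args.length > 0 ? args[0] : "/kafka/kafka-logs");
        try (Stream<Path> files = Files.walk(logDir)) {
            files.filter(p -> p.getFileName().toString().endsWith(".snapshot"))
                 .forEach(p -> {
                     System.out.println("deleting " + p);
                     try {
                         Files.delete(p);
                     } catch (IOException e) {
                         throw new UncheckedIOException(e);
                     }
                 });
        }
    }
}
{code}

Expect the next startup to be slower after this cleanup, since producer state for every partition has to be rebuilt from the segments.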