/tmp is not a good location for storing Kafka's data files. Depending on your Linux distribution, it gets cleaned up periodically, even while the machine stays up, which fits the 67-day uptime you describe: no reboot is needed for files to disappear.
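For example, on systemd-based distributions the cleanup is done by systemd-tmpfiles on a daily timer, driven by tmpfiles.d configuration. A typical stock entry looks roughly like the following (the exact type letter and age vary by distribution, so treat this as an illustration only):

    # /usr/lib/tmpfiles.d/tmp.conf -- common default on systemd distros
    # Type  Path  Mode  UID   GID   Age
    q       /tmp  1777  root  root  10d   # remove entries unused for 10 days

Older Red Hat style systems do the same with a daily tmpwatch cron job. Either way, files can be deleted under a running process: the broker keeps working against its already-open, memory-mapped segments and only fails when it next reopens a file by path (for example OffsetIndex.resize during a log roll), which is exactly the "Caused by: java.io.FileNotFoundException" in the traces quoted below.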
Radu

On 22 June 2016 at 19:33, Misra, Rahul <[email protected]> wrote:

> Hi Madhukar,
>
> Thanks for your quick response. The path is "/tmp/kafka-logs/", but the
> servers have not been restarted at any point lately; the uptime for all 3
> servers is almost 67 days.
>
> Regards,
> Rahul Misra
>
> -----Original Message-----
> From: Madhukar Bharti [mailto:[email protected]]
> Sent: Wednesday, June 22, 2016 8:37 PM
> To: [email protected]
> Subject: Re: Kafka broker crash
>
> Hi Rahul,
>
> Is the path "/tmp/kafka-logs/" or "/temp/kafka-logs"?
>
> If the path is set under "/tmp/", the files may get deleted when the
> machine restarts, which would explain the FileNotFoundException. You can
> change the log location to some other path and restart all the brokers;
> this might fix the issue.
>
> Regards,
> Madhukar
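The location Madhukar means is the log.dirs setting in each broker's config/server.properties; the quickstart file ships with log.dirs=/tmp/kafka-logs, which is how clusters end up in this situation. A minimal sketch of the change, with /data/kafka-logs as a placeholder for whatever persistent disk your hosts have:

    # config/server.properties (on every broker)
    # Keep Kafka data off /tmp so OS cleanup jobs cannot delete
    # segment and index files behind the broker's back.
    log.dirs=/data/kafka-logs

Changing the setting alone does not move existing data: copy the old directory to the new location while the broker is down, or let the partitions re-replicate from the remaining brokers after the restart.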
> On Wed, Jun 22, 2016 at 1:40 PM, Misra, Rahul <[email protected]>
> wrote:
>
> > Hi,
> >
> > I'm facing a strange issue in my Kafka cluster; could anybody please
> > help me with it? The issue is as follows:
> >
> > We have a 3-node Kafka cluster. We installed ZooKeeper separately and
> > pointed the brokers to it. ZooKeeper is also 3-node, but for our POC
> > setup the ZooKeeper nodes are on the same machines as the Kafka
> > brokers.
> >
> > While receiving messages from an existing topic using a new groupId,
> > 2 of the brokers crashed with the same FATAL errors:
> >
> > --------------------------------------------------------
> > <<<<<<<<<<<<<---- [server 2 logs] ---->>>>>>>>>>>>>>>
> >
> > [2016-06-21 23:09:14,697] INFO [GroupCoordinator 1]: Stabilized group
> > pocTestNew11 generation 1 (kafka.coordinator.GroupCoordinator)
> > [2016-06-21 23:09:15,006] INFO [GroupCoordinator 1]: Assignment
> > received from leader for group pocTestNew11 for generation 1
> > (kafka.coordinator.GroupCoordinator)
> > [2016-06-21 23:09:20,335] FATAL [Replica Manager on Broker 1]: Halting
> > due to unrecoverable I/O error while handling produce request:
> > (kafka.server.ReplicaManager)
> > kafka.common.KafkaStorageException: I/O exception in append to log
> > '__consumer_offsets-4'
> >   at kafka.log.Log.append(Log.scala:318)
> >   at kafka.cluster.Partition$$anonfun$9.apply(Partition.scala:442)
> >   at kafka.cluster.Partition$$anonfun$9.apply(Partition.scala:428)
> >   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >   at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:268)
> >   at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:428)
> >   at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:401)
> >   at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:386)
> >   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
> >   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
> >   at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
> >   at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
> >   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> >   at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:386)
> >   at kafka.server.ReplicaManager.appendMessages(ReplicaManager.scala:322)
> >   at kafka.coordinator.GroupMetadataManager.store(GroupMetadataManager.scala:228)
> >   at kafka.coordinator.GroupCoordinator$$anonfun$handleCommitOffsets$9.apply(GroupCoordinator.scala:429)
> >   at kafka.coordinator.GroupCoordinator$$anonfun$handleCommitOffsets$9.apply(GroupCoordinator.scala:429)
> >   at scala.Option.foreach(Option.scala:257)
> >   at kafka.coordinator.GroupCoordinator.handleCommitOffsets(GroupCoordinator.scala:429)
> >   at kafka.server.KafkaApis.handleOffsetCommitRequest(KafkaApis.scala:280)
> >   at kafka.server.KafkaApis.handle(KafkaApis.scala:76)
> >   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
> >   at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.io.FileNotFoundException:
> > /tmp/kafka-logs/__consumer_offsets-4/00000000000000000000.index (No
> > such file or directory)
> >   at java.io.RandomAccessFile.open0(Native Method)
> >   at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
> >   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
> >   at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:277)
> >   at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:276)
> >   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >   at kafka.log.OffsetIndex.resize(OffsetIndex.scala:276)
> >   at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply$mcV$sp(OffsetIndex.scala:265)
> >   at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
> >   at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
> >   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >   at kafka.log.OffsetIndex.trimToValidSize(OffsetIndex.scala:264)
> >   at kafka.log.Log.roll(Log.scala:627)
> >   at kafka.log.Log.maybeRoll(Log.scala:602)
> >   at kafka.log.Log.append(Log.scala:357)
> >
> > ----------------------------------------------
> > <<<<<<<<<<<<<---- [server 3 logs] ---->>>>>>>>>>>>>>>
> >
> > [2016-06-21 23:08:49,796] FATAL [ReplicaFetcherThread-0-0], Disk error
> > while replicating data. (kafka.server.ReplicaFetcherThread)
> > kafka.common.KafkaStorageException: I/O exception in append to log
> > '__consumer_offsets-4'
> >   at kafka.log.Log.append(Log.scala:318)
> >   at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:113)
> >   at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:42)
> >   at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:138)
> >   at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:122)
> >   at scala.Option.foreach(Option.scala:257)
> >   at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:122)
> >   at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:120)
> >   at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> >   at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> >   at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
> >   at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
> >   at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
> >   at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:120)
> >   at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120)
> >   at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120)
> >   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >   at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
> >   at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:93)
> >   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> > Caused by: java.io.FileNotFoundException:
> > /tmp/kafka-logs/__consumer_offsets-4/00000000000000000000.index (No
> > such file or directory)
> >   at java.io.RandomAccessFile.open0(Native Method)
> >   at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
> >   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
> >   at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:277)
> >   at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:276)
> >   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >   at kafka.log.OffsetIndex.resize(OffsetIndex.scala:276)
> >   at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply$mcV$sp(OffsetIndex.scala:265)
> >   at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
> >   at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
> >   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >   at kafka.log.OffsetIndex.trimToValidSize(OffsetIndex.scala:264)
> >   at kafka.log.Log.roll(Log.scala:627)
> >   at kafka.log.Log.maybeRoll(Log.scala:602)
> >   at kafka.log.Log.append(Log.scala:357)
> >   ... 19 more
> >
> > For the topic "__consumer_offsets", which is used to commit consumer
> > offsets, the default number of partitions is 50 and the replication
> > factor is 3, so ideally all 3 brokers should have logs for every
> > partition of "__consumer_offsets". I checked the "/tmp/kafka-logs"
> > directory on each server, and except for broker 1, the other 2 brokers
> > (servers 2 and 3) do not contain replicas for all the partitions of
> > "__consumer_offsets". Log directories are missing for many partitions
> > of "__consumer_offsets" on brokers 2 and 3 (including partition 4,
> > which resulted in the crash above).
> >
> > What could be the cause of this crash? Is there any misconfiguration
> > of the broker that could cause this?
> >
> > Regards,
> > Rahul Misra
> >
> > Technical Lead
> > Altisource(tm)
> > Mobile: 9886141541 | Ext: 298269
> > [email protected] | www.Altisource.com
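To see how far the damage goes, it helps to compare the replica assignment Kafka expects with what is actually on disk on each broker. A sketch, assuming one of the ZooKeeper nodes is reachable at localhost:2181 (adjust the address and paths for your cluster):

    # Expected replica assignment for every __consumer_offsets partition
    bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets

    # Partition directories that actually exist on this broker
    ls -d /tmp/kafka-logs/__consumer_offsets-*

Any partition that the first command assigns to a broker but that is missing from that broker's disk is a replica the cleanup (or the subsequent crash) has already removed; after moving log.dirs off /tmp and restarting, the brokers should re-create and re-sync those replicas from whichever replica survived.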
