/tmp is not a good location for storing Kafka data. Depending on your Linux
distribution, a cleaner such as tmpwatch or systemd-tmpfiles will
periodically delete files under /tmp based on age, so data can disappear
even without a reboot.
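
To keep the data across such cleanups, point log.dirs in server.properties
at a persistent location and restart each broker. A minimal sketch (the
target path here is only an example):

    # config/server.properties
    log.dirs=/var/lib/kafka-logs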

Radu

On 22 June 2016 at 19:33, Misra, Rahul <rahul.mi...@altisource.com> wrote:

> Hi Madhukar,
>
> Thanks for your quick response. The path is "/tmp/kafka-logs/". But the
> servers have not been restarted recently; the uptime for all three
> servers is almost 67 days.
>
> Regards,
> Rahul Misra
>
>
> -----Original Message-----
> From: Madhukar Bharti [mailto:bhartimadhu...@gmail.com]
> Sent: Wednesday, June 22, 2016 8:37 PM
> To: users@kafka.apache.org
> Subject: Re: Kafka broker crash
>
> Hi Rahul,
>
> Is the path "/tmp/kafka-logs/" or "/temp/kafka-logs"?
>
> If the path is set under "/tmp/", the files may be deleted, for example
> when the machine restarts, which is why it is throwing the
> FileNotFoundException. You can change the log location to some other
> path and restart all brokers. This might fix the issue; see the sketch
> below.
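>
> A minimal example, run on each broker in turn (the target path is
> illustrative):
>
>     # stop the broker, then copy the data to persistent storage
>     mkdir -p /var/lib/kafka-logs
>     cp -a /tmp/kafka-logs/. /var/lib/kafka-logs/
>     # point log.dirs in config/server.properties at the new path,
>     # then restart the broker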
>
> Regards,
> Madhukar
>
> On Wed, Jun 22, 2016 at 1:40 PM, Misra, Rahul <rahul.mi...@altisource.com>
> wrote:
>
> > Hi,
> >
> > I'm facing a strange issue in my Kafka cluster. Could anybody please
> > help me with it? The issue is as follows:
> >
> > We have a 3-node Kafka cluster. We installed ZooKeeper separately
> > and have pointed the brokers to it. ZooKeeper is also 3-node, but
> > for our POC setup, the ZooKeeper nodes are on the same machines as
> > the Kafka brokers.
> >
> > While receiving messages from an existing topic using a new groupId,
> > 2 of the brokers crashed with the same FATAL errors:
> >
> > --------------------------------------------------------
> > <<<<<<<<<<<<<---- [server 2 logs] ---->>>>>>>>>>>>>>>
> >
> > [2016-06-21 23:09:14,697] INFO [GroupCoordinator 1]: Stabilized group pocTestNew11 generation 1 (kafka.coordinator.GroupCoordinator)
> > [2016-06-21 23:09:15,006] INFO [GroupCoordinator 1]: Assignment received from leader for group pocTestNew11 for generation 1 (kafka.coordinator.GroupCoordinator)
> > [2016-06-21 23:09:20,335] FATAL [Replica Manager on Broker 1]: Halting due to unrecoverable I/O error while handling produce request: (kafka.server.ReplicaManager)
> > kafka.common.KafkaStorageException: I/O exception in append to log '__consumer_offsets-4'
> >         at kafka.log.Log.append(Log.scala:318)
> >         at kafka.cluster.Partition$$anonfun$9.apply(Partition.scala:442)
> >         at kafka.cluster.Partition$$anonfun$9.apply(Partition.scala:428)
> >         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >         at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:268)
> >         at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:428)
> >         at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:401)
> >         at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:386)
> >         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
> >         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
> >         at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
> >         at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
> >         at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> >         at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:386)
> >         at kafka.server.ReplicaManager.appendMessages(ReplicaManager.scala:322)
> >         at kafka.coordinator.GroupMetadataManager.store(GroupMetadataManager.scala:228)
> >         at kafka.coordinator.GroupCoordinator$$anonfun$handleCommitOffsets$9.apply(GroupCoordinator.scala:429)
> >         at kafka.coordinator.GroupCoordinator$$anonfun$handleCommitOffsets$9.apply(GroupCoordinator.scala:429)
> >         at scala.Option.foreach(Option.scala:257)
> >         at kafka.coordinator.GroupCoordinator.handleCommitOffsets(GroupCoordinator.scala:429)
> >         at kafka.server.KafkaApis.handleOffsetCommitRequest(KafkaApis.scala:280)
> >         at kafka.server.KafkaApis.handle(KafkaApis.scala:76)
> >         at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
> >         at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.io.FileNotFoundException: /tmp/kafka-logs/__consumer_offsets-4/00000000000000000000.index (No such file or directory)
> >         at java.io.RandomAccessFile.open0(Native Method)
> >         at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
> >         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
> >         at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:277)
> >         at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:276)
> >         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >         at kafka.log.OffsetIndex.resize(OffsetIndex.scala:276)
> >         at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply$mcV$sp(OffsetIndex.scala:265)
> >         at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
> >         at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
> >         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >         at kafka.log.OffsetIndex.trimToValidSize(OffsetIndex.scala:264)
> >         at kafka.log.Log.roll(Log.scala:627)
> >         at kafka.log.Log.maybeRoll(Log.scala:602)
> >         at kafka.log.Log.append(Log.scala:357)
> >
> > ----------------------------------------------
> > <<<<<<<<<<<<<---- [server 3 logs] ---->>>>>>>>>>>>>>>
> >
> > [2016-06-21 23:08:49,796] FATAL [ReplicaFetcherThread-0-0], Disk error while replicating data. (kafka.server.ReplicaFetcherThread)
> > kafka.common.KafkaStorageException: I/O exception in append to log '__consumer_offsets-4'
> >         at kafka.log.Log.append(Log.scala:318)
> >         at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:113)
> >         at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:42)
> >         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:138)
> >         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:122)
> >         at scala.Option.foreach(Option.scala:257)
> >         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:122)
> >         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:120)
> >         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> >         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> >         at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
> >         at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
> >         at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
> >         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:120)
> >         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120)
> >         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120)
> >         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >         at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
> >         at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:93)
> >         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> > Caused by: java.io.FileNotFoundException: /tmp/kafka-logs/__consumer_offsets-4/00000000000000000000.index (No such file or directory)
> >         at java.io.RandomAccessFile.open0(Native Method)
> >         at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
> >         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
> >         at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:277)
> >         at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:276)
> >         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >         at kafka.log.OffsetIndex.resize(OffsetIndex.scala:276)
> >         at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply$mcV$sp(OffsetIndex.scala:265)
> >         at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
> >         at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
> >         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> >         at kafka.log.OffsetIndex.trimToValidSize(OffsetIndex.scala:264)
> >         at kafka.log.Log.roll(Log.scala:627)
> >         at kafka.log.Log.maybeRoll(Log.scala:602)
> >         at kafka.log.Log.append(Log.scala:357)
> >         ... 19 more
> >
> >
> >
> > For the topic "__consumer_offsets", which is used to commit consumer
> > offsets, the default number of partitions is 50 and the replication
> > factor is 3, so ideally all three brokers should have logs for every
> > partition of "__consumer_offsets".
> > I checked the "/temp/kafka-logs" directory on each server, and except
> > for broker 1, the other 2 brokers (servers 2 and 3) do not contain
> > replicas of all the partitions of "__consumer_offsets". Log
> > directories are missing for many partitions of "__consumer_offsets"
> > on brokers 2 and 3 (including partition 4, which resulted in the above
> > crash).
> >
> > What could be the cause of this crash? Is there any misconfiguration
> > of the broker that could cause this?
> >
> > Regards,
> > Rahul Misra
> >
> > Technical Lead
> > Altisource(tm)
> > Mobile: 9886141541 | Ext: 298269
> > rahul.mi...@altisource.com | www.Altisource.com
> >
