[ https://issues.apache.org/jira/browse/KAFKA-8145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
prehistoricpenguin updated KAFKA-8145: -------------------------------------- Description: When consumer offset log cleaning or Kafka log rolling triggered on Windows(The default rolling time is 168 hours, eg 1 week). Kafka broker will shut down due to file renaming fail, on Windows is't invalid to rename a file when it's opened, so this issue is Windows specific. A topical error log during log rolling: {quote}at kafka.log.Log$$anonfun$deleteSegments$1$$anonfun$apply$mcI$sp$1.apply(Log.scala:1170) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at kafka.log.Log$$anonfun$deleteSegments$1.apply$mcI$sp(Log.scala:1170) at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:1161) at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:1161) at kafka.log.Log.maybeHandleIOException(Log.scala:1678) at kafka.log.Log.deleteSegments(Log.scala:1161) at kafka.log.Log.deleteOldSegments(Log.scala:1156) at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1228) at kafka.log.Log.deleteOldSegments(Log.scala:1222) at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:854) at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:852) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) at scala.collection.immutable.List.foreach(List.scala:392) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) at kafka.log.LogManager.cleanupLogs(LogManager.scala:852) at kafka.log.LogManager$$anonfun$startup$1.apply$mcV$sp(LogManager.scala:385) at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110) at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.runAndReset(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Suppressed: java.nio.file.FileSystemException: Z:\tmp\kafka-logs\test-0\00000000000000000000.log -> Z:\tmp\kafka-logs\test-0\00000000000000000000.log.deleted: The process cannot access the file because it is being used by another process. at sun.nio.fs.WindowsException.translateToIOException(Unknown Source) at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) at sun.nio.fs.WindowsFileCopy.move(Unknown Source) at sun.nio.fs.WindowsFileSystemProvider.move(Unknown Source) at java.nio.file.Files.move(Unknown Source) at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:694) ... 32 more {quote} And the log from consumer offset log cleaner: {quote}[2019-02-26 02:16:05,151] ERROR Failed to clean up log for __consumer_offsets-29 in dir C:\Program Files (x86)\...\tmp\kafka-logs due to IOException (kafka.server.LogDirFailureChannel) java.nio.file.FileSystemException: C:\Program Files (x86)\...\tmp\kafka-logs__consumer_offsets-29\00000000000000000000.log.cleaned -> C:\Program Files (x86)...\tmp\kafka-logs__consumer_offsets-29\00000000000000000000.log.swap: The process cannot access the file because it is being used by another process. at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:387) at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287) at java.nio.file.Files.move(Files.java:1395) at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:697) at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:212) at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:415) at kafka.log.Log.replaceSegments(Log.scala:1644) at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:535) at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:462) at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:461) at scala.collection.immutable.List.foreach(List.scala:392) at kafka.log.Cleaner.doClean(LogCleaner.scala:461) at kafka.log.Cleaner.clean(LogCleaner.scala:438) at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:305) at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:291) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) Suppressed: java.nio.file.FileSystemException: C:\Program Files (x86)\...\tmp\kafka-logs__consumer_offsets-29\00000000000000000000.log.cleaned -> C:\Program Files (x86)\...\tmp\kafka-logs__consumer_offsets-29\00000000000000000000.log.swap: The process cannot access the file because it is being used by another process. at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:301) at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287) at java.nio.file.Files.move(Files.java:1395) at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:694) ... 12 more {quote} Some users have already added the comments in [KAFKA-6188|https://issues.apache.org/jira/browse/KAFKA-6188] , but there is no feedback from the community. was: When consumer offset log cleaning or Kafka log rolling triggered on Windows(The default rolling time is 168 hours, eg 1 week). Kafka broker will shut down due to file renaming fail, on Windows is't invalid to rename a file when it's opened, so this issue is Windows specific. A topical error log during log rolling: {quote}at kafka.log.Log$$anonfun$deleteSegments$1$$anonfun$apply$mcI$sp$1.apply(Log.scala:1170) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at kafka.log.Log$$anonfun$deleteSegments$1.apply$mcI$sp(Log.scala:1170) at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:1161) at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:1161) at kafka.log.Log.maybeHandleIOException(Log.scala:1678) at kafka.log.Log.deleteSegments(Log.scala:1161) at kafka.log.Log.deleteOldSegments(Log.scala:1156) at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1228) at kafka.log.Log.deleteOldSegments(Log.scala:1222) at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:854) at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:852) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) at scala.collection.immutable.List.foreach(List.scala:392) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) at kafka.log.LogManager.cleanupLogs(LogManager.scala:852) at kafka.log.LogManager$$anonfun$startup$1.apply$mcV$sp(LogManager.scala:385) at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110) at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.runAndReset(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Suppressed: java.nio.file.FileSystemException: Z:\tmp\kafka-logs\test-0\00000000000000000000.log -> Z:\tmp\kafka-logs\test-0\00000000000000000000.log.deleted: The process cannot access the file because it is being used by another process. at sun.nio.fs.WindowsException.translateToIOException(Unknown Source) at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) at sun.nio.fs.WindowsFileCopy.move(Unknown Source) at sun.nio.fs.WindowsFileSystemProvider.move(Unknown Source) at java.nio.file.Files.move(Unknown Source) at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:694) ... 32 more{quote} And the log from consumer offset log cleaner: {quote}[2019-02-26 02:16:05,151] ERROR Failed to clean up log for __consumer_offsets-29 in dir C:\Program Files (x86)\...\tmp\kafka-logs due to IOException (kafka.server.LogDirFailureChannel) java.nio.file.FileSystemException: C:\Program Files (x86)\...\tmp\kafka-logs\__consumer_offsets-29\00000000000000000000.log.cleaned -> C:\Program Files (x86)...\tmp\kafka-logs\__consumer_offsets-29\00000000000000000000.log.swap: The process cannot access the file because it is being used by another process. at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:387) at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287) at java.nio.file.Files.move(Files.java:1395) at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:697) at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:212) at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:415) at kafka.log.Log.replaceSegments(Log.scala:1644) at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:535) at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:462) at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:461) at scala.collection.immutable.List.foreach(List.scala:392) at kafka.log.Cleaner.doClean(LogCleaner.scala:461) at kafka.log.Cleaner.clean(LogCleaner.scala:438) at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:305) at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:291) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) Suppressed: java.nio.file.FileSystemException: C:\Program Files (x86)\...\tmp\kafka-logs\__consumer_offsets-29\00000000000000000000.log.cleaned -> C:\Program Files (x86)\...\tmp\kafka-logs\__consumer_offsets-29\00000000000000000000.log.swap: The process cannot access the file because it is being used by another process. at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:301) at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287) at java.nio.file.Files.move(Files.java:1395) at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:694) ... 12 more{quote} Some users have already added the comments in[ KAFKA-6188|https://issues.apache.org/jira/browse/KAFKA-6188] , but there is no feedback from the community. > Broker fails with FATAL Shutdown on Windows, log or index renaming fail > ----------------------------------------------------------------------- > > Key: KAFKA-8145 > URL: https://issues.apache.org/jira/browse/KAFKA-8145 > Project: Kafka > Issue Type: Bug > Components: core, log > Affects Versions: 1.1.0, 1.1.1, 2.0.1, 2.1.1 > Reporter: prehistoricpenguin > Priority: Major > Labels: windows > Original Estimate: 168h > Remaining Estimate: 168h > > When consumer offset log cleaning or Kafka log rolling triggered on > Windows(The default rolling time is 168 hours, eg 1 week). Kafka broker will > shut down due to file renaming fail, on Windows is't invalid to rename a file > when it's opened, so this issue is Windows specific. > A topical error log during log rolling: > {quote}at > kafka.log.Log$$anonfun$deleteSegments$1$$anonfun$apply$mcI$sp$1.apply(Log.scala:1170) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > at kafka.log.Log$$anonfun$deleteSegments$1.apply$mcI$sp(Log.scala:1170) > at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:1161) > at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:1161) > at kafka.log.Log.maybeHandleIOException(Log.scala:1678) > at kafka.log.Log.deleteSegments(Log.scala:1161) > at kafka.log.Log.deleteOldSegments(Log.scala:1156) > at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1228) > at kafka.log.Log.deleteOldSegments(Log.scala:1222) > at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:854) > at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:852) > at > scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) > at scala.collection.immutable.List.foreach(List.scala:392) > at > scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) > at kafka.log.LogManager.cleanupLogs(LogManager.scala:852) > at kafka.log.LogManager$$anonfun$startup$1.apply$mcV$sp(LogManager.scala:385) > at > kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110) > at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62) > at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) > at java.util.concurrent.FutureTask.runAndReset(Unknown Source) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown > Source) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown > Source) > at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) > at java.lang.Thread.run(Unknown Source) > Suppressed: java.nio.file.FileSystemException: > Z:\tmp\kafka-logs\test-0\00000000000000000000.log -> > Z:\tmp\kafka-logs\test-0\00000000000000000000.log.deleted: The process cannot > access the file because it is being used by another process. > at sun.nio.fs.WindowsException.translateToIOException(Unknown Source) > at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) > at sun.nio.fs.WindowsFileCopy.move(Unknown Source) > at sun.nio.fs.WindowsFileSystemProvider.move(Unknown Source) > at java.nio.file.Files.move(Unknown Source) > at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:694) > ... 32 more > {quote} > > And the log from consumer offset log cleaner: > {quote}[2019-02-26 02:16:05,151] ERROR Failed to clean up log for > __consumer_offsets-29 in dir C:\Program Files (x86)\...\tmp\kafka-logs due to > IOException (kafka.server.LogDirFailureChannel) > java.nio.file.FileSystemException: C:\Program Files > (x86)\...\tmp\kafka-logs__consumer_offsets-29\00000000000000000000.log.cleaned > -> C:\Program Files > (x86)...\tmp\kafka-logs__consumer_offsets-29\00000000000000000000.log.swap: > The process cannot access the file because it is being used by another > process. > at > sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) > at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) > at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:387) > at > sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287) > at java.nio.file.Files.move(Files.java:1395) > at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:697) > at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:212) > at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:415) > at kafka.log.Log.replaceSegments(Log.scala:1644) > at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:535) > at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:462) > at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:461) > at scala.collection.immutable.List.foreach(List.scala:392) > at kafka.log.Cleaner.doClean(LogCleaner.scala:461) > at kafka.log.Cleaner.clean(LogCleaner.scala:438) > at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:305) > at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:291) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) > Suppressed: java.nio.file.FileSystemException: C:\Program Files > (x86)\...\tmp\kafka-logs__consumer_offsets-29\00000000000000000000.log.cleaned > -> C:\Program Files > (x86)\...\tmp\kafka-logs__consumer_offsets-29\00000000000000000000.log.swap: > The process cannot access the file because it is being used by another > process. > at > sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) > at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) > at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:301) > at > sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287) > at java.nio.file.Files.move(Files.java:1395) > at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:694) > ... 12 more > {quote} > Some users have already added the comments in > [KAFKA-6188|https://issues.apache.org/jira/browse/KAFKA-6188] , but there is > no feedback from the community. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)