[
https://issues.apache.org/jira/browse/SPARK-33133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18046486#comment-18046486
]
Ram commented on SPARK-33133:
-----------------------------
I am getting this error in 3.5.4 and have tried different approaches, but it
is not getting fixed. I also do not understand why it is generating with_1.
> History server fails when loading invalid rolling event logs
> ------------------------------------------------------------
>
> Key: SPARK-33133
> URL: https://issues.apache.org/jira/browse/SPARK-33133
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.0.1
> Reporter: Adam Binford
> Priority: Major
>
> We have run into an issue where our history server fails to load new
> applications, and when restarted, fails to load any applications at all. This
> happens when it encounters invalid rolling event log files. We encounter this
> with long-running streaming applications. There seem to be two issues here
> that lead to problems:
> * It looks like our long-running streaming applications' event log directories
> are being cleaned up. The next time the application logs event data, it
> recreates the event log directory but without recreating the "appstatus"
> file. I don't know the full extent of this behavior or whether something
> "wrong" is happening here.
> * The history server then reads this new folder, and throws an exception
> because the "appstatus" file doesn't exist in the rolling event log folder.
> This exception breaks the entire listing process, so no new applications will
> be read, and if restarted no applications at all will be read.
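The failure mode described above can be reproduced outside Spark for illustration. The class and method names below, and the exact "appstatus" marker prefix, are assumptions of this plain-Java sketch, not Spark's actual reader code; the point is that a hard requirement on the marker file propagates out of the listing loop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RollingLogRequire {
    // Minimal reproduction of the hard requirement: listing a rolling event
    // log directory fails outright when no appstatus marker file is present.
    // (The "appstatus" prefix mirrors the marker Spark writes; the helper
    // name is made up for this sketch.)
    static List<Path> filesOrFail(Path logDir) throws IOException {
        try (Stream<Path> entries = Files.list(logDir)) {
            List<Path> files = entries.collect(Collectors.toList());
            boolean hasMarker = files.stream()
                    .anyMatch(p -> p.getFileName().toString().startsWith("appstatus"));
            if (!hasMarker) {
                // This is the exception the listing loop does not catch, so
                // one bad directory aborts the whole scan.
                throw new IllegalArgumentException(
                        "requirement failed: Log directory must contain an appstatus file!");
            }
            return files;
        }
    }

    public static void main(String[] args) throws IOException {
        // A recreated directory whose appstatus marker was never rewritten.
        Path dir = Files.createTempDirectory("eventlog_v2_app-1");
        try {
            filesOrFail(dir);
        } catch (IllegalArgumentException e) {
            System.out.println("scan aborted: " + e.getMessage());
        }
    }
}
```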
> There seem to be a couple of ways to fix this, and I'm curious about the
> thoughts of anyone who knows more about how the history server works,
> specifically with rolling event logs:
> * Don't completely fail the check for new applications when one bad rolling
> event log folder is encountered. This seems like the simplest fix and makes
> sense to me; the check already catches a few other errors and ignores them.
> It doesn't necessarily fix the underlying issue that leads to this happening,
> though.
> * Figure out why the in-progress event log folder is being deleted and make
> sure that doesn't happen. Maybe this is supposed to happen? Or maybe we
> should not delete the top-level folder and only delete event log files within
> it? Again, I don't know the exact current behavior here.
> * When writing new event log data, make sure the folder and appstatus file
> exist every time, creating them again if not.
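The first and third options above can be sketched in isolation. This is a plain-Java illustration under stated assumptions: the directory layout, the "appstatus" marker prefix, and all names here are invented for the sketch and do not reflect Spark's actual API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class SafeEventLogScan {
    // Fix idea 1: treat a missing appstatus marker as "skip this directory"
    // instead of letting one bad directory abort the whole listing pass.
    static List<Path> listValidLogDirs(Path root) throws IOException {
        List<Path> valid = new ArrayList<>();
        try (Stream<Path> dirs = Files.list(root)) {
            for (Path dir : (Iterable<Path>) dirs::iterator) {
                if (!Files.isDirectory(dir)) continue;
                try (Stream<Path> entries = Files.list(dir)) {
                    boolean hasMarker = entries.anyMatch(
                            p -> p.getFileName().toString().startsWith("appstatus"));
                    if (hasMarker) {
                        valid.add(dir);
                    } else {
                        System.err.println("Skipping invalid rolling event log dir: " + dir);
                    }
                } catch (IOException e) {
                    System.err.println("Skipping unreadable dir " + dir + ": " + e);
                }
            }
        }
        return valid;
    }

    // Fix idea 3: before every write, make sure the log directory and its
    // appstatus marker exist, recreating them if they were cleaned up.
    static void ensureAppStatus(Path logDir, String appId) throws IOException {
        Files.createDirectories(logDir); // no-op if it already exists
        Path marker = logDir.resolve("appstatus_" + appId);
        if (Files.notExists(marker)) {
            Files.createFile(marker);
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("eventlogs");
        Path good = Files.createDirectory(root.resolve("eventlog_v2_app-1"));
        Files.createFile(good.resolve("appstatus_app-1"));
        Files.createDirectory(root.resolve("eventlog_v2_app-2")); // no marker
        // app-2 is skipped with a warning rather than failing the scan.
        System.out.println("valid dirs: " + listValidLogDirs(root).size());
    }
}
```

Either change alone would keep the history server alive; doing both also repairs the directories so the application remains visible after a restart.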
> Here's the stack trace we encounter when this happens, from 3.0.1 with a
> couple of extra MRs backported that I hoped would fix the issue:
> {{2020-10-13 12:10:31,751 ERROR history.FsHistoryProvider: Exception in checking for event log updates
> java.lang.IllegalArgumentException: requirement failed: Log directory must contain an appstatus file!
>     at scala.Predef$.require(Predef.scala:281)
>     at org.apache.spark.deploy.history.RollingEventLogFilesFileReader.files$lzycompute(EventLogFileReaders.scala:214)
>     at org.apache.spark.deploy.history.RollingEventLogFilesFileReader.files(EventLogFileReaders.scala:211)
>     at org.apache.spark.deploy.history.RollingEventLogFilesFileReader.eventLogFiles$lzycompute(EventLogFileReaders.scala:221)
>     at org.apache.spark.deploy.history.RollingEventLogFilesFileReader.eventLogFiles(EventLogFileReaders.scala:220)
>     at org.apache.spark.deploy.history.RollingEventLogFilesFileReader.lastEventLogFile(EventLogFileReaders.scala:272)
>     at org.apache.spark.deploy.history.RollingEventLogFilesFileReader.fileSizeForLastIndex(EventLogFileReaders.scala:240)
>     at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$checkForLogs$7(FsHistoryProvider.scala:524)
>     at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$checkForLogs$7$adapted(FsHistoryProvider.scala:466)
>     at scala.collection.TraversableLike.$anonfun$filterImpl$1(TraversableLike.scala:256)
>     at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>     at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>     at scala.collection.TraversableLike.filterImpl(TraversableLike.scala:255)
>     at scala.collection.TraversableLike.filterImpl$(TraversableLike.scala:249)
>     at scala.collection.AbstractTraversable.filterImpl(Traversable.scala:108)
>     at scala.collection.TraversableLike.filter(TraversableLike.scala:347)
>     at scala.collection.TraversableLike.filter$(TraversableLike.scala:347)
>     at scala.collection.AbstractTraversable.filter(Traversable.scala:108)
>     at org.apache.spark.deploy.history.FsHistoryProvider.checkForLogs(FsHistoryProvider.scala:466)
>     at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$startPolling$3(FsHistoryProvider.scala:287)
>     at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1302)
>     at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$getRunner$1(FsHistoryProvider.scala:210)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)}}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)