[ https://issues.apache.org/jira/browse/SPARK-24150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
William Montaz updated SPARK-24150:
-----------------------------------
    Priority: Major  (was: Minor)

> Race condition in FsHistoryProvider
> -----------------------------------
>
>                 Key: SPARK-24150
>                 URL: https://issues.apache.org/jira/browse/SPARK-24150
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: William Montaz
>            Priority: Major
>
> There exists a race condition in the checkLogs method between threads of
> replayExecutor. They synchronise on the field "applications", but they
> also reassign that field.
> As a result, threads eventually synchronise on different monitors (each
> thread locks whichever object was referenced by "applications" when it
> entered the block), which defeats the intended mutual exclusion. The race
> is more likely to reproduce when number_new_log_files >
> replayExecutor_pool_size.
> Workaround:
>  * use a permanent object as the monitor on which to synchronise (or
>    synchronise on `this`)
>  * keep the field volatile for all other read accesses
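A minimal sketch of the workaround described above, written in Java only because the monitor semantics are the same across the JVM (FsHistoryProvider itself is Scala). This is not the actual Spark code: ApplicationListing, addApplication, getApplications and runDemo are hypothetical names chosen for illustration, and the list of strings stands in for whatever "applications" really holds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the listing state; not the actual FsHistoryProvider code.
class ApplicationListing {
    // Permanent monitor: never reassigned, so every writer locks the same object.
    // The broken pattern would be `synchronized (applications) { applications = ...; }`,
    // where each reassignment hands later threads a different monitor.
    private final Object lock = new Object();

    // Volatile so that lock-free readers always see the latest published snapshot.
    private volatile List<String> applications = Collections.emptyList();

    // Copy-on-write update: read, copy, append and publish under one stable lock.
    void addApplication(String appId) {
        synchronized (lock) {
            List<String> updated = new ArrayList<>(applications);
            updated.add(appId);
            applications = Collections.unmodifiableList(updated);
        }
    }

    // Readers take no lock; the volatile read returns an immutable snapshot.
    List<String> getApplications() {
        return applications;
    }

    // Starts `threads` writers that each add `perThread` entries,
    // waits for all of them, and returns the final number of entries.
    static int runDemo(int threads, int perThread) {
        ApplicationListing listing = new ApplicationListing();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    listing.addApplication("app-" + id + "-" + i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return listing.getApplications().size();
    }
}
```

With the stable lock, `ApplicationListing.runDemo(threads, perThread)` should always return `threads * perThread`, because no update can be lost. Replacing `synchronized (lock)` with `synchronized (applications)` reintroduces the reported pattern, and the count may then come up short under contention.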