[spark] branch master updated (ec34a00 -> 77a8efb)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from ec34a00  [SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVersionsSuite on Python 3.8/3.9
     add 77a8efb  [SPARK-32932][SQL] Do not use local shuffle reader at final stage on write command

No new revisions were added by this update.

Summary of changes:
 .../execution/adaptive/AdaptiveSparkPlanExec.scala | 14 +-
 .../adaptive/AdaptiveQueryExecSuite.scala          | 51 +-
 2 files changed, 63 insertions(+), 2 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
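For context, SPARK-32932 keeps adaptive query execution (AQE) from converting the final shuffle of a write command into a local shuffle reader, so the number of output files still reflects the partitioning the user asked for. Below is a minimal sketch of the kind of query the fix targets; it is not from the commit, the output path and app name are illustrative, and only the standard `spark.sql.adaptive.enabled` switch is used.

    import org.apache.spark.sql.SparkSession

    object AqeWriteSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("aqe-write-sketch")
          .master("local[*]")
          // Enable adaptive query execution. With the fix above, AQE no longer
          // replaces the final shuffle of a write with a local shuffle reader.
          .config("spark.sql.adaptive.enabled", "true")
          .getOrCreate()

        // repartition(5) introduces a shuffle right before the write; a local
        // shuffle reader at this final stage could change how many files are
        // written, defeating the explicit repartitioning.
        spark.range(0, 1000).toDF("id")
          .repartition(5)
          .write
          .mode("overwrite")
          .parquet("/tmp/aqe-write-sketch")

        spark.stop()
      }
    }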
[spark] branch branch-3.0 updated: Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new e40c147  Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"

e40c147 is described below

commit e40c147a5d194adbba13f12590959dc68347ec14
Author: Dongjoon Hyun
AuthorDate: Wed Oct 14 21:47:46 2020 -0700

    Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"

    This reverts commit d9669bdf0ff4ed9951d7077b8dc9ad94507615c5.
---
 .../spark/deploy/history/FsHistoryProvider.scala   |  3 --
 .../deploy/history/FsHistoryProviderSuite.scala    | 49 ----------------------
 2 files changed, 52 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index 5970708..c262152 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -526,9 +526,6 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
           reader.fileSizeForLastIndex > 0
         } catch {
           case _: FileNotFoundException => false
-          case NonFatal(e) =>
-            logWarning(s"Error while reading new log ${reader.rootPath}", e)
-            false
         }
       case _: FileNotFoundException =>
diff --git a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
index f3beb35..c2f34fc 100644
--- a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
@@ -1470,55 +1470,6 @@ class FsHistoryProviderSuite extends SparkFunSuite with Matchers with Logging {
     }
   }
 
-  test("SPARK-33146: don't let one bad rolling log folder prevent loading other applications") {
-    withTempDir { dir =>
-      val conf = createTestConf(true)
-      conf.set(HISTORY_LOG_DIR, dir.getAbsolutePath)
-      val hadoopConf = SparkHadoopUtil.newConfiguration(conf)
-      val fs = new Path(dir.getAbsolutePath).getFileSystem(hadoopConf)
-
-      val provider = new FsHistoryProvider(conf)
-
-      val writer = new RollingEventLogFilesWriter("app", None, dir.toURI, conf, hadoopConf)
-      writer.start()
-
-      writeEventsToRollingWriter(writer, Seq(
-        SparkListenerApplicationStart("app", Some("app"), 0, "user", None),
-        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
-      provider.checkForLogs()
-      provider.cleanLogs()
-      assert(dir.listFiles().size === 1)
-      assert(provider.getListing.length === 1)
-
-      // Manually delete the appstatus file to make an invalid rolling event log
-      val appStatusPath = RollingEventLogFilesWriter.getAppStatusFilePath(new Path(writer.logPath),
-        "app", None, true)
-      fs.delete(appStatusPath, false)
-      provider.checkForLogs()
-      provider.cleanLogs()
-      assert(provider.getListing.length === 0)
-
-      // Create a new application
-      val writer2 = new RollingEventLogFilesWriter("app2", None, dir.toURI, conf, hadoopConf)
-      writer2.start()
-      writeEventsToRollingWriter(writer2, Seq(
-        SparkListenerApplicationStart("app2", Some("app2"), 0, "user", None),
-        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
-
-      // Both folders exist but only one application found
-      provider.checkForLogs()
-      provider.cleanLogs()
-      assert(provider.getListing.length === 1)
-      assert(dir.listFiles().size === 2)
-
-      // Make sure a new provider sees the valid application
-      provider.stop()
-      val newProvider = new FsHistoryProvider(conf)
-      newProvider.checkForLogs()
-      assert(newProvider.getListing.length === 1)
-    }
-  }
-
   /**
    * Asks the provider to check for logs and calls a function to perform checks on the updated
    * app list. Example:

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVersionsSuite on Python 3.8/3.9
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 0b7b811  [SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVersionsSuite on Python 3.8/3.9

0b7b811 is described below

commit 0b7b811c6464ac8a0fe5230dc49aefc2f5507db8
Author: Dongjoon Hyun
AuthorDate: Wed Oct 14 20:48:13 2020 -0700

    [SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVersionsSuite on Python 3.8/3.9

    ### What changes were proposed in this pull request?

    This PR aims to ignore the Apache Spark 2.4.x distribution in HiveExternalCatalogVersionsSuite if the Python version is 3.8 or 3.9.

    ### Why are the changes needed?

    Currently, `HiveExternalCatalogVersionsSuite` is broken on recent operating systems such as `Ubuntu 20.04`, whose default Python version is 3.8. PySpark 2.4.x doesn't work on Python 3.8 due to SPARK-29536.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually.

    ```
    $ python3 --version
    Python 3.8.5

    $ build/sbt "hive/testOnly *.HiveExternalCatalogVersionsSuite"
    ...
    [info] All tests passed.
    [info] Passed: Total 1, Failed 0, Errors 0, Passed 1
    ```

    Closes #30044 from dongjoon-hyun/SPARK-33153.

    Authored-by: Dongjoon Hyun
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit ec34a001ad0ef57a496f29a6523d905128875b17)
    Signed-off-by: Dongjoon Hyun
---
 core/src/main/scala/org/apache/spark/TestUtils.scala       | 13 +
 .../spark/sql/hive/HiveExternalCatalogVersionsSuite.scala  |  3 ++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/TestUtils.scala b/core/src/main/scala/org/apache/spark/TestUtils.scala
index 1e00769..054e7b0 100644
--- a/core/src/main/scala/org/apache/spark/TestUtils.scala
+++ b/core/src/main/scala/org/apache/spark/TestUtils.scala
@@ -249,6 +249,19 @@ private[spark] object TestUtils {
     attempt.isSuccess && attempt.get == 0
   }
 
+  def isPythonVersionAtLeast38(): Boolean = {
+    val attempt = if (Utils.isWindows) {
+      Try(Process(Seq("cmd.exe", "/C", "python3 --version"))
+        .run(ProcessLogger(s => s.startsWith("Python 3.8") || s.startsWith("Python 3.9")))
+        .exitValue())
+    } else {
+      Try(Process(Seq("sh", "-c", "python3 --version"))
+        .run(ProcessLogger(s => s.startsWith("Python 3.8") || s.startsWith("Python 3.9")))
+        .exitValue())
+    }
+    attempt.isSuccess && attempt.get == 0
+  }
+
   /**
    * Returns the response code from an HTTP(S) URL.
    */
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
index cbfdb7f..b81b7e8 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
@@ -234,7 +234,7 @@ object PROCESS_TABLES extends QueryTest with SQLTestUtils {
   // Tests the latest version of every release line.
   val testingVersions: Seq[String] = {
     import scala.io.Source
-    try {
+    val versions: Seq[String] = try {
       Source.fromURL(s"${releaseMirror}/spark").mkString
         .split("\n")
         .filter(_.contains("""
       Seq("3.0.1", "2.4.7") // A temporary fallback to use a specific version
     }
+    versions.filter(v => v.startsWith("3") || !TestUtils.isPythonVersionAtLeast38())
   }
 
   protected var spark: SparkSession = _

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
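For readers who want the gist of the detection trick without the ProcessLogger plumbing, here is a standalone sketch, not the patch itself, that shells out to `python3 --version` and parses the printed version to decide whether the local interpreter is 3.8 or newer. It assumes the version string goes to stdout, which holds for modern CPython.

    import scala.sys.process._
    import scala.util.Try

    object PythonVersionCheck {
      def isPythonAtLeast38: Boolean =
        // `!!` runs the command and captures stdout, throwing on a nonzero
        // exit code; Try turns a missing python3 into a plain `false`.
        Try("python3 --version".!!.trim).toOption.exists { line =>
          // Expected form: "Python 3.8.5"
          line.stripPrefix("Python ").split('.').toList match {
            case major :: minor :: _ =>
              Try(major.toInt > 3 || (major.toInt == 3 && minor.toInt >= 8))
                .getOrElse(false)
            case _ => false
          }
        }

      def main(args: Array[String]): Unit =
        println(s"python3 >= 3.8: $isPythonAtLeast38")
    }

Parsing the version out of the string is a bit more general than the patch's prefix check, which only matches "Python 3.8" or "Python 3.9" literally; both approaches answer the same question for the versions in play here.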
[spark] branch master updated (9ab0ec4 -> ec34a00)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 9ab0ec4  [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS
     add ec34a00  [SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVersionsSuite on Python 3.8/3.9

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/TestUtils.scala       | 13 +
 .../spark/sql/hive/HiveExternalCatalogVersionsSuite.scala  |  3 ++-
 2 files changed, 15 insertions(+), 1 deletion(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 816ffd7  [SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

816ffd7 is described below

commit 816ffd7f10eaaaba334ca90554cda75beca8083f
Author: Jungtaek Lim (HeartSaVioR)
AuthorDate: Wed Oct 14 20:09:37 2020 -0700

    [SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

    ### What changes were proposed in this pull request?

    This PR fixes a bug where `DataType.equalsIgnoreCompatibleNullability` was called with mistakenly swapped parameters in `AppendData.outputResolved`. The parameters of `DataType.equalsIgnoreCompatibleNullability` are `from` and `to`, so the correct order of the matched variables is `inAttr`, then `outAttr`.

    ### Why are the changes needed?

    Although the problematic part is dead code, we would prefer to fix a bug once we know about it.

    ### Does this PR introduce _any_ user-facing change?

    No; this only fixes dead code.

    ### How was this patch tested?

    This patch fixes dead code, for which it is not easy to craft a test. (The test in the original commit is no longer valid here.)

    Closes #30043 from HeartSaVioR/SPARK-33136-branch-2.4.

    Authored-by: Jungtaek Lim (HeartSaVioR)
    Signed-off-by: Dongjoon Hyun
---
 .../sql/catalyst/plans/logical/basicLogicalOperators.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
index 1c25586..a0086c1 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
@@ -378,7 +378,7 @@ case class AppendData(
       case (inAttr, outAttr) =>
         // names and types must match, nullability must be compatible
         inAttr.name == outAttr.name &&
-          DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, inAttr.dataType) &&
+          DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, outAttr.dataType) &&
          (outAttr.nullable || !inAttr.nullable)
     }
   }

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
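`DataType.equalsIgnoreCompatibleNullability` is internal to Spark, but the direction-sensitivity it encodes is easy to show with a toy stand-in. In the sketch below, `ArrayT` and `compatible` are illustrative, not Spark's types: data flowing `from` a source `to` a sink may only relax nullability, never tighten it, which is why the argument order matters.

    // Toy stand-in for an array type: an element type name plus whether the
    // array may contain nulls.
    final case class ArrayT(element: String, containsNull: Boolean)

    object NullabilityOrder {
      // Compatible when writing `from` into `to`: every nullable part of
      // `from` must also be nullable in `to`.
      def compatible(from: ArrayT, to: ArrayT): Boolean =
        from.element == to.element && (to.containsNull || !from.containsNull)

      def main(args: Array[String]): Unit = {
        val nonNullInts  = ArrayT("int", containsNull = false)
        val nullableInts = ArrayT("int", containsNull = true)

        // Safe: non-null data always fits in a nullable slot.
        assert(compatible(from = nonNullInts, to = nullableInts))
        // Unsafe: nullable data must not land in a non-null slot. With the
        // arguments swapped -- the bug being fixed -- this check would pass.
        assert(!compatible(from = nullableInts, to = nonNullInts))
        println("nullability direction checks hold")
      }
    }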
[spark] branch branch-3.0 updated: [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new d9669bd  [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS

d9669bd is described below

commit d9669bdf0ff4ed9951d7077b8dc9ad94507615c5
Author: Adam Binford
AuthorDate: Thu Oct 15 11:59:29 2020 +0900

    [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS

    ### What changes were proposed in this pull request?

    Adds an additional check for non-fatal errors when attempting to add a new entry to the history server application listing.

    ### Why are the changes needed?

    A bad rolling event log folder (missing appstatus file or no log files) would cause no applications to be loaded by the Spark history server. Figuring out why invalid event log folders are created in the first place will be addressed in separate issues; this change just lets the history server skip the invalid folder and successfully load all the valid applications.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    New UT.

    Closes #30037 from Kimahriman/bug/rolling-log-crashing-history.

    Authored-by: Adam Binford
    Signed-off-by: Jungtaek Lim (HeartSaVioR)
    (cherry picked from commit 9ab0ec4e38e5df0537b38cb0f89e004ad57bec90)
    Signed-off-by: Jungtaek Lim (HeartSaVioR)
---
 .../spark/deploy/history/FsHistoryProvider.scala   |  3 ++
 .../deploy/history/FsHistoryProviderSuite.scala    | 49 ++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index c262152..5970708 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -526,6 +526,9 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
           reader.fileSizeForLastIndex > 0
         } catch {
           case _: FileNotFoundException => false
+          case NonFatal(e) =>
+            logWarning(s"Error while reading new log ${reader.rootPath}", e)
+            false
         }
       case _: FileNotFoundException =>
diff --git a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
index c2f34fc..f3beb35 100644
--- a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
@@ -1470,6 +1470,55 @@ class FsHistoryProviderSuite extends SparkFunSuite with Matchers with Logging {
     }
   }
 
+  test("SPARK-33146: don't let one bad rolling log folder prevent loading other applications") {
+    withTempDir { dir =>
+      val conf = createTestConf(true)
+      conf.set(HISTORY_LOG_DIR, dir.getAbsolutePath)
+      val hadoopConf = SparkHadoopUtil.newConfiguration(conf)
+      val fs = new Path(dir.getAbsolutePath).getFileSystem(hadoopConf)
+
+      val provider = new FsHistoryProvider(conf)
+
+      val writer = new RollingEventLogFilesWriter("app", None, dir.toURI, conf, hadoopConf)
+      writer.start()
+
+      writeEventsToRollingWriter(writer, Seq(
+        SparkListenerApplicationStart("app", Some("app"), 0, "user", None),
+        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(dir.listFiles().size === 1)
+      assert(provider.getListing.length === 1)
+
+      // Manually delete the appstatus file to make an invalid rolling event log
+      val appStatusPath = RollingEventLogFilesWriter.getAppStatusFilePath(new Path(writer.logPath),
+        "app", None, true)
+      fs.delete(appStatusPath, false)
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(provider.getListing.length === 0)
+
+      // Create a new application
+      val writer2 = new RollingEventLogFilesWriter("app2", None, dir.toURI, conf, hadoopConf)
+      writer2.start()
+      writeEventsToRollingWriter(writer2, Seq(
+        SparkListenerApplicationStart("app2", Some("app2"), 0, "user", None),
+        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
+
+      // Both folders exist but only one application found
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(provider.getListing.length === 1)
+      assert(dir.listFiles().size === 2)
+
+      // Make sure a new provider sees the valid application
+      provider.stop()
+      val newProvider = new FsHistoryProvider(conf)
+      newProvider.checkForLogs()
+      assert(newProvider.getListing.length === 1)
+    }
+  }
+
   /**
    * Asks the provider to check for logs and calls a function to perform checks on the updated
    * app list. Example:

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
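The essence of the added catch clause is the standard `scala.util.control.NonFatal` extractor: a recoverable failure while probing one log entry is logged and turns into "skip this entry", while fatal JVM errors still propagate and abort the scan. A minimal sketch of the same pattern follows; the `readLogSize` callback is a hypothetical stand-in for the provider's reader, not Spark's API.

    import java.io.FileNotFoundException
    import scala.util.control.NonFatal

    object SkipBadLogEntry {
      // Returns true when the log at `path` should be loaded; any non-fatal
      // failure (corrupt folder, permission problem, ...) means "skip it"
      // rather than failing the whole listing. NonFatal deliberately does not
      // match errors such as OutOfMemoryError, so those still propagate.
      def shouldLoad(path: String, readLogSize: String => Long): Boolean =
        try {
          readLogSize(path) > 0
        } catch {
          case _: FileNotFoundException => false
          case NonFatal(e) =>
            System.err.println(s"Error while reading new log $path: $e")
            false
        }

      def main(args: Array[String]): Unit = {
        assert(shouldLoad("good", _ => 42L))
        assert(!shouldLoad("bad", _ => throw new IllegalStateException("corrupt")))
        println("non-fatal errors were skipped")
      }
    }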
[spark] branch branch-2.4 updated: [SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 816ffd7  [SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved
816ffd7 is described below

commit 816ffd7f10eaaaba334ca90554cda75beca8083f
Author: Jungtaek Lim (HeartSaVioR)
AuthorDate: Wed Oct 14 20:09:37 2020 -0700

    [SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

    ### What changes were proposed in this pull request?

    This PR fixes a bug in `AppendData.outputResolved`: `DataType.equalsIgnoreCompatibleNullability` was being called with its parameters mistakenly swapped. The parameters of `DataType.equalsIgnoreCompatibleNullability` are `from` and `to`, so the right order of the matched variables is `inAttr`, then `outAttr`.

    ### Why are the changes needed?

    Although the problematic part is dead code, once we know there's a bug, we would prefer to fix it.

    ### Does this PR introduce _any_ user-facing change?

    No, as this fixes dead code.

    ### How was this patch tested?

    This patch fixes dead code, for which it is not easy to craft a test. (The test in the original commit is no longer valid here.)

    Closes #30043 from HeartSaVioR/SPARK-33136-branch-2.4.

    Authored-by: Jungtaek Lim (HeartSaVioR)
    Signed-off-by: Dongjoon Hyun
---
 .../apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
index 1c25586..a0086c1 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
@@ -378,7 +378,7 @@ case class AppendData(
       case (inAttr, outAttr) =>
         // names and types must match, nullability must be compatible
         inAttr.name == outAttr.name &&
-          DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, inAttr.dataType) &&
+          DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, outAttr.dataType) &&
          (outAttr.nullable || !inAttr.nullable)
     }
   }

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
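For reference, `DataType.equalsIgnoreCompatibleNullability(from, to)` asks whether data of type `from` may be written into a slot of type `to`: `to` may be nullable where `from` is not, but never the reverse. A toy model of that asymmetry (a sketch under that reading, not Spark's implementation):

// Model: writing `from` into `to` is legal when `to` is at least as permissive.
case class Col(name: String, nullable: Boolean)

def writeCompatible(from: Col, to: Col): Boolean =
  from.name == to.name && (to.nullable || !from.nullable)

// inAttr (query output) flows into outAttr (table column), so the correct
// call order is writeCompatible(inAttr, outAttr):
val inAttr  = Col("arr", nullable = false) // query output: non-nullable
val outAttr = Col("arr", nullable = true)  // table column: nullable
assert(writeCompatible(inAttr, outAttr))   // correct order: the write is legal
assert(!writeCompatible(outAttr, inAttr))  // swapped order rejects this legal write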
[spark] branch master updated (f3ad32f -> 9ab0ec4)
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from f3ad32f  [SPARK-33026][SQL][FOLLOWUP] metrics name should be numOutputRows
     add 9ab0ec4  [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS

No new revisions were added by this update.

Summary of changes:
 .../spark/deploy/history/FsHistoryProvider.scala   |  3 ++
 .../deploy/history/FsHistoryProviderSuite.scala    | 49 ++
 2 files changed, 52 insertions(+)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (8e5cb1d -> f3ad32f)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 8e5cb1d  [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved
     add f3ad32f  [SPARK-33026][SQL][FOLLOWUP] metrics name should be numOutputRows

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/execution/exchange/BroadcastExchangeExec.scala    | 8
 .../org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-33026][SQL][FOLLOWUP] metrics name should be numOutputRows
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new f3ad32f  [SPARK-33026][SQL][FOLLOWUP] metrics name should be numOutputRows
f3ad32f is described below

commit f3ad32f4b6fc55e89e7fb222ed565ad3e32d47c6
Author: Wenchen Fan
AuthorDate: Wed Oct 14 16:17:28 2020 +0000

    [SPARK-33026][SQL][FOLLOWUP] metrics name should be numOutputRows

    ### What changes were proposed in this pull request?

    Follow the convention and rename the metric `numRows` to `numOutputRows`.

    ### Why are the changes needed?

    `FilterExec`, `HashAggregateExec`, etc. all use `numOutputRows`.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Existing tests

    Closes #30039 from cloud-fan/minor.

    Authored-by: Wenchen Fan
    Signed-off-by: Wenchen Fan
---
 .../spark/sql/execution/exchange/BroadcastExchangeExec.scala    | 8
 .../org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
index 4b884df..0c5fee2 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
@@ -78,7 +78,7 @@ case class BroadcastExchangeExec(
   override lazy val metrics = Map(
     "dataSize" -> SQLMetrics.createSizeMetric(sparkContext, "data size"),
-    "numRows" -> SQLMetrics.createMetric(sparkContext, "number of rows"),
+    "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"),
     "collectTime" -> SQLMetrics.createTimingMetric(sparkContext, "time to collect"),
     "buildTime" -> SQLMetrics.createTimingMetric(sparkContext, "time to build"),
     "broadcastTime" -> SQLMetrics.createTimingMetric(sparkContext, "time to broadcast"))
@@ -91,8 +91,8 @@ case class BroadcastExchangeExec(
   override def runtimeStatistics: Statistics = {
     val dataSize = metrics("dataSize").value
-    val numRows = metrics("numRows").value
-    Statistics(dataSize, Some(numRows))
+    val rowCount = metrics("numOutputRows").value
+    Statistics(dataSize, Some(rowCount))
   }

   @transient
@@ -116,11 +116,11 @@ case class BroadcastExchangeExec(
         val beforeCollect = System.nanoTime()
         // Use executeCollect/executeCollectIterator to avoid conversion to Scala types
         val (numRows, input) = child.executeCollectIterator()
+        longMetric("numOutputRows") += numRows
         if (numRows >= MAX_BROADCAST_TABLE_ROWS) {
           throw new SparkException(
             s"Cannot broadcast the table over $MAX_BROADCAST_TABLE_ROWS rows: $numRows rows")
         }
-        longMetric("numRows") += numRows

         val beforeBuild = System.nanoTime()
         longMetric("collectTime") += NANOSECONDS.toMillis(beforeBuild - beforeCollect)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
index e404e46..4872906 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
@@ -751,7 +751,7 @@ class SQLMetricsSuite extends SharedSparkSession with SQLMetricsTestUtils
     }

     assert(exchanges.size === 1)
-    testMetricsInSparkPlanOperator(exchanges.head, Map("numRows" -> 2))
+    testMetricsInSparkPlanOperator(exchanges.head, Map("numOutputRows" -> 2))
   }
 }

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 2ebea13  [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved
2ebea13 is described below

commit 2ebea135a1f4e4cb3187ffc8e77d3f52d3b3a91a
Author: Jungtaek Lim (HeartSaVioR)
AuthorDate: Wed Oct 14 08:30:03 2020 -0700

    [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

    ### What changes were proposed in this pull request?

    This PR fixes a bug in `V2WriteCommand.outputResolved`: `DataType.equalsIgnoreCompatibleNullability` was being called with its parameters mistakenly swapped. The parameters of `DataType.equalsIgnoreCompatibleNullability` are `from` and `to`, so the right order of the matched variables is `inAttr`, then `outAttr`.

    ### Why are the changes needed?

    Spark throws an AnalysisException due to an unresolved operator in v2 writes; the operator is unresolved because the parameters passed to `DataType.equalsIgnoreCompatibleNullability` in `outputResolved` were swapped.

    ### Does this PR introduce _any_ user-facing change?

    Yes. End users no longer hit an unresolved-operator failure in v2 writes when writing a dataframe containing non-nullable complex types to a table whose matching complex types are nullable.

    ### How was this patch tested?

    New UT added.

    Closes #30033 from HeartSaVioR/SPARK-33136.

    Authored-by: Jungtaek Lim (HeartSaVioR)
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 8e5cb1d276686ec428e4e6aa1c3cfd6bb99e4e9a)
    Signed-off-by: Dongjoon Hyun
---
 .../sql/catalyst/plans/logical/v2Commands.scala    |  2 +-
 .../apache/spark/sql/DataFrameWriterV2Suite.scala  | 87 +-
 2 files changed, 84 insertions(+), 5 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
index b4120d9..d2830a9 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
@@ -45,7 +45,7 @@ trait V2WriteCommand extends Command {
       case (inAttr, outAttr) =>
         // names and types must match, nullability must be compatible
         inAttr.name == outAttr.name &&
-          DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, inAttr.dataType) &&
+          DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, outAttr.dataType) &&
          (outAttr.nullable || !inAttr.nullable)
     }
   }

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
index 508eefa..ff5c624 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
@@ -23,16 +23,15 @@ import scala.collection.JavaConverters._

 import org.scalatest.BeforeAndAfter

-import org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, NoSuchTableException, TableAlreadyExistsException}
-import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, OverwriteByExpression, OverwritePartitionsDynamic}
+import org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, NamedRelation, NoSuchTableException, TableAlreadyExistsException}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, OverwriteByExpression, OverwritePartitionsDynamic, V2WriteCommand}
 import org.apache.spark.sql.connector.{InMemoryTable, InMemoryTableCatalog}
 import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog}
 import org.apache.spark.sql.connector.expressions.{BucketTransform, DaysTransform, FieldReference, HoursTransform, IdentityTransform, LiteralValue, MonthsTransform, YearsTransform}
 import org.apache.spark.sql.execution.QueryExecution
 import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation
 import org.apache.spark.sql.test.SharedSparkSession
-import org.apache.spark.sql.types.{IntegerType, LongType, StringType, StructType}
-import org.apache.spark.sql.types.TimestampType
+import org.apache.spark.sql.types.{ArrayType, DataType, IntegerType, LongType, MapType, StringType, StructField, StructType, TimestampType}
 import org.apache.spark.sql.util.QueryExecutionListener
 import org.apache.spark.unsafe.types.UTF8String
 import org.apache.spark.util.Utils
@@ -101,6 +100,86 @@ class DataFrameWriterV2Suite extends QueryTest with SharedSparkSession
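The user-facing symptom is easiest to see with a non-nullable complex column. A hypothetical reproduction sketch (the catalog name `testcat`, the namespace, and the table are illustrative; they assume `spark.sql.catalog.testcat` points at a V2 catalog such as the test-only InMemoryTableCatalog, with `testcat.ns.t` declaring `arr` as array<long> with nullable elements):

import org.apache.spark.sql.functions.{array, col}

// array(col("id")) produces array<long> with non-nullable elements.
val df = spark.range(3).select(array(col("id")).as("arr"))
df.writeTo("testcat.ns.t").append()
// Before the fix, outputResolved compared the types in the wrong direction, so
// this append failed analysis with an "unresolved operator" AnalysisException.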
[spark] branch master updated (d8c4a47 -> 8e5cb1d)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from d8c4a47  [SPARK-33061][SQL] Expose inverse hyperbolic trig functions through sql.functions API
     add 8e5cb1d  [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/plans/logical/v2Commands.scala    |  2 +-
 .../apache/spark/sql/DataFrameWriterV2Suite.scala  | 87 +-
 2 files changed, 84 insertions(+), 5 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 2ebea13 [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved 2ebea13 is described below commit 2ebea135a1f4e4cb3187ffc8e77d3f52d3b3a91a Author: Jungtaek Lim (HeartSaVioR) AuthorDate: Wed Oct 14 08:30:03 2020 -0700 [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved ### What changes were proposed in this pull request? This PR proposes to fix a bug on calling `DataType.equalsIgnoreCompatibleNullability` with mistakenly swapped parameters in `V2WriteCommand.outputResolved`. The order of parameters for `DataType.equalsIgnoreCompatibleNullability` are `from` and `to`, which says that the right order of matching variables are `inAttr` and `outAttr`. ### Why are the changes needed? Spark throws AnalysisException due to unresolved operator in v2 write, while the operator is unresolved due to a bug that parameters to call `DataType.equalsIgnoreCompatibleNullability` in `outputResolved` have been swapped. ### Does this PR introduce _any_ user-facing change? Yes, end users no longer suffer on unresolved operator in v2 write if they're trying to write dataframe containing non-nullable complex types against table matching complex types as nullable. ### How was this patch tested? New UT added. Closes #30033 from HeartSaVioR/SPARK-33136. Authored-by: Jungtaek Lim (HeartSaVioR) Signed-off-by: Dongjoon Hyun (cherry picked from commit 8e5cb1d276686ec428e4e6aa1c3cfd6bb99e4e9a) Signed-off-by: Dongjoon Hyun --- .../sql/catalyst/plans/logical/v2Commands.scala| 2 +- .../apache/spark/sql/DataFrameWriterV2Suite.scala | 87 +- 2 files changed, 84 insertions(+), 5 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala index b4120d9..d2830a9 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala @@ -45,7 +45,7 @@ trait V2WriteCommand extends Command { case (inAttr, outAttr) => // names and types must match, nullability must be compatible inAttr.name == outAttr.name && - DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, inAttr.dataType) && + DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, outAttr.dataType) && (outAttr.nullable || !inAttr.nullable) } } diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala index 508eefa..ff5c624 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala @@ -23,16 +23,15 @@ import scala.collection.JavaConverters._ import org.scalatest.BeforeAndAfter -import org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, NoSuchTableException, TableAlreadyExistsException} -import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, OverwriteByExpression, OverwritePartitionsDynamic} +import org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, NamedRelation, NoSuchTableException, TableAlreadyExistsException} 
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, OverwriteByExpression, OverwritePartitionsDynamic, V2WriteCommand} import org.apache.spark.sql.connector.{InMemoryTable, InMemoryTableCatalog} import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog} import org.apache.spark.sql.connector.expressions.{BucketTransform, DaysTransform, FieldReference, HoursTransform, IdentityTransform, LiteralValue, MonthsTransform, YearsTransform} import org.apache.spark.sql.execution.QueryExecution import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation import org.apache.spark.sql.test.SharedSparkSession -import org.apache.spark.sql.types.{IntegerType, LongType, StringType, StructType} -import org.apache.spark.sql.types.TimestampType +import org.apache.spark.sql.types.{ArrayType, DataType, IntegerType, LongType, MapType, StringType, StructField, StructType, TimestampType} import org.apache.spark.sql.util.QueryExecutionListener import org.apache.spark.unsafe.types.UTF8String import org.apache.spark.util.Utils @@ -101,6 +100,86 @@ class DataFrameWriterV2Suite extends QueryTest with SharedSparkSessio
[spark] branch master updated (d8c4a47 -> 8e5cb1d)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from d8c4a47 [SPARK-33061][SQL] Expose inverse hyperbolic trig functions through sql.functions API add 8e5cb1d [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved No new revisions were added by this update. Summary of changes: .../sql/catalyst/plans/logical/v2Commands.scala| 2 +- .../apache/spark/sql/DataFrameWriterV2Suite.scala | 87 +- 2 files changed, 84 insertions(+), 5 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 2ebea13 [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved 2ebea13 is described below commit 2ebea135a1f4e4cb3187ffc8e77d3f52d3b3a91a Author: Jungtaek Lim (HeartSaVioR) AuthorDate: Wed Oct 14 08:30:03 2020 -0700 [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved ### What changes were proposed in this pull request? This PR proposes to fix a bug on calling `DataType.equalsIgnoreCompatibleNullability` with mistakenly swapped parameters in `V2WriteCommand.outputResolved`. The order of parameters for `DataType.equalsIgnoreCompatibleNullability` are `from` and `to`, which says that the right order of matching variables are `inAttr` and `outAttr`. ### Why are the changes needed? Spark throws AnalysisException due to unresolved operator in v2 write, while the operator is unresolved due to a bug that parameters to call `DataType.equalsIgnoreCompatibleNullability` in `outputResolved` have been swapped. ### Does this PR introduce _any_ user-facing change? Yes, end users no longer suffer on unresolved operator in v2 write if they're trying to write dataframe containing non-nullable complex types against table matching complex types as nullable. ### How was this patch tested? New UT added. Closes #30033 from HeartSaVioR/SPARK-33136. Authored-by: Jungtaek Lim (HeartSaVioR) Signed-off-by: Dongjoon Hyun (cherry picked from commit 8e5cb1d276686ec428e4e6aa1c3cfd6bb99e4e9a) Signed-off-by: Dongjoon Hyun --- .../sql/catalyst/plans/logical/v2Commands.scala| 2 +- .../apache/spark/sql/DataFrameWriterV2Suite.scala | 87 +- 2 files changed, 84 insertions(+), 5 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala index b4120d9..d2830a9 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala @@ -45,7 +45,7 @@ trait V2WriteCommand extends Command { case (inAttr, outAttr) => // names and types must match, nullability must be compatible inAttr.name == outAttr.name && - DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, inAttr.dataType) && + DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, outAttr.dataType) && (outAttr.nullable || !inAttr.nullable) } } diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala index 508eefa..ff5c624 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala @@ -23,16 +23,15 @@ import scala.collection.JavaConverters._ import org.scalatest.BeforeAndAfter -import org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, NoSuchTableException, TableAlreadyExistsException} -import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, OverwriteByExpression, OverwritePartitionsDynamic} +import org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, NamedRelation, NoSuchTableException, TableAlreadyExistsException} 
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, OverwriteByExpression, OverwritePartitionsDynamic, V2WriteCommand} import org.apache.spark.sql.connector.{InMemoryTable, InMemoryTableCatalog} import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog} import org.apache.spark.sql.connector.expressions.{BucketTransform, DaysTransform, FieldReference, HoursTransform, IdentityTransform, LiteralValue, MonthsTransform, YearsTransform} import org.apache.spark.sql.execution.QueryExecution import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation import org.apache.spark.sql.test.SharedSparkSession -import org.apache.spark.sql.types.{IntegerType, LongType, StringType, StructType} -import org.apache.spark.sql.types.TimestampType +import org.apache.spark.sql.types.{ArrayType, DataType, IntegerType, LongType, MapType, StringType, StructField, StructType, TimestampType} import org.apache.spark.sql.util.QueryExecutionListener import org.apache.spark.unsafe.types.UTF8String import org.apache.spark.util.Utils @@ -101,6 +100,86 @@ class DataFrameWriterV2Suite extends QueryTest with SharedSparkSessio
[spark] branch master updated (d8c4a47 -> 8e5cb1d)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from d8c4a47 [SPARK-33061][SQL] Expose inverse hyperbolic trig functions through sql.functions API add 8e5cb1d [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved No new revisions were added by this update. Summary of changes: .../sql/catalyst/plans/logical/v2Commands.scala| 2 +- .../apache/spark/sql/DataFrameWriterV2Suite.scala | 87 +- 2 files changed, 84 insertions(+), 5 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 2ebea13  [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved
2ebea13 is described below

commit 2ebea135a1f4e4cb3187ffc8e77d3f52d3b3a91a
Author: Jungtaek Lim (HeartSaVioR)
AuthorDate: Wed Oct 14 08:30:03 2020 -0700

    [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

    ### What changes were proposed in this pull request?

    This PR fixes a bug in `V2WriteCommand.outputResolved`, where `DataType.equalsIgnoreCompatibleNullability` was called with its parameters swapped. The parameters of `DataType.equalsIgnoreCompatibleNullability` are `from` and `to`, so the correct argument order is `inAttr` (the incoming data) followed by `outAttr` (the target table column).

    ### Why are the changes needed?

    Spark throws an AnalysisException for an unresolved operator in a v2 write, and the operator is unresolved only because the arguments passed to `DataType.equalsIgnoreCompatibleNullability` in `outputResolved` were swapped.

    ### Does this PR introduce _any_ user-facing change?

    Yes. End users no longer hit the unresolved-operator error in a v2 write when writing a dataframe containing non-nullable complex types to a table whose matching complex types are nullable.

    ### How was this patch tested?

    New unit test added.

    Closes #30033 from HeartSaVioR/SPARK-33136.

    Authored-by: Jungtaek Lim (HeartSaVioR)
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 8e5cb1d276686ec428e4e6aa1c3cfd6bb99e4e9a)
    Signed-off-by: Dongjoon Hyun
---
 .../sql/catalyst/plans/logical/v2Commands.scala    |  2 +-
 .../apache/spark/sql/DataFrameWriterV2Suite.scala  | 87 +-
 2 files changed, 84 insertions(+), 5 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
index b4120d9..d2830a9 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
@@ -45,7 +45,7 @@ trait V2WriteCommand extends Command {
       case (inAttr, outAttr) =>
         // names and types must match, nullability must be compatible
         inAttr.name == outAttr.name &&
-          DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, inAttr.dataType) &&
+          DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, outAttr.dataType) &&
           (outAttr.nullable || !inAttr.nullable)
     }
   }

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
index 508eefa..ff5c624 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
@@ -23,16 +23,15 @@ import scala.collection.JavaConverters._

 import org.scalatest.BeforeAndAfter

-import org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, NoSuchTableException, TableAlreadyExistsException}
-import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, OverwriteByExpression, OverwritePartitionsDynamic}
+import org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, NamedRelation, NoSuchTableException, TableAlreadyExistsException}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, OverwriteByExpression, OverwritePartitionsDynamic, V2WriteCommand}
 import org.apache.spark.sql.connector.{InMemoryTable, InMemoryTableCatalog}
 import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog}
 import org.apache.spark.sql.connector.expressions.{BucketTransform, DaysTransform, FieldReference, HoursTransform, IdentityTransform, LiteralValue, MonthsTransform, YearsTransform}
 import org.apache.spark.sql.execution.QueryExecution
 import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation
 import org.apache.spark.sql.test.SharedSparkSession
-import org.apache.spark.sql.types.{IntegerType, LongType, StringType, StructType}
-import org.apache.spark.sql.types.TimestampType
+import org.apache.spark.sql.types.{ArrayType, DataType, IntegerType, LongType, MapType, StringType, StructField, StructType, TimestampType}
 import org.apache.spark.sql.util.QueryExecutionListener
 import org.apache.spark.unsafe.types.UTF8String
 import org.apache.spark.util.Utils
@@ -101,6 +100,86 @@ class DataFrameWriterV2Suite extends QueryTest with SharedSparkSession
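The one-line fix above hinges on the direction of the compatibility check. Below is a minimal, self-contained Scala sketch, not Spark's internal implementation, that mimics the from/to contract of `DataType.equalsIgnoreCompatibleNullability` for the `ArrayType` case and shows why the swapped argument order rejected a non-nullable write into a nullable column. The helper name `writeCompatible` is invented for the example; only `ArrayType`, `DataType`, and `LongType` come from the public Spark API.

// Simplified illustration of the from/to contract: writeCompatible(from, to)
// returns true when data of type `from` can be written into a column of type
// `to`, treating a nullable target as compatible with non-nullable input.
import org.apache.spark.sql.types.{ArrayType, DataType, LongType}

object NullabilityDirectionSketch {
  def writeCompatible(from: DataType, to: DataType): Boolean = (from, to) match {
    case (ArrayType(fromElem, fromContainsNull), ArrayType(toElem, toContainsNull)) =>
      // OK to write non-nullable elements into a nullable slot, never the reverse.
      (toContainsNull || !fromContainsNull) && writeCompatible(fromElem, toElem)
    case (f, t) => f == t
  }

  def main(args: Array[String]): Unit = {
    val in  = ArrayType(LongType, containsNull = false) // query output (inAttr)
    val out = ArrayType(LongType, containsNull = true)  // table column (outAttr)

    // Fixed order, from = inAttr, to = outAttr: the write resolves.
    println(writeCompatible(in, out)) // true
    // Buggy (swapped) order: the check fails, the operator stays unresolved,
    // and users saw an AnalysisException.
    println(writeCompatible(out, in)) // false
  }
}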
[spark] branch master updated (d8c4a47 -> 8e5cb1d)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from d8c4a47  [SPARK-33061][SQL] Expose inverse hyperbolic trig functions through sql.functions API
 add 8e5cb1d  [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/plans/logical/v2Commands.scala    |  2 +-
 .../apache/spark/sql/DataFrameWriterV2Suite.scala  | 87 +-
 2 files changed, 84 insertions(+), 5 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (05a62dc -> d8c4a47)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 05a62dc  [SPARK-33134][SQL] Return partial results only for root JSON objects
 add d8c4a47  [SPARK-33061][SQL] Expose inverse hyperbolic trig functions through sql.functions API

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/functions.scala     | 50 +-
 .../org/apache/spark/sql/MathFunctionsSuite.scala  | 15 +++
 2 files changed, 64 insertions(+), 1 deletion(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
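For reference, the change summarized above extends the Scala sql.functions API with inverse hyperbolic functions. The minimal usage sketch below assumes the function names asinh, acosh, and atanh (the names these landed under in Spark 3.1.0); the object name and the local SparkSession setup are just for the demo.

// Minimal usage sketch for the inverse hyperbolic functions added by
// SPARK-33061; asinh/acosh/atanh names assumed from the Spark 3.1.0 API.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{acosh, asinh, atanh, col}

object InverseHyperbolicDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("inverse-hyperbolic-demo")
      .getOrCreate()
    import spark.implicits._

    // asinh is defined on all reals; acosh needs x >= 1; atanh needs |x| < 1.
    // Out-of-domain inputs yield NaN or Infinity rather than raising an error.
    val df = Seq(0.5, 1.0, 2.0).toDF("x")
    df.select(
      col("x"),
      asinh(col("x")).as("asinh_x"),
      acosh(col("x")).as("acosh_x"),
      atanh(col("x")).as("atanh_x")
    ).show()

    spark.stop()
  }
}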