[ https://issues.apache.org/jira/browse/SPARK-23390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-23390:
----------------------------------
    Summary: Flaky test: FileBasedDataSourceSuite  (was: Flaky test: FileBasedDataSourceSuite in Spark 2.3/hadoop 2.7)

> Flaky test: FileBasedDataSourceSuite
> ------------------------------------
>
>                 Key: SPARK-23390
>                 URL: https://issues.apache.org/jira/browse/SPARK-23390
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Sameer Agarwal
>            Assignee: Wenchen Fan
>            Priority: Major
>
> We're seeing multiple failures in {{FileBasedDataSourceSuite}} in
> {{spark-branch-2.3-test-sbt-hadoop-2.7}}:
> {code:java}
> org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to eventually never returned normally. Attempted 15 times over 10.012158059999999 seconds. Last failure message: There are 1 possibly leaked file streams..
> {code}
> Here's the full history:
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.7/189/testReport/org.apache.spark.sql/FileBasedDataSourceSuite/history/]
> From a very quick look, these failures seem to be correlated with
> [https://github.com/apache/spark/pull/20479] (cc [~dongjoon]), as is evident from
> the following stack trace (full logs
> [here|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.7/189/console]):
> {code:java}
> [info] - Enabling/disabling ignoreMissingFiles using orc (648 milliseconds)
> 15:55:58.673 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 61.0 (TID 85, localhost, executor driver): TaskKilled (Stage cancelled)
> 15:55:58.674 WARN org.apache.spark.DebugFilesystem: Leaked filesystem connection created at:
> java.lang.Throwable
>     at org.apache.spark.DebugFilesystem$.addOpenStream(DebugFilesystem.scala:36)
>     at org.apache.spark.DebugFilesystem.open(DebugFilesystem.scala:70)
>     at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
>     at org.apache.orc.impl.RecordReaderUtils$DefaultDataReader.open(RecordReaderUtils.java:173)
>     at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:254)
>     at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:633)
>     at org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initialize(OrcColumnarBatchReader.java:138)
> {code}
> Also, while this might be just a false correlation, the frequency of these
> test failures has increased considerably in
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.7/]
> after [https://github.com/apache/spark/pull/20562] (cc
> [~feng...@databricks.com]) was merged.
> The following is a Parquet stream leak.
> {code:java}
> Caused by: sbt.ForkMain$ForkError: java.lang.Throwable: null
>     at org.apache.spark.DebugFilesystem$.addOpenStream(DebugFilesystem.scala:36)
>     at org.apache.spark.DebugFilesystem.open(DebugFilesystem.scala:70)
>     at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
>     at org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:538)
>     at org.apache.spark.sql.execution.datasources.parquet.SpecificParquetRecordReaderBase.initialize(SpecificParquetRecordReaderBase.java:149)
>     at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initialize(VectorizedParquetRecordReader.java:133)
>     at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.apply(ParquetFileFormat.scala:400)
>     at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.apply(ParquetFileFormat.scala:356)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:125)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:179)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:106)
> {code}
> Recent occurrences:
> - [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.7/322/] (May 3rd)
> - [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.7/331/] (May 9th)
> - [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90536] (May 11th)
> - [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.7/342/] (May 16th)
> - [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.7/347/] (May 19th)
> - [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.7/367/] (June 2nd)
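
For readers unfamiliar with the "possibly leaked file streams" message quoted in the issue: the stack traces suggest that a Throwable is captured whenever a stream is opened, and the test then retries a "no open streams" assertion until a timeout. The following is only a minimal Java sketch of that general pattern, not Spark's DebugFilesystem code. The names StreamLeakTracker, track, clear and eventually are hypothetical, and the retry loop is a simplified stand-in for ScalaTest's eventually.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; this is not Spark's DebugFilesystem.
final class StreamLeakTracker {
    // Each open stream maps to a Throwable captured when it was opened,
    // so a leak report can show where the stream was created.
    private static final Map<InputStream, Throwable> OPEN = new ConcurrentHashMap<>();

    // Wraps a stream so that closing it removes it from the registry.
    static InputStream track(InputStream raw) {
        InputStream tracked = new FilterInputStream(raw) {
            @Override
            public void close() throws IOException {
                OPEN.remove(this);
                super.close();
            }
        };
        OPEN.put(tracked, new Throwable("stream opened here"));
        return tracked;
    }

    static int openCount() {
        return OPEN.size();
    }

    static void clear() {
        OPEN.clear();
    }

    // Throws if any tracked stream has not been closed yet.
    static void assertNoOpenStreams() {
        int n = OPEN.size();
        if (n > 0) {
            throw new IllegalStateException("There are " + n + " possibly leaked file streams.");
        }
    }

    // Retries the check until it passes or the timeout elapses, then
    // rethrows the last failure. A task that is still shutting down gets
    // time to close its stream before the test is declared failed.
    static void eventually(long timeoutMillis, long intervalMillis, Runnable check) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            try {
                check.run();
                return;
            } catch (RuntimeException e) {
                if (System.nanoTime() >= deadline) {
                    throw e;
                }
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = track(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        try {
            // The stream is never closed during the wait, so this times out.
            eventually(100, 10, StreamLeakTracker::assertNoOpenStreams);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        in.close();
        eventually(100, 10, StreamLeakTracker::assertNoOpenStreams);
        System.out.println("no leaked streams");
    }
}
```

In this sketch, a reader that is abandoned without close() (for example, when its task is killed mid-read) stays in the registry, and the retried assertion fails once the timeout expires, which is the shape of the failure reported in the issue.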