[ https://issues.apache.org/jira/browse/SPARK-19628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jork Zijlstra updated SPARK-19628:
----------------------------------
    Attachment: spark2.0.1.png
                spark2.1.0.png

> Duplicate Spark jobs in 2.1.0
> -----------------------------
>
>                 Key: SPARK-19628
>                 URL: https://issues.apache.org/jira/browse/SPARK-19628
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>            Reporter: Jork Zijlstra
>             Fix For: 2.0.1
>
>         Attachments: spark2.0.1.png, spark2.1.0.png
>
>
> After upgrading to Spark 2.1.0 we noticed that duplicate jobs are executed. Going back to Spark 2.0.1, they are gone again.
> {code}
> import org.apache.spark.sql._
>
> object DoubleJobs {
>   def main(args: Array[String]) {
>
>     System.setProperty("hadoop.home.dir", "/tmp")
>
>     val sparkSession: SparkSession = SparkSession.builder
>       .master("local[4]")
>       .appName("spark session example")
>       .config("spark.driver.maxResultSize", "6G")
>       .config("spark.sql.orc.filterPushdown", true)
>       .config("spark.sql.hive.metastorePartitionPruning", true)
>       .getOrCreate()
>
>     sparkSession.sqlContext.setConf("spark.sql.orc.filterPushdown", "true")
>
>     val paths = Seq(
>       "" // some orc source
>     )
>
>     def dataFrame(path: String): DataFrame = {
>       sparkSession.read.orc(path)
>     }
>
>     // Each show() should trigger a single read job; on 2.1.0 duplicate jobs appear
>     paths.foreach(path => {
>       dataFrame(path).show(20)
>     })
>   }
> }
> {code}
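> Besides inspecting the Spark UI (see the attached screenshots), one way to confirm how many jobs a single show() triggers is to attach a SparkListener and count job starts. The sketch below is illustrative only and not part of the original report; the object name CountJobs, the jobCount counter, and the sleep before reading the counter are assumptions, and the ORC path is left elided as in the reproduction above.
> {code}
> import java.util.concurrent.atomic.AtomicInteger
>
> import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}
> import org.apache.spark.sql.SparkSession
>
> object CountJobs {
>   def main(args: Array[String]): Unit = {
>     val spark = SparkSession.builder
>       .master("local[4]")
>       .appName("count jobs per action")
>       .getOrCreate()
>
>     // Count every job the scheduler starts. Per the report, the same action
>     // that ran once on 2.0.1 shows up as duplicate jobs on 2.1.0.
>     val jobCount = new AtomicInteger(0)
>     spark.sparkContext.addSparkListener(new SparkListener {
>       override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
>         jobCount.incrementAndGet()
>       }
>     })
>
>     spark.read.orc("" /* some orc source */).show(20)
>
>     // The listener bus is asynchronous; give it a moment before reading the counter.
>     Thread.sleep(2000)
>     println(s"jobs started for one show(): ${jobCount.get()}")
>
>     spark.stop()
>   }
> }
> {code}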