[jira] [Updated] (SPARK-6743) Join with empty projection on one side produces invalid results
[ https://issues.apache.org/jira/browse/SPARK-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Rosen updated SPARK-6743:
------------------------------
    Labels: correctness  (was: )

> Join with empty projection on one side produces invalid results
> ---------------------------------------------------------------
>
>                 Key: SPARK-6743
>                 URL: https://issues.apache.org/jira/browse/SPARK-6743
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0
>            Reporter: Santiago M. Mola
>            Assignee: Michael Armbrust
>            Priority: Critical
>              Labels: correctness
>             Fix For: 1.4.0
>
>
> {code:java}
> import org.apache.spark.sql.SQLContext
>
> // sc is the SparkContext provided by the shell.
> val sqlContext = new SQLContext(sc)
> val tab0 = sc.parallelize(Seq(
>   (83,0,38),
>   (26,0,79),
>   (43,81,24)
> ))
> sqlContext.registerDataFrameAsTable(sqlContext.createDataFrame(tab0), "tab0")
> sqlContext.cacheTable("tab0")
> // Query 1: both sides of the join contribute a column (correct results).
> val df1 = sqlContext.sql("SELECT tab0._2, cor0._2 FROM tab0, tab0 cor0 GROUP BY tab0._2, cor0._2")
> val result1 = df1.collect()
> // Query 2: nothing is projected from the left side (wrong results).
> val df2 = sqlContext.sql("SELECT cor0._2 FROM tab0, tab0 cor0 GROUP BY cor0._2")
> val result2 = df2.collect()
> // Query 3: no join at all (correct results).
> val df3 = sqlContext.sql("SELECT cor0._2 FROM tab0 cor0 GROUP BY cor0._2")
> val result3 = df3.collect()
> {code}
> Given the previous code, result2 equals Row(43), Row(83), Row(26), which
> is wrong. These results correspond to cor0._1 instead of cor0._2. The correct
> results would be Row(0), Row(81), which is what the third query returns. The
> first query also produces valid results; the only difference is that its
> projection on the left side of the join is not empty.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
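The expected values quoted in the report can be checked without Spark. The following is a minimal sketch using plain Scala collections; the object name `ExpectedResults` and the value names `expected2`, `expected3`, and `observed2` are hypothetical and are not part of the original report or of any Spark API. It only illustrates that a cross join does not change the set of distinct values of `cor0._2`, so queries 2 and 3 should agree.

```scala
object ExpectedResults {
  // Same three rows as tab0 in the report.
  val tab0: Seq[(Int, Int, Int)] = Seq((83, 0, 38), (26, 0, 79), (43, 81, 24))

  // Query 2: cross join of tab0 with itself (right side aliased cor0),
  // then the distinct values of cor0._2 (what GROUP BY cor0._2 should yield).
  val crossJoin: Seq[((Int, Int, Int), (Int, Int, Int))] =
    for { left <- tab0; cor0 <- tab0 } yield (left, cor0)
  val expected2: Seq[Int] = crossJoin.map { case (_, cor0) => cor0._2 }.distinct.sorted

  // Query 3: no join, distinct values of cor0._2.
  val expected3: Seq[Int] = tab0.map(_._2).distinct.sorted

  // The values the report says query 2 actually returned match column _1.
  val observed2: Seq[Int] = tab0.map(_._1).distinct.sorted
}
```

Here `expected2` and `expected3` both contain 0 and 81, while `observed2` contains 26, 43, and 83, which matches the wrong rows described in the report.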
[jira] [Updated] (SPARK-6743) Join with empty projection on one side produces invalid results
[ https://issues.apache.org/jira/browse/SPARK-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-6743:
-----------------------------
    Assignee: Michael Armbrust

Join with empty projection on one side produces invalid results
---------------------------------------------------------------

                Key: SPARK-6743
                URL: https://issues.apache.org/jira/browse/SPARK-6743
            Project: Spark
         Issue Type: Bug
         Components: SQL
   Affects Versions: 1.3.0
           Reporter: Santiago M. Mola
           Assignee: Michael Armbrust
           Priority: Critical
            Fix For: 1.4.0


{code:java}
import org.apache.spark.sql.SQLContext

// sc is the SparkContext provided by the shell.
val sqlContext = new SQLContext(sc)
val tab0 = sc.parallelize(Seq(
  (83,0,38),
  (26,0,79),
  (43,81,24)
))
sqlContext.registerDataFrameAsTable(sqlContext.createDataFrame(tab0), "tab0")
sqlContext.cacheTable("tab0")
// Query 1: both sides of the join contribute a column (correct results).
val df1 = sqlContext.sql("SELECT tab0._2, cor0._2 FROM tab0, tab0 cor0 GROUP BY tab0._2, cor0._2")
val result1 = df1.collect()
// Query 2: nothing is projected from the left side (wrong results).
val df2 = sqlContext.sql("SELECT cor0._2 FROM tab0, tab0 cor0 GROUP BY cor0._2")
val result2 = df2.collect()
// Query 3: no join at all (correct results).
val df3 = sqlContext.sql("SELECT cor0._2 FROM tab0 cor0 GROUP BY cor0._2")
val result3 = df3.collect()
{code}
Given the previous code, result2 equals Row(43), Row(83), Row(26), which
is wrong. These results correspond to cor0._1 instead of cor0._2. The correct
results would be Row(0), Row(81), which is what the third query returns. The
first query also produces valid results; the only difference is that its
projection on the left side of the join is not empty.
[jira] [Updated] (SPARK-6743) Join with empty projection on one side produces invalid results
[ https://issues.apache.org/jira/browse/SPARK-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Santiago M. Mola updated SPARK-6743:
------------------------------------
    Priority: Critical  (was: Major)

Join with empty projection on one side produces invalid results
---------------------------------------------------------------

                Key: SPARK-6743
                URL: https://issues.apache.org/jira/browse/SPARK-6743
            Project: Spark
         Issue Type: Bug
         Components: SQL
   Affects Versions: 1.3.0
           Reporter: Santiago M. Mola
           Priority: Critical


{code:java}
import org.apache.spark.sql.SQLContext

// sc is the SparkContext provided by the shell.
val sqlContext = new SQLContext(sc)
val tab0 = sc.parallelize(Seq(
  (83,0,38),
  (26,0,79),
  (43,81,24)
))
sqlContext.registerDataFrameAsTable(sqlContext.createDataFrame(tab0), "tab0")
sqlContext.cacheTable("tab0")
// Query 1: both sides of the join contribute a column (correct results).
val df1 = sqlContext.sql("SELECT tab0._2, cor0._2 FROM tab0, tab0 cor0 GROUP BY tab0._2, cor0._2")
val result1 = df1.collect()
// Query 2: nothing is projected from the left side (wrong results).
val df2 = sqlContext.sql("SELECT cor0._2 FROM tab0, tab0 cor0 GROUP BY cor0._2")
val result2 = df2.collect()
// Query 3: no join at all (correct results).
val df3 = sqlContext.sql("SELECT cor0._2 FROM tab0 cor0 GROUP BY cor0._2")
val result3 = df3.collect()
{code}
Given the previous code, result2 equals Row(43), Row(83), Row(26), which
is wrong. These results correspond to cor0._1 instead of cor0._2. The correct
results would be Row(0), Row(81), which is what the third query returns. The
first query also produces valid results; the only difference is that its
projection on the left side of the join is not empty.