[jira] [Updated] (SPARK-6743) Join with empty projection on one side produces invalid results

2019-05-14 Thread Josh Rosen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Rosen updated SPARK-6743:
--
Labels: correctness  (was: )

> Join with empty projection on one side produces invalid results
> ---
>
> Key: SPARK-6743
> URL: https://issues.apache.org/jira/browse/SPARK-6743
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.3.0
>Reporter: Santiago M. Mola
>Assignee: Michael Armbrust
>Priority: Critical
>  Labels: correctness
> Fix For: 1.4.0
>
>
> {code:java}
> val sqlContext = new SQLContext(sc)
> val tab0 = sc.parallelize(Seq(
>   (83,0,38),
>   (26,0,79),
>   (43,81,24)
> ))
> sqlContext.registerDataFrameAsTable(sqlContext.createDataFrame(tab0), "tab0")
> sqlContext.cacheTable("tab0")
> val df1 = sqlContext.sql("SELECT tab0._2, cor0._2 FROM tab0, tab0 cor0 GROUP BY tab0._2, cor0._2")
> val result1 = df1.collect()
> val df2 = sqlContext.sql("SELECT cor0._2 FROM tab0, tab0 cor0 GROUP BY cor0._2")
> val result2 = df2.collect()
> val df3 = sqlContext.sql("SELECT cor0._2 FROM tab0 cor0 GROUP BY cor0._2")
> val result3 = df3.collect()
> {code}
> Given the code above, result2 is Row(43), Row(83), Row(26), which is wrong:
> these values come from cor0._1 instead of cor0._2. The correct results are
> Row(0), Row(81), which is what the third query returns. The first query also
> produces valid results; the only difference is that its projection on the
> left side of the join is not empty.
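
For reference, the expected results of the three queries can be checked without Spark. The following is a minimal sketch using plain Scala collections; the object and value names (ExpectedResults, expected1, expected2, expected3, buggy2) are illustrative and do not appear in the report, and a Set stands in for GROUP BY only because these queries select nothing but their grouping columns.

```scala
// Reference computation with plain Scala collections (no Spark required).
// All names in this object are illustrative assumptions, not Spark APIs.
object ExpectedResults {
  // Same three rows as tab0 in the report.
  val tab0: Seq[(Int, Int, Int)] = Seq((83, 0, 38), (26, 0, 79), (43, 81, 24))

  // Query 1: SELECT tab0._2, cor0._2 FROM tab0, tab0 cor0 GROUP BY tab0._2, cor0._2
  // Cross join, project _2 from both sides, then deduplicate.
  val expected1: Set[(Int, Int)] =
    (for { left <- tab0; cor0 <- tab0 } yield (left._2, cor0._2)).toSet

  // Query 2: SELECT cor0._2 FROM tab0, tab0 cor0 GROUP BY cor0._2
  // Cross join, project only the right side's _2, then deduplicate.
  val expected2: Set[Int] =
    (for { _ <- tab0; cor0 <- tab0 } yield cor0._2).toSet

  // Query 3: SELECT cor0._2 FROM tab0 cor0 GROUP BY cor0._2
  val expected3: Set[Int] = tab0.map(_._2).toSet

  // The values the report says query 2 returned: the distinct cor0._1 values.
  val buggy2: Set[Int] = tab0.map(_._1).toSet
}
```

Under these assumptions, expected2 and expected3 contain the same values, while buggy2 contains the values the report describes as wrong for the second query.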



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-6743) Join with empty projection on one side produces invalid results

2015-05-25 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-6743:
-
Assignee: Michael Armbrust

 Join with empty projection on one side produces invalid results
 ---

 Key: SPARK-6743
 URL: https://issues.apache.org/jira/browse/SPARK-6743
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0
Reporter: Santiago M. Mola
Assignee: Michael Armbrust
Priority: Critical
 Fix For: 1.4.0


 {code:java}
 val sqlContext = new SQLContext(sc)
 val tab0 = sc.parallelize(Seq(
   (83,0,38),
   (26,0,79),
   (43,81,24)
 ))
 sqlContext.registerDataFrameAsTable(sqlContext.createDataFrame(tab0), "tab0")
 sqlContext.cacheTable("tab0")
 val df1 = sqlContext.sql("SELECT tab0._2, cor0._2 FROM tab0, tab0 cor0 GROUP BY tab0._2, cor0._2")
 val result1 = df1.collect()
 val df2 = sqlContext.sql("SELECT cor0._2 FROM tab0, tab0 cor0 GROUP BY cor0._2")
 val result2 = df2.collect()
 val df3 = sqlContext.sql("SELECT cor0._2 FROM tab0 cor0 GROUP BY cor0._2")
 val result3 = df3.collect()
 {code}
 Given the code above, result2 is Row(43), Row(83), Row(26), which is wrong:
 these values come from cor0._1 instead of cor0._2. The correct results are
 Row(0), Row(81), which is what the third query returns. The first query also
 produces valid results; the only difference is that its projection on the
 left side of the join is not empty.






[jira] [Updated] (SPARK-6743) Join with empty projection on one side produces invalid results

2015-04-07 Thread Santiago M. Mola (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Santiago M. Mola updated SPARK-6743:

Priority: Critical  (was: Major)

 Join with empty projection on one side produces invalid results
 ---

 Key: SPARK-6743
 URL: https://issues.apache.org/jira/browse/SPARK-6743
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0
Reporter: Santiago M. Mola
Priority: Critical

 {code:java}
 val sqlContext = new SQLContext(sc)
 val tab0 = sc.parallelize(Seq(
   (83,0,38),
   (26,0,79),
   (43,81,24)
 ))
 sqlContext.registerDataFrameAsTable(sqlContext.createDataFrame(tab0), "tab0")
 sqlContext.cacheTable("tab0")
 val df1 = sqlContext.sql("SELECT tab0._2, cor0._2 FROM tab0, tab0 cor0 GROUP BY tab0._2, cor0._2")
 val result1 = df1.collect()
 val df2 = sqlContext.sql("SELECT cor0._2 FROM tab0, tab0 cor0 GROUP BY cor0._2")
 val result2 = df2.collect()
 val df3 = sqlContext.sql("SELECT cor0._2 FROM tab0 cor0 GROUP BY cor0._2")
 val result3 = df3.collect()
 {code}
 Given the code above, result2 is Row(43), Row(83), Row(26), which is wrong:
 these values come from cor0._1 instead of cor0._2. The correct results are
 Row(0), Row(81), which is what the third query returns. The first query also
 produces valid results; the only difference is that its projection on the
 left side of the join is not empty.


