[jira] [Commented] (SPARK-12989) Bad interaction between StarExpansion and ExtractWindowExpressions

2016-01-26 Thread Denton Cockburn (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117220#comment-15117220
 ] 

Denton Cockburn commented on SPARK-12989:
-

Note that the query works if the select uses $"Data.*" instead of $"*":

{code}
data.select($"Data.*", max("num").over(winSpec) as "max").explain(true)
{code}

> Bad interaction between StarExpansion and ExtractWindowExpressions
> --
>
> Key: SPARK-12989
> URL: https://issues.apache.org/jira/browse/SPARK-12989
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.0
> Reporter: Michael Armbrust
>
> Reported initially here: 
> http://stackoverflow.com/questions/34995376/apache-spark-window-function-with-nested-column
> {code}
> import sqlContext.implicits._
> import org.apache.spark.sql.functions._
> import org.apache.spark.sql.expressions.Window
> sql("SET spark.sql.eagerAnalysis=false") // Let us see the error even though we are constructing an invalid tree
> val data = Seq(("a", "b", "c", 3), ("c", "b", "a", 3)).toDF("A", "B", "C", "num")
>   .withColumn("Data", struct("A", "B", "C"))
>   .drop("A")
>   .drop("B")
>   .drop("C")
> val winSpec = Window.partitionBy("Data.A", "Data.B").orderBy($"num".desc)
> data.select($"*", max("num").over(winSpec) as "max").explain(true)
> {code}
> When you run this, the analyzer inserts invalid columns (A#64932 and B#64933,
> which the child projection does not produce) into the projection marked with
> "!" below:
> {code}
> == Parsed Logical Plan ==
> 'Project [*,'max('num) windowspecdefinition('Data.A,'Data.B,'num DESC,UnspecifiedFrame) AS max#64928]
> +- Project [num#64926,Data#64927]
>    +- Project [C#64925,num#64926,Data#64927]
>       +- Project [B#64924,C#64925,num#64926,Data#64927]
>          +- Project [A#64923,B#64924,C#64925,num#64926,struct(A#64923,B#64924,C#64925) AS Data#64927]
>             +- Project [_1#64919 AS A#64923,_2#64920 AS B#64924,_3#64921 AS C#64925,_4#64922 AS num#64926]
>                +- LocalRelation [_1#64919,_2#64920,_3#64921,_4#64922], [[a,b,c,3],[c,b,a,3]]
> == Analyzed Logical Plan ==
> num: int, Data: struct, max: int
> Project [num#64926,Data#64927,max#64928]
> +- Project [num#64926,Data#64927,A#64932,B#64933,max#64928,max#64928]
>    +- Window [num#64926,Data#64927,A#64932,B#64933], [HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMax(num#64926) windowspecdefinition(A#64932,B#64933,num#64926 DESC,RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS max#64928], [A#64932,B#64933], [num#64926 DESC]
>       +- !Project [num#64926,Data#64927,A#64932,B#64933]
>          +- Project [num#64926,Data#64927]
>             +- Project [C#64925,num#64926,Data#64927]
>                +- Project [B#64924,C#64925,num#64926,Data#64927]
>                   +- Project [A#64923,B#64924,C#64925,num#64926,struct(A#64923,B#64924,C#64925) AS Data#64927]
>                      +- Project [_1#64919 AS A#64923,_2#64920 AS B#64924,_3#64921 AS C#64925,_4#64922 AS num#64926]
>                         +- LocalRelation [_1#64919,_2#64920,_3#64921,_4#64922], [[a,b,c,3],[c,b,a,3]]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12989) Bad interaction between StarExpansion and ExtractWindowExpressions

2016-01-26 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15118708#comment-15118708
 ] 

Xiao Li commented on SPARK-12989:
-

I will try to fix this problem tomorrow. Thanks!




[jira] [Commented] (SPARK-12989) Bad interaction between StarExpansion and ExtractWindowExpressions

2016-01-27 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120677#comment-15120677
 ] 

Apache Spark commented on SPARK-12989:
--

User 'gatorsmile' has created a pull request for this issue:
https://github.com/apache/spark/pull/10963
