[jira] [Commented] (SPARK-15888) Python UDF over aggregate fails

2016-06-15 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331266#comment-15331266
 ] 

Apache Spark commented on SPARK-15888:
--

User 'davies' has created a pull request for this issue:
https://github.com/apache/spark/pull/13682

> Python UDF over aggregate fails
> ---
>
> Key: SPARK-15888
> URL: https://issues.apache.org/jira/browse/SPARK-15888
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 2.0.0
> Reporter: Vladimir Feinberg
> Assignee: Davies Liu
> Priority: Blocker
>
> This looks like a regression from 1.6.1.
> The following notebook runs without error in a Spark 1.6.1 cluster, but fails 
> in 2.0.0:
> https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/6001574963454425/3194562079278586/1653464426712019/latest.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-15888) Python UDF over aggregate fails

2016-06-10 Thread Davies Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15325733#comment-15325733
 ] 

Davies Liu commented on SPARK-15888:


After some investigation, it turns out that a Python UDF applied over an 
aggregate function cannot be extracted and inserted BEFORE the aggregate; 
it should be inserted AFTER the aggregate.

Because a logical aggregate becomes multiple physical aggregates, it may be 
better to add another rule for the logical plan (and keep the current rule for 
the physical plan).
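
To make the ordering concrete, here is a minimal pure-Python sketch. It is not Spark code and not the actual planner rule; `python_udf`, `aggregate_sum`, and `udf_after_aggregate` are hypothetical names invented for illustration. It only shows why a UDF whose argument is an aggregate result has to run after the aggregate: the value it consumes does not exist until the grouping is done.

```python
from collections import defaultdict


def python_udf(total):
    # Hypothetical stand-in for a user-defined Python function.
    return total * 2


def aggregate_sum(rows):
    """Group (key, value) pairs by key and sum the values per key."""
    sums = defaultdict(int)
    for key, value in rows:
        sums[key] += value
    return dict(sums)


def udf_after_aggregate(rows):
    """Evaluate the UDF on each aggregate result, i.e. AFTER the aggregate.

    The UDF's input is the per-group sum, which is only available once
    aggregate_sum has finished, so it cannot be evaluated on the raw rows
    before the aggregate.
    """
    aggregated = aggregate_sum(rows)
    return {key: python_udf(total) for key, total in aggregated.items()}


rows = [("a", 1), ("a", 2), ("b", 5)]
result = udf_after_aggregate(rows)
print(result)
```

Running the UDF on each input row first and then summing would compute a different expression (a sum of UDF outputs), which is why the evaluation step belongs after the aggregate in this toy model.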



