[ https://issues.apache.org/jira/browse/SPARK-16237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladimir Feinberg updated SPARK-16237:
--------------------------------------
    Description:
To maintain feature parity, `gapply` functionality should be added to PySpark's {{GroupedData}} with an interface.

The implementation already exists because it fulfilled a need in another package:
https://github.com/vlad17/spark-sklearn/blob/master/python/spark_sklearn/group_apply.py

It needs to be migrated (to become a {{GroupedData}} method, the first argument now to be called self).

  was:
To maintain feature parity, `gapply` functionality should be added to `pyspark`'s `GroupedData` with an interface.

The implementation already exists because it fulfilled a need in another package:
https://github.com/vlad17/spark-sklearn/blob/master/python/spark_sklearn/group_apply.py

It needs to be migrated (to become a GroupedData method, the first argument now to be called self).

> PySpark gapply
> --------------
>
>                 Key: SPARK-16237
>                 URL: https://issues.apache.org/jira/browse/SPARK-16237
>             Project: Spark
>          Issue Type: New Feature
>          Components: PySpark, SQL
>            Reporter: Vladimir Feinberg
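
For readers unfamiliar with the group-apply pattern the issue refers to, here is a minimal plain-Python sketch of the general idea: partition rows by key, run a user function on each group, and concatenate the per-group outputs. This is not the spark-sklearn implementation and not a PySpark API; `gapply_sketch`, `mean_value`, and the sample rows are hypothetical names and data chosen only for illustration, and the real function operates on Spark's grouped data rather than on Python lists.

```python
from collections import defaultdict


def gapply_sketch(rows, key_cols, func):
    """Group `rows` (a list of dicts) by `key_cols` and apply `func` per group.

    `func` receives the key tuple and the list of rows in that group, and
    returns a list of output rows. Outputs of all groups are concatenated.
    Hypothetical helper for illustration only, not a Spark API.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        groups[key].append(row)

    out = []
    # Keys are sorted only so that this sketch produces a deterministic order.
    for key in sorted(groups):
        out.extend(func(key, groups[key]))
    return out


def mean_value(key, group):
    # One output row per group: the grouping key plus the mean of column "v".
    total = sum(r["v"] for r in group)
    return [{"k": key[0], "mean_v": total / len(group)}]


rows = [
    {"k": "a", "v": 1.0},
    {"k": "b", "v": 4.0},
    {"k": "a", "v": 3.0},
]
result = gapply_sketch(rows, ["k"], mean_value)
```

In this sketch the user function sees a whole group at once, which is what distinguishes group-apply from a row-wise map or a built-in aggregate.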