Repository: spark
Updated Branches:
  refs/heads/master 84b809445 -> 36282f78b


[SPARK-12184][PYTHON] Make python api doc for pivot consistent with scala doc

In SPARK-11946 the pivot API was changed slightly and its doc was updated, but the
doc changes were not made for the Python API. This PR updates the Python doc to be
consistent.

Author: Andrew Ray <ray.and...@gmail.com>

Closes #10176 from aray/sql-pivot-python-doc.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/36282f78
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/36282f78
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/36282f78

Branch: refs/heads/master
Commit: 36282f78b888743066843727426c6d806231aa97
Parents: 84b8094
Author: Andrew Ray <ray.and...@gmail.com>
Authored: Mon Dec 7 15:01:00 2015 -0800
Committer: Yin Huai <yh...@databricks.com>
Committed: Mon Dec 7 15:01:00 2015 -0800

----------------------------------------------------------------------
 python/pyspark/sql/group.py | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/36282f78/python/pyspark/sql/group.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/group.py b/python/pyspark/sql/group.py
index 1911588..9ca303a 100644
--- a/python/pyspark/sql/group.py
+++ b/python/pyspark/sql/group.py
@@ -169,16 +169,20 @@ class GroupedData(object):
 
     @since(1.6)
     def pivot(self, pivot_col, values=None):
-        """Pivots a column of the current DataFrame and perform the specified aggregation.
+        """
+        Pivots a column of the current [[DataFrame]] and perform the specified aggregation.
+        There are two versions of pivot function: one that requires the caller to specify the list
+        of distinct values to pivot on, and one that does not. The latter is more concise but less
+        efficient, because Spark needs to first compute the list of distinct values internally.
 
-        :param pivot_col: Column to pivot
-        :param values: Optional list of values of pivot column that will be translated to columns in
-            the output DataFrame. If values are not provided the method will do an immediate call
-            to .distinct() on the pivot column.
+        :param pivot_col: Name of the column to pivot.
+        :param values: List of values that will be translated to columns in the output DataFrame.
 
+        // Compute the sum of earnings for each year by course with each course as a separate column
         >>> df4.groupBy("year").pivot("course", ["dotNET", "Java"]).sum("earnings").collect()
         [Row(year=2012, dotNET=15000, Java=20000), Row(year=2013, dotNET=48000, Java=30000)]
 
+        // Or without specifying column values (less efficient)
         >>> df4.groupBy("year").pivot("course").sum("earnings").collect()
         [Row(year=2012, Java=30000, dotNET=48000)] becomes:
         [Row(year=2012, Java=20000, dotNET=15000), Row(year=2013, Java=30000, dotNET=48000)]
         """
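For readers unfamiliar with pivot semantics, the transformation the docstring describes can be illustrated without Spark. The sketch below is a minimal pure-Python analogue of `groupBy(...).pivot(...).sum(...)`; the `pivot_sum` helper and the sample data are illustrative, not part of the Spark API, and the rows are chosen to reproduce the numbers in the doctest above.

```python
from collections import defaultdict

# Sample rows mirroring df4 in the doctest: (course, year, earnings).
rows = [
    {"course": "dotNET", "year": 2012, "earnings": 10000},
    {"course": "Java",   "year": 2012, "earnings": 20000},
    {"course": "dotNET", "year": 2012, "earnings": 5000},
    {"course": "dotNET", "year": 2013, "earnings": 48000},
    {"course": "Java",   "year": 2013, "earnings": 30000},
]

def pivot_sum(rows, group_col, pivot_col, values, agg_col):
    """Group by group_col, turn each distinct pivot_col value into a
    column, and sum agg_col -- the shape GroupedData.pivot produces."""
    table = defaultdict(lambda: {v: 0 for v in values})
    for r in rows:
        table[r[group_col]][r[pivot_col]] += r[agg_col]
    return {k: dict(v) for k, v in table.items()}

# When values is omitted, Spark first computes the distinct pivot
# values itself -- the extra pass that makes that form less efficient.
distinct = sorted({r["course"] for r in rows})
print(pivot_sum(rows, "year", "course", distinct, "earnings"))
# {2012: {'Java': 20000, 'dotNET': 15000}, 2013: {'Java': 30000, 'dotNET': 48000}}
```

Passing the value list up front, as in `pivot("course", ["dotNET", "Java"])`, skips that distinct-computation pass, which is why the doc calls the explicit form more efficient.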

