spark git commit: [SPARK-16546][SQL][PYSPARK] update python dataframe.drop

rxin Thu, 14 Jul 2016 22:56:27 -0700

Repository: spark
Updated Branches:
  refs/heads/master 2e4075e2e -> 183242382



[SPARK-16546][SQL][PYSPARK] update python dataframe.drop

## What changes were proposed in this pull request?

Make `dataframe.drop` API in python support multi-columns parameters,
so that it is the same with scala API.

## How was this patch tested?

The doc test.

Author: WeichenXu <weichenxu...@outlook.com>

Closes #14203 from WeichenXu123/drop_python_api.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/18324238
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/18324238
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/18324238

Branch: refs/heads/master
Commit: 1832423827fd518853b63f91c321e4568a39107d
Parents: 2e4075e
Author: WeichenXu <weichenxu...@outlook.com>
Authored: Thu Jul 14 22:55:49 2016 -0700
Committer: Reynold Xin <r...@databricks.com>
Committed: Thu Jul 14 22:55:49 2016 -0700

----------------------------------------------------------------------
 python/pyspark/sql/dataframe.py | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/18324238/python/pyspark/sql/dataframe.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index ab41e88..adf549d 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -1399,11 +1399,11 @@ class DataFrame(object):
 
     @since(1.4)
     @ignore_unicode_prefix
-    def drop(self, col):
+    def drop(self, *cols):
         """Returns a new :class:`DataFrame` that drops the specified column.
 
-        :param col: a string name of the column to drop, or a
-            :class:`Column` to drop.
+        :param cols: a string name of the column to drop, or a
+            :class:`Column` to drop, or a list of string name of the columns 
to drop.
 
         >>> df.drop('age').collect()
         [Row(name=u'Alice'), Row(name=u'Bob')]
@@ -1416,13 +1416,24 @@ class DataFrame(object):
 
         >>> df.join(df2, df.name == df2.name, 'inner').drop(df2.name).collect()
         [Row(age=5, name=u'Bob', height=85)]
+
+        >>> df.join(df2, 'name', 'inner').drop('age', 'height').collect()
+        [Row(name=u'Bob')]
         """
-        if isinstance(col, basestring):
-            jdf = self._jdf.drop(col)
-        elif isinstance(col, Column):
-            jdf = self._jdf.drop(col._jc)
+        if len(cols) == 1:
+            col = cols[0]
+            if isinstance(col, basestring):
+                jdf = self._jdf.drop(col)
+            elif isinstance(col, Column):
+                jdf = self._jdf.drop(col._jc)
+            else:
+                raise TypeError("col should be a string or a Column")
         else:
-            raise TypeError("col should be a string or a Column")
+            for col in cols:
+                if not isinstance(col, basestring):
+                    raise TypeError("each col in the param list should be a 
string")
+            jdf = self._jdf.drop(self._jseq(cols))
+
         return DataFrame(jdf, self.sql_ctx)
 
     @ignore_unicode_prefix


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

spark git commit: [SPARK-16546][SQL][PYSPARK] update python dataframe.drop

Reply via email to