Repository: spark
Updated Branches:
  refs/heads/master 457850e6f -> 9c4405e8e


[SPARK-19453][PYTHON][SQL][DOC] Correct and extend DataFrame.replace docstring

## What changes were proposed in this pull request?

- Provides a correct description of the semantics of a `dict` argument passed as `to_replace`.
- Describes type requirements for collection arguments.
- Describes behavior with `to_replace: List[T]` and `value: T`
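
The pairing rules listed above can be sketched in plain Python. This is only an illustrative model of what the updated docstring describes, not the code in `dataframe.py`: the helper name `build_replacement_map` is hypothetical, and the error handling is an assumption made for the sake of the example.

```python
def build_replacement_map(to_replace, value=None):
    """Hypothetical helper modelling how `to_replace` and `value` pair up.

    It mirrors the docstring's description only; it is not Spark's implementation.
    """
    if isinstance(to_replace, dict):
        # dict: keys are the values to replace, dict values are the
        # replacements, and `value` is ignored.
        return dict(to_replace)

    if isinstance(to_replace, (list, tuple)):
        if isinstance(value, (list, tuple)):
            # list + list: paired element-wise, so the lengths must match.
            if len(to_replace) != len(value):
                raise ValueError("to_replace and value must have the same length")
            return dict(zip(to_replace, value))
        # list + scalar: the scalar replaces every item in `to_replace`.
        return {item: value for item in to_replace}

    # scalar + scalar: a single replacement (assumed rule for this sketch).
    return {to_replace: value}


# The three documented cases.
dict_case = build_replacement_map({"Alice": "A", "Bob": "B"}, "ignored")
list_case = build_replacement_map([1, 2], [10, 20])
scalar_case = build_replacement_map(["a", "b"], "x")

# 42 and 42.0 compare equal and hash equally in Python, so a dict literal
# with both keys keeps only one entry. This is one way such a conflict
# can already arise before anything reaches Spark.
conflicting = {42: -1, 42.0: 1}
```

In this model, a `dict` argument is used as-is, which is why the docstring notes that `value` is ignored in that case.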

## How was this patch tested?

Manual testing, documentation build.

Author: zero323 <zero...@users.noreply.github.com>

Closes #16792 from zero323/SPARK-19453.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9c4405e8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9c4405e8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9c4405e8

Branch: refs/heads/master
Commit: 9c4405e8e801cbab3a5c78c9f4334775925dfcc4
Parents: 457850e
Author: zero323 <zero...@users.noreply.github.com>
Authored: Tue Feb 14 09:42:24 2017 -0800
Committer: Holden Karau <hol...@us.ibm.com>
Committed: Tue Feb 14 09:42:24 2017 -0800

----------------------------------------------------------------------
 python/pyspark/sql/dataframe.py | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/9c4405e8/python/pyspark/sql/dataframe.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index 50373b8..188808b 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -1271,16 +1271,22 @@ class DataFrame(object):
         """Returns a new :class:`DataFrame` replacing a value with another value.
         :func:`DataFrame.replace` and :func:`DataFrameNaFunctions.replace` are
         aliases of each other.
-
-        :param to_replace: int, long, float, string, or list.
+        Values to_replace and value should contain either all numerics, all booleans,
+        or all strings. When replacing, the new value will be cast
+        to the type of the existing column.
+        For numeric replacements all values to be replaced should have unique
+        floating point representation. In case of conflicts (for example with `{42: -1, 42.0: 1}`)
+        and arbitrary replacement will be used.
+
+        :param to_replace: bool, int, long, float, string, list or dict.
             Value to be replaced.
             If the value is a dict, then `value` is ignored and `to_replace` must be a
-            mapping from column name (string) to replacement value. The value to be
-            replaced must be an int, long, float, or string.
+            mapping between a value and a replacement.
         :param value: int, long, float, string, or list.
-            Value to use to replace holes.
             The replacement value must be an int, long, float, or string. If `value` is a
-            list or tuple, `value` should be of the same length with `to_replace`.
+            list, `value` should be of the same length and type as `to_replace`.
+            If `value` is a scalar and `to_replace` is a sequence, then `value` is
+            used as a replacement for each item in `to_replace`.
         :param subset: optional list of column names to consider.
             Columns specified in subset that do not have matching data type are ignored.
             For example, if `value` is a string, and subset contains a non-string column,

