spark git commit: [MINOR][PYSPARK][DOC] Fix wrongly formatted examples in PySpark documentation

2016-07-06 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master b1310425b -> 4e14199ff


[MINOR][PYSPARK][DOC] Fix wrongly formatted examples in PySpark documentation

## What changes were proposed in this pull request?

This PR fixes wrongly formatted examples in the PySpark documentation, as shown below:

- **`SparkSession`**

  - **Before**

![2016-07-06 11 34 41](https://cloud.githubusercontent.com/assets/6477701/16605847/ae939526-436d-11e6-8ab8-6ad578362425.png)

  - **After**

![2016-07-06 11 33 56](https://cloud.githubusercontent.com/assets/6477701/16605845/ace9ee78-436d-11e6-8923-b76d4fc3e7c3.png)

- **`Builder`**

  - **Before**
![2016-07-06 11 34 44](https://cloud.githubusercontent.com/assets/6477701/16605844/aba60dbc-436d-11e6-990a-c87bc0281c6b.png)

  - **After**
![2016-07-06 1 26 37](https://cloud.githubusercontent.com/assets/6477701/16607562/586704c0-437d-11e6-9483-e0af93d8f74e.png)

This PR also fixes several similar instances across the documentation in the `sql` PySpark module.
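
Concretely, the broken examples either used non-standard indentation for doctest continuation lines after the `...` prompt or used unescaped backslash line continuations inside docstrings, both of which render incorrectly in the generated API docs. Below is a minimal, self-contained sketch of the doctest style the examples are moved to; it is illustrative only (`gen_circle` is a made-up helper, not actual PySpark code):

```python
import math


def gen_circle(r, n):
    """Return ``n`` evenly spaced points on a circle of radius ``r``.

    Doctest continuation lines after the ``...`` prompt use four-space
    indentation, matching the style this PR applies:

    >>> total = sum(x * x + y * y
    ...     for (x, y) in gen_circle(2.0, 5))
    >>> round(total)
    20
    """
    return [(r * math.cos(2.0 * math.pi * i / n),
             r * math.sin(2.0 * math.pi * i / n))
            for i in range(n)]


if __name__ == "__main__":
    import doctest
    doctest.testmod()
```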

## How was this patch tested?

N/A

Author: hyukjinkwon 

Closes #14063 from HyukjinKwon/minor-pyspark-builder.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4e14199f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4e14199f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4e14199f

Branch: refs/heads/master
Commit: 4e14199ff740ea186eb2cec2e5cf901b58c5f90e
Parents: b131042
Author: hyukjinkwon 
Authored: Wed Jul 6 10:45:51 2016 -0700
Committer: Reynold Xin 
Committed: Wed Jul 6 10:45:51 2016 -0700

--
 python/pyspark/mllib/clustering.py | 14 +++---
 python/pyspark/sql/dataframe.py|  8 
 python/pyspark/sql/functions.py|  8 
 python/pyspark/sql/group.py|  2 ++
 python/pyspark/sql/session.py  | 13 +++--
 python/pyspark/sql/types.py|  4 ++--
 6 files changed, 26 insertions(+), 23 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/4e14199f/python/pyspark/mllib/clustering.py
--
diff --git a/python/pyspark/mllib/clustering.py b/python/pyspark/mllib/clustering.py
index 93a0b64..c38c543 100644
--- a/python/pyspark/mllib/clustering.py
+++ b/python/pyspark/mllib/clustering.py
@@ -571,14 +571,14 @@ class PowerIterationClusteringModel(JavaModelWrapper, JavaSaveable, JavaLoader):
 
 >>> import math
 >>> def genCircle(r, n):
-...   points = []
-...   for i in range(0, n):
-... theta = 2.0 * math.pi * i / n
-... points.append((r * math.cos(theta), r * math.sin(theta)))
-...   return points
+... points = []
+... for i in range(0, n):
+... theta = 2.0 * math.pi * i / n
+... points.append((r * math.cos(theta), r * math.sin(theta)))
+... return points
 >>> def sim(x, y):
-...   dist2 = (x[0] - y[0]) * (x[0] - y[0]) + (x[1] - y[1]) * (x[1] - y[1])
-...   return math.exp(-dist2 / 2.0)
+... dist2 = (x[0] - y[0]) * (x[0] - y[0]) + (x[1] - y[1]) * (x[1] - y[1])
+... return math.exp(-dist2 / 2.0)
 >>> r1 = 1.0
 >>> n1 = 10
 >>> r2 = 4.0

http://git-wip-us.apache.org/repos/asf/spark/blob/4e14199f/python/pyspark/sql/dataframe.py
--
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index e44b01b..a0ac7a9 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -1045,10 +1045,10 @@ class DataFrame(object):
 :func:`drop_duplicates` is an alias for :func:`dropDuplicates`.
 
 >>> from pyspark.sql import Row
->>> df = sc.parallelize([ \
-Row(name='Alice', age=5, height=80), \
-Row(name='Alice', age=5, height=80), \
-Row(name='Alice', age=10, height=80)]).toDF()
+>>> df = sc.parallelize([ \\
+... Row(name='Alice', age=5, height=80), \\
+... Row(name='Alice', age=5, height=80), \\
+... Row(name='Alice', age=10, height=80)]).toDF()
 >>> df.dropDuplicates().show()
 +---+--+-+
 |age|height| name|
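
As an aside on the hunk above: inside a normal (non-raw) Python docstring, a single trailing backslash is consumed as a string line continuation, so the old example collapsed onto one long line in the rendered docs; escaping it as `\\` keeps a literal backslash visible, and the added `...` prompts are what doctest expects for continuation lines. A tiny standalone sketch of the difference (not PySpark code):

```python
# Minimal sketch of why the docstring backslash is escaped in this PR.
BEFORE = """
>>> xs = [1, \
2]
"""
AFTER = """
>>> xs = [1, \\
... 2]
"""

# In BEFORE, the backslash-newline escape is consumed by the string literal,
# so the example collapses onto a single long line in the docstring.
# In AFTER, a literal backslash survives and the '...' prompt marks the
# continuation line, which is what doctest and the rendered docs expect.
print(BEFORE)
print(AFTER)
```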

http://git-wip-us.apache.org/repos/asf/spark/blob/4e14199f/python/pyspark/sql/functions.py
--
diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index 7a73451..92d709e 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -1550,8 +1550,8 @@ def translate(srcCol, matching, replace):
 The translate will happen when any character in the string matching with 
the character
 in the `matching`.
 
->>> 

spark git commit: [MINOR][PYSPARK][DOC] Fix wrongly formatted examples in PySpark documentation

2016-07-06 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-2.0 091cd5f26 -> 03f336d89


[MINOR][PYSPARK][DOC] Fix wrongly formatted examples in PySpark documentation

## What changes were proposed in this pull request?

This PR fixes wrongly formatted examples in the PySpark documentation, as shown below:

- **`SparkSession`**

  - **Before**

![2016-07-06 11 34 41](https://cloud.githubusercontent.com/assets/6477701/16605847/ae939526-436d-11e6-8ab8-6ad578362425.png)

  - **After**

![2016-07-06 11 33 56](https://cloud.githubusercontent.com/assets/6477701/16605845/ace9ee78-436d-11e6-8923-b76d4fc3e7c3.png)

- **`Builder`**

  - **Before**
![2016-07-06 11 34 44](https://cloud.githubusercontent.com/assets/6477701/16605844/aba60dbc-436d-11e6-990a-c87bc0281c6b.png)

  - **After**
![2016-07-06 1 26 37](https://cloud.githubusercontent.com/assets/6477701/16607562/586704c0-437d-11e6-9483-e0af93d8f74e.png)

This PR also fixes several similar instances across the documentation in the `sql` PySpark module.

## How was this patch tested?

N/A

Author: hyukjinkwon 

Closes #14063 from HyukjinKwon/minor-pyspark-builder.

(cherry picked from commit 4e14199ff740ea186eb2cec2e5cf901b58c5f90e)
Signed-off-by: Reynold Xin 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/03f336d8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/03f336d8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/03f336d8

Branch: refs/heads/branch-2.0
Commit: 03f336d8921e1f22ee4d1f6fa8869163b1f29ea9
Parents: 091cd5f
Author: hyukjinkwon 
Authored: Wed Jul 6 10:45:51 2016 -0700
Committer: Reynold Xin 
Committed: Wed Jul 6 10:45:56 2016 -0700

--
 python/pyspark/mllib/clustering.py | 14 +++---
 python/pyspark/sql/dataframe.py|  8 
 python/pyspark/sql/functions.py|  8 
 python/pyspark/sql/group.py|  2 ++
 python/pyspark/sql/session.py  | 13 +++--
 python/pyspark/sql/types.py|  4 ++--
 6 files changed, 26 insertions(+), 23 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/03f336d8/python/pyspark/mllib/clustering.py
--
diff --git a/python/pyspark/mllib/clustering.py b/python/pyspark/mllib/clustering.py
index 93a0b64..c38c543 100644
--- a/python/pyspark/mllib/clustering.py
+++ b/python/pyspark/mllib/clustering.py
@@ -571,14 +571,14 @@ class PowerIterationClusteringModel(JavaModelWrapper, JavaSaveable, JavaLoader):
 
 >>> import math
 >>> def genCircle(r, n):
-...   points = []
-...   for i in range(0, n):
-... theta = 2.0 * math.pi * i / n
-... points.append((r * math.cos(theta), r * math.sin(theta)))
-...   return points
+... points = []
+... for i in range(0, n):
+... theta = 2.0 * math.pi * i / n
+... points.append((r * math.cos(theta), r * math.sin(theta)))
+... return points
 >>> def sim(x, y):
-...   dist2 = (x[0] - y[0]) * (x[0] - y[0]) + (x[1] - y[1]) * (x[1] - y[1])
-...   return math.exp(-dist2 / 2.0)
+... dist2 = (x[0] - y[0]) * (x[0] - y[0]) + (x[1] - y[1]) * (x[1] - y[1])
+... return math.exp(-dist2 / 2.0)
 >>> r1 = 1.0
 >>> n1 = 10
 >>> r2 = 4.0

http://git-wip-us.apache.org/repos/asf/spark/blob/03f336d8/python/pyspark/sql/dataframe.py
--
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index e6e7029..c7d704a 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -1033,10 +1033,10 @@ class DataFrame(object):
 :func:`drop_duplicates` is an alias for :func:`dropDuplicates`.
 
 >>> from pyspark.sql import Row
->>> df = sc.parallelize([ \
-Row(name='Alice', age=5, height=80), \
-Row(name='Alice', age=5, height=80), \
-Row(name='Alice', age=10, height=80)]).toDF()
+>>> df = sc.parallelize([ \\
+... Row(name='Alice', age=5, height=80), \\
+... Row(name='Alice', age=5, height=80), \\
+... Row(name='Alice', age=10, height=80)]).toDF()
 >>> df.dropDuplicates().show()
 +---+--+-+
 |age|height| name|

http://git-wip-us.apache.org/repos/asf/spark/blob/03f336d8/python/pyspark/sql/functions.py
--
diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index 15cefc8..1feca6e 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -1550,8 +1550,8 @@ def translate(srcCol, matching, replace):
 The translate will