spark git commit: [SPARK-21278][PYSPARK] Upgrade to Py4J 0.10.6

2017-07-05 Thread holden
p-us.apache.org/repos/asf/spark/diff/c8d0aba1 Branch: refs/heads/master Commit: c8d0aba198c0f593c2b6b656c23b3d0fb7ea98a2 Parents: c8e7f44 Author: Dongjoon Hyun <dongj...@apache.org> Authored: Wed Jul 5 16:33:23 2017 -0700 Committer: Holden Karau <hol..

spark git commit: [SPARK-20627][PYSPARK] Drop the hadoop distribution name from the Python version

2017-05-09 Thread holden
ent hadoop versions, we can simply drop the hadoop information. If at a later point we need to start publishing different hadoop versions we can look at making different packages or similar. ## How was this patch tested? Ran `make-distribution` locally Author: Holden Karau <hol...@us.ibm.com>

spark git commit: [SPARK-20627][PYSPARK] Drop the hadoop distribution name from the Python version

2017-05-09 Thread holden
ent hadoop versions, we can simply drop the hadoop information. If at a later point we need to start publishing different hadoop versions we can look at making different packages or similar. ## How was this patch tested? Ran `make-distribution` locally Author: Holden Karau <hol...@us.ibm.com>

spark git commit: [SPARK-20627][PYSPARK] Drop the hadoop distribution name from the Python version

2017-05-09 Thread holden
oop versions, we can simply drop the hadoop information. If at a later point we need to start publishing different hadoop versions we can look at making different packages or similar. ## How was this patch tested? Ran `make-distribution` locally Author: Holden Karau <hol...@us.ibm.com>

spark git commit: [SPARK-20442][PYTHON][DOCS] Fill up documentations for functions in Column API in PySpark

2017-04-29 Thread holden
-0700 Committer: Holden Karau <hol...@us.ibm.com> Committed: Sat Apr 29 13:46:40 2017 -0700 -- python/pyspark/sql/column.py| 104 ++- .../expressions/bitwiseExpressions.scala| 2 +-

spark git commit: [SPARK-20132][DOCS] Add documentation for column string functions

2017-04-22 Thread holden
pache.org/repos/asf/spark/diff/8765bc17 Branch: refs/heads/master Commit: 8765bc17d0439032d0378686c4f2b17df2432abc Parents: b3c572a Author: Michael Patterson <map...@gmail.com> Authored: Sat Apr 22 19:58:54 2017 -0700 Committer: Holden Karau <hol...@us.ibm.com> Committed: Sat Apr 2

spark git commit: [SPARK-20360][PYTHON] reprs for interpreters

2017-04-18 Thread holden
d) Signed-off-by: Holden Karau <hol...@us.ibm.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7dbc0a91 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7dbc0a91 Diff: http://git-wip-us.apache.org/repos/asf/spar

spark git commit: [SPARK-20360][PYTHON] reprs for interpreters

2017-04-18 Thread holden
lley <rgb...@gmail.com> Authored: Tue Apr 18 12:35:27 2017 -0700 Committer: Holden Karau <hol...@us.ibm.com> Committed: Tue Apr 18 12:35:27 2017 -0700 -- python/pyspark/context.py | 26 +

spark git commit: [SPARK-19019][PYTHON][BRANCH-2.0] Fix hijacked `collections.namedtuple` and port cloudpickle changes for PySpark to work with Python 3.6.0

2017-04-17 Thread holden
..@gmail.com> Authored: Mon Apr 17 10:03:42 2017 -0700 Committer: Holden Karau <hol...@us.ibm.com> Committed: Mon Apr 17 10:03:42 2017 -0700 -- python/pyspark/cloudpickle.py | 98 ++---

spark git commit: [SPARK-19019][PYTHON][BRANCH-1.6] Fix hijacked `collections.namedtuple` and port cloudpickle changes for PySpark to work with Python 3.6.0

2017-04-17 Thread holden
: 23f9faa Author: hyukjinkwon <gurwls...@gmail.com> Authored: Mon Apr 17 09:58:55 2017 -0700 Committer: Holden Karau <hol...@us.ibm.com> Committed: Mon Apr 17 09:58:55 2017 -0700 -- python/pyspar

spark git commit: [SPARK-20232][PYTHON] Improve combineByKey docs

2017-04-13 Thread holden
er Commit: 8ddf0d2a60795a2306f94df8eac6e265b1fe5230 Parents: fbe4216 Author: David Gingrich <da...@textio.com> Authored: Thu Apr 13 12:43:28 2017 -0700 Committer: Holden Karau <hol...@us.ibm.com> Committed: Thu Apr 13 12:43:28 2017 -0700 -

spark git commit: [SPARK-19570][PYSPARK] Allow to disable hive in pyspark shell

2017-04-12 Thread holden
-0700 Committer: Holden Karau <hol...@us.ibm.com> Committed: Wed Apr 12 10:54:50 2017 -0700 -- python/pyspark/shell.py | 22 -- 1 file changed, 16 inser

spark git commit: [SPARK-19505][PYTHON] AttributeError on Exception.message in Python3

2017-04-11 Thread holden
ttp://git-wip-us.apache.org/repos/asf/spark/diff/6297697f Branch: refs/heads/master Commit: 6297697f975960a3006c4e58b4964d9ac40eeaf5 Parents: 123b4fb Author: David Gingrich <da...@textio.com> Authored: Tue Apr 11 12:18:31 2017 -0700 Committer: Holden Karau <hol...@us.ibm.com> Committed

spark git commit: [SPARK-19454][PYTHON][SQL] DataFrame.replace improvements

2017-04-05 Thread holden
-wip-us.apache.org/repos/asf/spark/tree/e2773996 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e2773996 Branch: refs/heads/master Commit: e2773996b8d1c0214d9ffac634a059b4923caf7b Parents: a2d8d76 Author: zero323 <zero...@users.noreply.github.com> Authored: Wed Apr 5 11:47:40 2017 -07

spark git commit: [SPARK-19955][PYSPARK] Jenkins Python Conda based test.

2017-03-29 Thread holden
ability. ## How was this patch tested? Updated shell scripts, ran tests locally with installed conda, ran tests in Jenkins. Author: Holden Karau <hol...@us.ibm.com> Closes #17355 from holdenk/SPARK-19955-support-python-tests-with-conda. Project: http://git-wip-us.apache.org/repos/a

spark git commit: [SPARK-12334][SQL][PYSPARK] Support read from multiple input paths for orc file in DataFrameReader.orc

2017-03-09 Thread holden
pache.org/repos/asf/spark/tree/cabe1df8 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cabe1df8 Branch: refs/heads/master Commit: cabe1df8606e7e5b9e6efb106045deb3f39f5f13 Parents: 30b18e6 Author: Jeff Zhang <zjf...@apache.org> Authored: Thu Mar 9 11:44:34 2017 -0800 Committer: Ho

spark git commit: [SPARK-13330][PYSPARK] PYTHONHASHSEED is not propagated to python worker

2017-02-24 Thread holden
4:42 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Fri Feb 24 15:04:42 2017 -0800 -- core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala | 1 + python/pyspark/context.py

spark git commit: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-24 Thread holden
efs/heads/master Commit: 4a5e38f5747148022988631cae0248ae1affadd3 Parents: 8f33731 Author: zero323 <zero...@users.noreply.github.com> Authored: Fri Feb 24 08:22:30 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Fri Feb 2

spark git commit: [SPARK-19160][PYTHON][SQL] Add udf decorator

2017-02-15 Thread holden
ro323 <zero...@users.noreply.github.com> Authored: Wed Feb 15 10:16:34 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Wed Feb 15 10:16:34 2017 -0800 -- python/pyspark/sql/functions.py | 41

spark git commit: [SPARK-19590][PYSPARK][ML] Update the document for QuantileDiscretizer in pyspark

2017-02-15 Thread holden
..@intel.com> Authored: Wed Feb 15 10:12:07 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Wed Feb 15 10:12:07 2017 -0800 -- python/pyspark/ml/feature.py | 12 +++- 1 file changed, 11 inse

spark git commit: [SPARK-18541][PYTHON] Add metadata parameter to pyspark.sql.Column.alias()

2017-02-14 Thread holden
asf/spark/diff/7b64f7aa Branch: refs/heads/master Commit: 7b64f7aa03a49adca5fcafe6fff422823b587514 Parents: e0eeb0f Author: Sheamus K. Parkes <shea.par...@milliman.com> Authored: Tue Feb 14 09:57:43 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Tue Feb 1

spark git commit: [SPARK-19162][PYTHON][SQL] UserDefinedFunction should validate that func is callable

2017-02-14 Thread holden
thored: Tue Feb 14 09:46:22 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Tue Feb 14 09:46:22 2017 -0800 -- python/pyspark/sql/functions.py | 5 + python/pyspark/sql/tests.py | 7 +++ 2 files c

spark git commit: [SPARK-19453][PYTHON][SQL][DOC] Correct and extend DataFrame.replace docstring

2017-02-14 Thread holden
er Commit: 9c4405e8e801cbab3a5c78c9f4334775925dfcc4 Parents: 457850e Author: zero323 <zero...@users.noreply.github.com> Authored: Tue Feb 14 09:42:24 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Tue Feb 1

spark git commit: [SPARK-19429][PYTHON][SQL] Support slice arguments in Column.__getitem__

2017-02-13 Thread holden
323 <zero...@users.noreply.github.com> Authored: Mon Feb 13 15:23:56 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Mon Feb 13 15:23:56 2017 -0800 -- python/pyspark/sql/column.py | 11 --- python/pyspa

spark git commit: [SPARK-19427][PYTHON][SQL] Support data type string as a returnType argument of UDF

2017-02-13 Thread holden
heads/master Commit: ab88b2410623e5fdb06d558017bd6d50220e466a Parents: 5e7cd33 Author: zero323 <zero...@users.noreply.github.com> Authored: Mon Feb 13 10:37:34 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Mon Feb 13 10:37:34 2017 -0800 ---

spark git commit: [SPARK-19506][ML][PYTHON] Import warnings in pyspark.ml.util

2017-02-13 Thread holden
323 <zero...@users.noreply.github.com> Closes #16846 from zero323/SPARK-19506. (cherry picked from commit 5e7cd3322b04f1dd207829b70546bc7ffdd63363) Signed-off-by: Holden Karau <hol...@us.ibm.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apach

spark git commit: [SPARK-19506][ML][PYTHON] Import warnings in pyspark.ml.util

2017-02-13 Thread holden
pache.org/repos/asf/spark/diff/5e7cd332 Branch: refs/heads/master Commit: 5e7cd3322b04f1dd207829b70546bc7ffdd63363 Parents: 4321ff9 Author: zero323 <zero...@users.noreply.github.com> Authored: Mon Feb 13 09:26:49 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Mon Feb 1

spark git commit: [SPARK-19421][ML][PYSPARK] Remove numClasses and numFeatures methods in LinearSVC

2017-02-05 Thread holden
ng <ruife...@foxmail.com> Authored: Sun Feb 5 19:06:51 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Sun Feb 5 19:06:51 2017 -0800 -- python/pyspark/ml/classification.py | 16 1 file

spark git commit: [SPARK-14352][SQL] approxQuantile should support multi columns

2017-02-01 Thread holden
com> Authored: Wed Feb 1 14:11:28 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Wed Feb 1 14:11:28 2017 -0800 -- python/pyspark/sql/dataframe.py | 37

spark git commit: [SPARK-19163][PYTHON][SQL] Delay _judf initialization to the __call__

2017-01-31 Thread holden
/90638358 Branch: refs/heads/master Commit: 9063835803e54538c94d95bbddcb4810dd7a1c55 Parents: 081b7ad Author: zero323 <zero...@users.noreply.github.com> Authored: Tue Jan 31 18:03:39 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed: Tue Jan 3

spark git commit: [SPARK-17161][PYSPARK][ML] Add PySpark-ML JavaWrapper convenience function to create Py4J JavaArrays

2017-01-31 Thread holden
ttp://git-wip-us.apache.org/repos/asf/spark/diff/57d70d26 Branch: refs/heads/master Commit: 57d70d26c88819360cdc806e7124aa2cc1b9e4c5 Parents: ce112ce Author: Bryan Cutler <cutl...@gmail.com> Authored: Tue Jan 31 15:42:36 2017 -0800 Committer: Holden Karau <hol...@us.ibm.com> Committed

spark git commit: [SPARK-19064][PYSPARK] Fix pip installing of sub components

2017-01-25 Thread holden
est script & make-distribution. ## How was this patch tested? Updated sanity test script to import mllib and ml sub-components. Author: Holden Karau <hol...@us.ibm.com> Closes #16465 from holdenk/SPARK-19064-fix-pip-install-sub-components. (cherry picke

spark git commit: [SPARK-19064][PYSPARK] Fix pip installing of sub components

2017-01-25 Thread holden
est script & make-distribution. ## How was this patch tested? Updated sanity test script to import mllib and ml sub-components. Author: Holden Karau <hol...@us.ibm.com> Closes #16465 from holdenk/SPARK-19064-fix-pip-install-sub-components. Project: http://git-wip-us.apache.org/repos/a
