[jira] [Updated] (SPARK-22072) Allow the same shell params to be used for all of the different steps in release-build

2017-09-20 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-22072: Summary: Allow the same shell params to be used for all of the different steps in release-build (was:

[jira] [Created] (SPARK-22072) Automatically rewrite the version string for publish-release

2017-09-19 Thread holdenk (JIRA)
holdenk created SPARK-22072: --- Summary: Automatically rewrite the version string for publish-release Key: SPARK-22072 URL: https://issues.apache.org/jira/browse/SPARK-22072 Project: Spark Issue

[jira] [Created] (SPARK-22071) Improve release build scripts to check correct JAVA version is being used for build

2017-09-19 Thread holdenk (JIRA)
holdenk created SPARK-22071: --- Summary: Improve release build scripts to check correct JAVA version is being used for build Key: SPARK-22071 URL: https://issues.apache.org/jira/browse/SPARK-22071 Project:

[jira] [Updated] (SPARK-22054) Allow release managers to inject their keys

2017-09-18 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-22054: Fix Version/s: (was: 2.2.1) (was: 2.3.0) > Allow release managers to inject

[jira] [Updated] (SPARK-22054) Allow release managers to inject their keys

2017-09-18 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-22054: Target Version/s: 2.2.1, 2.3.0 > Allow release managers to inject their keys >

[jira] [Created] (SPARK-22055) Port release scripts

2017-09-18 Thread holdenk (JIRA)
holdenk created SPARK-22055: --- Summary: Port release scripts Key: SPARK-22055 URL: https://issues.apache.org/jira/browse/SPARK-22055 Project: Spark Issue Type: Bug Components: Build

[jira] [Created] (SPARK-22054) Allow release managers to inject their keys

2017-09-18 Thread holdenk (JIRA)
holdenk created SPARK-22054: --- Summary: Allow release managers to inject their keys Key: SPARK-22054 URL: https://issues.apache.org/jira/browse/SPARK-22054 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-13 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165408#comment-16165408 ] holdenk commented on SPARK-21985: - CC [~a1ray] moving the discussion back here from github, I'm looking

[jira] [Updated] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-13 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21985: Target Version/s: 2.1.2 > PySpark PairDeserializer is broken for double-zipped RDDs >

[jira] [Resolved] (SPARK-18128) Add support for publishing to PyPI

2017-09-12 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-18128. - Resolution: Fixed Assignee: holdenk Fix Version/s: 2.2.0 We got the package name

[jira] [Resolved] (SPARK-18267) Distribute PySpark via Python Package Index (pypi)

2017-09-12 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-18267. - Resolution: Fixed Assignee: holdenk Fix Version/s: 2.2.0 > Distribute PySpark via Python

[jira] [Commented] (SPARK-17602) PySpark - Performance Optimization Large Size of Broadcast Variable

2017-09-11 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162457#comment-16162457 ] holdenk commented on SPARK-17602: - [~liujunf] how about you go ahead and make a pull request and put

[jira] [Resolved] (SPARK-19866) Add local version of Word2Vec findSynonyms for spark.ml: Python API

2017-09-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-19866. - Resolution: Fixed Fix Version/s: 2.3.0 > Add local version of Word2Vec findSynonyms for spark.ml:

[jira] [Resolved] (SPARK-15243) Binarizer.explainParam(u"...") raises ValueError

2017-09-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-15243. - Resolution: Fixed > Binarizer.explainParam(u"...") raises ValueError >

[jira] [Updated] (SPARK-15243) Binarizer.explainParam(u"...") raises ValueError

2017-09-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-15243: Fix Version/s: 2.3.0 > Binarizer.explainParam(u"...") raises ValueError >

[jira] [Assigned] (SPARK-15243) Binarizer.explainParam(u"...") raises ValueError

2017-09-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-15243: --- Assignee: Hyukjin Kwon (was: Seth Hendrickson) > Binarizer.explainParam(u"...") raises ValueError

[jira] [Resolved] (SPARK-20676) Upload to PyPi

2017-08-31 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-20676. - Resolution: Fixed Fix Version/s: 2.2.0 > Upload to PyPi > -- > > Key:

[jira] [Commented] (SPARK-20676) Upload to PyPi

2017-08-31 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149656#comment-16149656 ] holdenk commented on SPARK-20676: - Yes. > Upload to PyPi > -- > > Key:

[jira] [Created] (SPARK-21812) PySpark ML Models should not depend transfering params from Java

2017-08-22 Thread holdenk (JIRA)
holdenk created SPARK-21812: --- Summary: PySpark ML Models should not depend transfering params from Java Key: SPARK-21812 URL: https://issues.apache.org/jira/browse/SPARK-21812 Project: Spark

[jira] [Resolved] (SPARK-10931) PySpark ML Models should contain Param values

2017-08-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-10931. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 17849

[jira] [Assigned] (SPARK-10931) PySpark ML Models should contain Param values

2017-08-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-10931: --- Assignee: Bryan Cutler > PySpark ML Models should contain Param values >

[jira] [Resolved] (SPARK-21566) Python method for summary

2017-08-18 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-21566. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18762

[jira] [Assigned] (SPARK-21566) Python method for summary

2017-08-18 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-21566: --- Assignee: Andrew Ray > Python method for summary > - > >

[jira] [Created] (SPARK-21730) Consider officially dropping PyPy pre-2.5 support

2017-08-14 Thread holdenk (JIRA)
holdenk created SPARK-21730: --- Summary: Consider officially dropping PyPy pre-2.5 support Key: SPARK-21730 URL: https://issues.apache.org/jira/browse/SPARK-21730 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-21730) Consider officially dropping PyPy pre-2.5 support

2017-08-14 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21730: Issue Type: Improvement (was: Bug) > Consider officially dropping PyPy pre-2.5 support >

[jira] [Commented] (SPARK-21573) Tests failing with run-tests.py SyntaxError occasionally in Jenkins

2017-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106024#comment-16106024 ] holdenk commented on SPARK-21573: - Yes we did drop 2.6 support. We should change the script to python2.7

[jira] [Assigned] (SPARK-20090) Add StructType.fieldNames to Python API

2017-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-20090: --- Assignee: Hyukjin Kwon > Add StructType.fieldNames to Python API >

[jira] [Resolved] (SPARK-20090) Add StructType.fieldNames to Python API

2017-07-28 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-20090. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18618

[jira] [Resolved] (SPARK-21434) Add PySpark pip documentation

2017-07-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-21434. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 Issue resolved by pull request

[jira] [Assigned] (SPARK-21434) Add PySpark pip documentation

2017-07-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-21434: --- Assignee: holdenk > Add PySpark pip documentation > - > >

[jira] [Resolved] (SPARK-21489) Update release docs to point out Python 2.6 support is removed.

2017-07-21 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-21489. - Resolution: Resolved Assignee: Hyukjin Kwon Fix Version/s: 2.2.1 Fixed in

[jira] [Created] (SPARK-21489) Update release docs to point out Python 2.6 support is removed.

2017-07-20 Thread holdenk (JIRA)
holdenk created SPARK-21489: --- Summary: Update release docs to point out Python 2.6 support is removed. Key: SPARK-21489 URL: https://issues.apache.org/jira/browse/SPARK-21489 Project: Spark Issue

[jira] [Commented] (SPARK-7146) Should ML sharedParams be a public API?

2017-07-20 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095299#comment-16095299 ] holdenk commented on SPARK-7146: So it seems like there is a (more recent) agreement that exposing this as

[jira] [Updated] (SPARK-21394) Reviving broken callable objects in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21394: Fix Version/s: 2.3.0 > Reviving broken callable objects in UDF in PySpark >

[jira] [Updated] (SPARK-21432) Reviving broken partial functions in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21432: Fix Version/s: 2.3.0 > Reviving broken partial functions in UDF in PySpark >

[jira] [Assigned] (SPARK-21394) Reviving broken callable objects in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-21394: --- Assignee: Hyukjin Kwon > Reviving broken callable objects in UDF in PySpark >

[jira] [Resolved] (SPARK-21394) Reviving broken callable objects in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-21394. - Resolution: Fixed > Reviving broken callable objects in UDF in PySpark >

[jira] [Resolved] (SPARK-21432) Reviving broken partial functions in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-21432. - Resolution: Fixed > Reviving broken partial functions in UDF in PySpark >

[jira] [Assigned] (SPARK-21432) Reviving broken partial functions in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-21432: --- Assignee: Hyukjin Kwon > Reviving broken partial functions in UDF in PySpark >

[jira] [Updated] (SPARK-21394) Reviving broken callable objects in UDF in PySpark

2017-07-17 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21394: Affects Version/s: (was: 2.2.0) > Reviving broken callable objects in UDF in PySpark >

[jira] [Created] (SPARK-21436) Take advantage of known partioner for distinct on RDDs

2017-07-17 Thread holdenk (JIRA)
holdenk created SPARK-21436: --- Summary: Take advantage of known partioner for distinct on RDDs Key: SPARK-21436 URL: https://issues.apache.org/jira/browse/SPARK-21436 Project: Spark Issue Type:

[jira] [Created] (SPARK-21434) Add PySpark pip documentation

2017-07-17 Thread holdenk (JIRA)
holdenk created SPARK-21434: --- Summary: Add PySpark pip documentation Key: SPARK-21434 URL: https://issues.apache.org/jira/browse/SPARK-21434 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-21425) LongAccumulator, DoubleAccumulator not threadsafe

2017-07-16 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088842#comment-16088842 ] holdenk commented on SPARK-21425: - Thanks for finding this. I've given some thought in the past to

[jira] [Created] (SPARK-21384) Spark 2.2 + YARN without spark.yarn.jars / spark.yarn.archive fails

2017-07-11 Thread holdenk (JIRA)
holdenk created SPARK-21384: --- Summary: Spark 2.2 + YARN without spark.yarn.jars / spark.yarn.archive fails Key: SPARK-21384 URL: https://issues.apache.org/jira/browse/SPARK-21384 Project: Spark

[jira] [Resolved] (SPARK-21278) Upgrade to Py4J 0.10.6

2017-07-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-21278. - Resolution: Fixed Issue resolved by pull request 18546 [https://github.com/apache/spark/pull/18546] >

[jira] [Assigned] (SPARK-21278) Upgrade to Py4J 0.10.6

2017-07-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-21278: --- Assignee: Dongjoon Hyun > Upgrade to Py4J 0.10.6 > -- > > Key:

[jira] [Updated] (SPARK-21278) Upgrade to Py4J 0.10.6

2017-07-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21278: Fix Version/s: 2.3.0 > Upgrade to Py4J 0.10.6 > -- > > Key:

[jira] [Created] (SPARK-21231) Conda install of packages during Jenkins testing is causing intermittent failure

2017-06-27 Thread holdenk (JIRA)
holdenk created SPARK-21231: --- Summary: Conda install of packages during Jenkins testing is causing intermittent failure Key: SPARK-21231 URL: https://issues.apache.org/jira/browse/SPARK-21231 Project:

[jira] [Updated] (SPARK-21084) Improvements to dynamic allocation for notebook use cases

2017-06-13 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21084: Component/s: YARN Scheduler Block Manager > Improvements to dynamic

[jira] [Updated] (SPARK-21084) Improvements to dynamic allocation for notebook use cases

2017-06-13 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-21084: Affects Version/s: 2.3.0 > Improvements to dynamic allocation for notebook use cases >

[jira] [Created] (SPARK-21068) SparkR error message when passed an R object rather than Java object could be more informative

2017-06-12 Thread holdenk (JIRA)
holdenk created SPARK-21068: --- Summary: SparkR error message when passed an R object rather than Java object could be more informative Key: SPARK-21068 URL: https://issues.apache.org/jira/browse/SPARK-21068

[jira] [Created] (SPARK-21040) On executor/worker decommission consider speculatively re-launching current tasks

2017-06-09 Thread holdenk (JIRA)
holdenk created SPARK-21040: --- Summary: On executor/worker decommission consider speculatively re-launching current tasks Key: SPARK-21040 URL: https://issues.apache.org/jira/browse/SPARK-21040 Project:

[jira] [Commented] (SPARK-20628) Keep track of nodes which are going to be shut down & avoid scheduling new tasks

2017-05-13 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009261#comment-16009261 ] holdenk commented on SPARK-20628: - I'm going to take a crack at implementing this - I'm traveling a lot

[jira] [Updated] (SPARK-20629) Copy shuffle data when nodes are being shut down

2017-05-13 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-20629: Summary: Copy shuffle data when nodes are being shut down (was: Copy data when nodes are being shut down)

[jira] [Updated] (SPARK-20624) Add better handling for node shutdown

2017-05-12 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-20624: Summary: Add better handling for node shutdown (was: Consider adding better handling for node shutdown)

[jira] [Resolved] (SPARK-20627) Remove pip local version string (PEP440)

2017-05-09 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-20627. - Resolution: Fixed Fix Version/s: 2.1.2 2.3.0 2.2.1 Issue

[jira] [Assigned] (SPARK-20627) Remove pip local version string (PEP440)

2017-05-09 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-20627: --- Assignee: holdenk > Remove pip local version string (PEP440) >

[jira] [Created] (SPARK-20676) Upload to PyPi

2017-05-09 Thread holdenk (JIRA)
holdenk created SPARK-20676: --- Summary: Upload to PyPi Key: SPARK-20676 URL: https://issues.apache.org/jira/browse/SPARK-20676 Project: Spark Issue Type: Sub-task Components: PySpark

[jira] [Created] (SPARK-20629) Copy data when nodes are being shut down

2017-05-06 Thread holdenk (JIRA)
holdenk created SPARK-20629: --- Summary: Copy data when nodes are being shut down Key: SPARK-20629 URL: https://issues.apache.org/jira/browse/SPARK-20629 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-20628) Keep track of nodes which are going to be shut down & avoid scheduling new tasks

2017-05-06 Thread holdenk (JIRA)
holdenk created SPARK-20628: --- Summary: Keep track of nodes which are going to be shut down & avoid scheduling new tasks Key: SPARK-20628 URL: https://issues.apache.org/jira/browse/SPARK-20628 Project:

[jira] [Created] (SPARK-20627) Remove pip local version string (PEP440)

2017-05-06 Thread holdenk (JIRA)
holdenk created SPARK-20627: --- Summary: Remove pip local version string (PEP440) Key: SPARK-20627 URL: https://issues.apache.org/jira/browse/SPARK-20627 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-20624) Consider adding better handling for node shutdown

2017-05-06 Thread holdenk (JIRA)
holdenk created SPARK-20624: --- Summary: Consider adding better handling for node shutdown Key: SPARK-20624 URL: https://issues.apache.org/jira/browse/SPARK-20624 Project: Spark Issue Type:

[jira] [Created] (SPARK-20618) Support Custom Partitioners in PySpark

2017-05-06 Thread holdenk (JIRA)
holdenk created SPARK-20618: --- Summary: Support Custom Partitioners in PySpark Key: SPARK-20618 URL: https://issues.apache.org/jira/browse/SPARK-20618 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-20442) Fill up documentations for functions in Column API in PySpark

2017-04-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-20442. - Resolution: Fixed Fix Version/s: 2.3.0 > Fill up documentations for functions in Column API in

[jira] [Assigned] (SPARK-20442) Fill up documentations for functions in Column API in PySpark

2017-04-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-20442: --- Assignee: Hyukjin Kwon > Fill up documentations for functions in Column API in PySpark >

[jira] [Assigned] (SPARK-20132) Add documentation for column string functions

2017-04-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-20132: --- Assignee: Michael Patterson > Add documentation for column string functions >

[jira] [Resolved] (SPARK-20132) Add documentation for column string functions

2017-04-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-20132. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 17469

[jira] [Assigned] (SPARK-20360) Create repr functions for interpreters to use

2017-04-18 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-20360: --- Assignee: Kyle Kelley > Create repr functions for interpreters to use >

[jira] [Resolved] (SPARK-20360) Create repr functions for interpreters to use

2017-04-18 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-20360. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17662

[jira] [Updated] (SPARK-20347) Provide AsyncRDDActions in Python

2017-04-15 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-20347: Shepherd: holdenk > Provide AsyncRDDActions in Python > - > >

[jira] [Created] (SPARK-20347) Provide AsyncRDDActions in Python

2017-04-15 Thread holdenk (JIRA)
holdenk created SPARK-20347: --- Summary: Provide AsyncRDDActions in Python Key: SPARK-20347 URL: https://issues.apache.org/jira/browse/SPARK-20347 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-20232) Better combineByKey documentation: clarify memory allocation, better example

2017-04-13 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-20232: --- Assignee: David Gingrich > Better combineByKey documentation: clarify memory allocation, better

[jira] [Resolved] (SPARK-20232) Better combineByKey documentation: clarify memory allocation, better example

2017-04-13 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-20232. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17545

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-04-12 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15966339#comment-15966339 ] holdenk commented on SPARK-13534: - So I'm following along with the progress on this, I'll try and take a

[jira] [Resolved] (SPARK-19570) Allow to disable hive in pyspark shell

2017-04-12 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-19570. - Resolution: Fixed Fix Version/s: 2.2.0 > Allow to disable hive in pyspark shell >

[jira] [Assigned] (SPARK-19570) Allow to disable hive in pyspark shell

2017-04-12 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-19570: --- Assignee: Jeff Zhang > Allow to disable hive in pyspark shell >

[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive

2017-04-12 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15966201#comment-15966201 ] holdenk commented on SPARK-20202: - Oh right, sorry I was misreading the intent of Affects Version/s. >

[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive

2017-04-11 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965218#comment-15965218 ] holdenk commented on SPARK-20202: - Would it possibly make sense to untarget this from the maintenance

[jira] [Assigned] (SPARK-19505) AttributeError on Exception.message in Python3; hides true exceptions in cloudpickle.py and broadcast.py

2017-04-11 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-19505: --- Assignee: David Gingrich > AttributeError on Exception.message in Python3; hides true exceptions in

[jira] [Resolved] (SPARK-19505) AttributeError on Exception.message in Python3; hides true exceptions in cloudpickle.py and broadcast.py

2017-04-11 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-19505. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16845

[jira] [Resolved] (SPARK-19454) Improve DataFrame.replace API

2017-04-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-19454. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16793

[jira] [Assigned] (SPARK-19454) Improve DataFrame.replace API

2017-04-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-19454: --- Assignee: Maciej Szymkiewicz > Improve DataFrame.replace API > - > >

[jira] [Commented] (SPARK-20216) Install pandoc on machine(s) used for packaging

2017-04-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957346#comment-15957346 ] holdenk commented on SPARK-20216: - Thanks [~marmbrus] :) So looking at that host it seems like pandoc is

[jira] [Resolved] (SPARK-20216) Install pandoc on machine(s) used for packaging

2017-04-05 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-20216. - Resolution: Fixed Assignee: holdenk This has been fixed by installing pypandoc into the conda env

[jira] [Created] (SPARK-20216) Install pandoc on machine(s) used for packaging

2017-04-04 Thread holdenk (JIRA)
holdenk created SPARK-20216: --- Summary: Install pandoc on machine(s) used for packaging Key: SPARK-20216 URL: https://issues.apache.org/jira/browse/SPARK-20216 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-19759) ALSModel.predict on Dataframes : potential optimization by not using blas

2017-03-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950003#comment-15950003 ] holdenk commented on SPARK-19759: - How do people feel about targeting this for 2.3? > ALSModel.predict

[jira] [Commented] (SPARK-19690) Join a streaming DataFrame with a batch DataFrame may not work

2017-03-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949997#comment-15949997 ] holdenk commented on SPARK-19690: - I think reducing priority and re-targeting might be called for (as

[jira] [Assigned] (SPARK-19522) --executor-memory flag doesn't work in local-cluster mode

2017-03-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-19522: --- Assignee: Andrew Or (was: holdenk) > --executor-memory flag doesn't work in local-cluster mode >

[jira] [Assigned] (SPARK-19522) --executor-memory flag doesn't work in local-cluster mode

2017-03-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-19522: --- Assignee: holdenk (was: Andrew Or) > --executor-memory flag doesn't work in local-cluster mode >

[jira] [Created] (SPARK-20149) Audit PySpark code base for 2.6 specific work arounds

2017-03-29 Thread holdenk (JIRA)
holdenk created SPARK-20149: --- Summary: Audit PySpark code base for 2.6 specific work arounds Key: SPARK-20149 URL: https://issues.apache.org/jira/browse/SPARK-20149 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-19955) Update run-tests to support conda

2017-03-29 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-19955. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17355

[jira] [Assigned] (SPARK-19955) Update run-tests to support conda

2017-03-29 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-19955: --- Assignee: holdenk > Update run-tests to support conda > - > >

[jira] [Created] (SPARK-20064) Bump the PySpark verison number to 2.2

2017-03-22 Thread holdenk (JIRA)
holdenk created SPARK-20064: --- Summary: Bump the PySpark verison number to 2.2 Key: SPARK-20064 URL: https://issues.apache.org/jira/browse/SPARK-20064 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-19475) (ML|MLlib).linalg.DenseVector method delegation fails for __neg__

2017-03-20 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933914#comment-15933914 ] holdenk commented on SPARK-19475: - This seems like it might be a reasonable candidate for 2.1.1 - what do

[jira] [Updated] (SPARK-19570) Allow to disable hive in pyspark shell

2017-03-20 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-19570: Component/s: SQL > Allow to disable hive in pyspark shell > -- > >

[jira] [Commented] (SPARK-19955) Update run-tests to support conda

2017-03-14 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925295#comment-15925295 ] holdenk commented on SPARK-19955: - It appears that the current Jenkins workers have conda installed & an

[jira] [Updated] (SPARK-19955) Update run-tests to support conda

2017-03-14 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-19955: Issue Type: Sub-task (was: Improvement) Parent: SPARK-12661 > Update run-tests to support conda >

[jira] [Created] (SPARK-19955) Update run-tests to support conda

2017-03-14 Thread holdenk (JIRA)
holdenk created SPARK-19955: --- Summary: Update run-tests to support conda Key: SPARK-19955 URL: https://issues.apache.org/jira/browse/SPARK-19955 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-12334) Support read from multiple input paths for orc file in DataFrameReader.orc

2017-03-09 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-12334. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 10307

[jira] [Assigned] (SPARK-12334) Support read from multiple input paths for orc file in DataFrameReader.orc

2017-03-09 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-12334: --- Assignee: Jeff Zhang > Support read from multiple input paths for orc file in DataFrameReader.orc >
