[jira] [Commented] (SPARK-1740) Pyspark cancellation kills unrelated pyspark workers

2014-06-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044981#comment-14044981 ] Josh Rosen commented on SPARK-1740: --- The Python daemon - multiple workers architecture

[jira] [Resolved] (SPARK-1030) unneeded file required when running pyspark program using yarn-client

2014-07-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1030. --- Resolution: Fixed Fix Version/s: 1.0.0 Closing this now, since it was addressed as part of

[jira] [Commented] (SPARK-2387) Remove the stage barrier for better resource utilization

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074521#comment-14074521 ] Josh Rosen commented on SPARK-2387: --- {quote} For example, in a push-style shuffle, the

[jira] [Commented] (SPARK-2638) Improve concurrency of fetching Map outputs

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074710#comment-14074710 ] Josh Rosen commented on SPARK-2638: --- Hi Stephen, The goal of MapOutputTracker's

[jira] [Resolved] (SPARK-1394) calling system.platform on worker raises IOError

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1394. --- Resolution: Fixed Fix Version/s: 1.0.1 calling system.platform on worker raises IOError

[jira] [Resolved] (SPARK-1257) Endless running task when using pyspark with input file containing a long line

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1257. --- Resolution: Fixed Fix Version/s: 0.9.1 Assignee: Josh Rosen Endless running task

[jira] [Resolved] (SPARK-1011) MatrixFactorizationModel in pyspark throws serialization error

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1011. --- Resolution: Fixed Fix Version/s: 0.9.1 Assignee: Hossein Falaki This was fixed in

[jira] [Commented] (SPARK-1011) MatrixFactorizationModel in pyspark throws serialization error

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075094#comment-14075094 ] Josh Rosen commented on SPARK-1011: --- Oh, and if you want to combine the actual vs.

[jira] [Resolved] (SPARK-915) Tidy up the scripts

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-915. -- Resolution: Fixed Fix Version/s: 0.9.0 Tidy up the scripts ---

[jira] [Updated] (SPARK-1498) Spark can hang if pyspark tasks fail

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1498: -- Affects Version/s: 0.9.2 0.9.1 Fix Version/s: 1.0.0 This is still a

[jira] [Commented] (SPARK-606) Add mapSideCombine setting to Java API partitionBy() method.

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075481#comment-14075481 ] Josh Rosen commented on SPARK-606: -- mapSideCombine was removed from partitionBy in

[jira] [Resolved] (SPARK-606) Add mapSideCombine setting to Java API partitionBy() method.

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-606. -- Resolution: Won't Fix Fix Version/s: 0.8.0 Assignee: Reynold Xin Add mapSideCombine

[jira] [Resolved] (SPARK-661) Java unit tests don't seem to run with Maven

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-661. -- Resolution: Cannot Reproduce Java unit tests don't seem to run with Maven

[jira] [Resolved] (SPARK-2694) machine learning

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2694. --- Resolution: Incomplete machine learning Key: SPARK-2694

[jira] [Resolved] (SPARK-2637) PEP8 Compliance pull request #1540

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2637. --- Resolution: Won't Fix Closing this as 'wont fix' since we decided not to re-format code cloudpickle

[jira] [Updated] (SPARK-2637) PEP8 Compliance pull request #1540

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2637: -- Component/s: (was: Documentation) PySpark PEP8 Compliance pull request #1540

[jira] [Resolved] (SPARK-1036) .gitignore is overly aggressive

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1036. --- Resolution: Fixed Fix Version/s: 1.0.0 Assignee: Patrick Wendell Fixed by Patrick in

[jira] [Resolved] (SPARK-717) Refactor Programming Guides in Documentation

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-717. -- Resolution: Fixed Target Version/s: 1.0.0 Refactor Programming Guides in Documentation

[jira] [Resolved] (SPARK-2547) The clustering documentaion example provided for spark 0.9.1/docs is having a error

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2547. --- Resolution: Fixed Fix Version/s: 0.9.3 Target Version/s: (was: 0.9.2) The

[jira] [Assigned] (SPARK-2601) py4j.Py4JException on sc.pickleFile

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2601: - Assignee: Josh Rosen py4j.Py4JException on sc.pickleFile ---

[jira] [Updated] (SPARK-2601) py4j.Py4JException on sc.pickleFile

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2601: -- Affects Version/s: 1.1.0 py4j.Py4JException on sc.pickleFile ---

[jira] [Updated] (SPARK-2435) Add shutdown hook to bin/pyspark

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2435: -- Assignee: Josh Rosen Add shutdown hook to bin/pyspark

[jira] [Commented] (SPARK-1550) Successive creation of spark context fails in pyspark, if the previous initialization of spark context had failed.

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075509#comment-14075509 ] Josh Rosen commented on SPARK-1550: --- Actually, there's still a similar problem in Spark

[jira] [Assigned] (SPARK-1550) Successive creation of spark context fails in pyspark, if the previous initialization of spark context had failed.

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-1550: - Assignee: Josh Rosen Successive creation of spark context fails in pyspark, if the previous

[jira] [Updated] (SPARK-1550) Successive creation of spark context fails in pyspark, if the previous initialization of spark context had failed.

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1550: -- Affects Version/s: 1.0.1 0.9.2 1.0.0 Successive

[jira] [Resolved] (SPARK-1207) Make python support for histograms

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1207. --- Resolution: Duplicate Make python support for histograms --

[jira] [Updated] (SPARK-1170) Add histogram() to PySpark

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1170: -- Assignee: (was: Prashant Sharma) Add histogram() to PySpark --

[jira] [Updated] (SPARK-1170) Add histogram() to PySpark

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1170: -- Assignee: Prashant Sharma (was: Josh Rosen) Add histogram() to PySpark --

[jira] [Assigned] (SPARK-1170) Add histogram() to PySpark

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-1170: - Assignee: Josh Rosen Add histogram() to PySpark --

[jira] [Commented] (SPARK-1170) Add histogram() to PySpark

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075522#comment-14075522 ] Josh Rosen commented on SPARK-1170: --- Hi [~dwmclary] and [~prashant_], It looks like

[jira] [Updated] (SPARK-2305) pyspark - depend on py4j 0.8.1

2014-07-28 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2305: -- Target Version/s: 1.1.0 Assignee: Josh Rosen Py4J 0.8.2.1 was just released; I'll look

[jira] [Commented] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077452#comment-14077452 ] Josh Rosen commented on SPARK-1630: --- We aren't passing completely arbitrary iterators of

[jira] [Resolved] (SPARK-2580) broken pipe collecting schemardd results

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2580. --- Resolution: Fixed Fix Version/s: 1.0.3 1.1.0 Target Version/s:

[jira] [Commented] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077492#comment-14077492 ] Josh Rosen commented on SPARK-1630: --- In the current Spark codebase, the PythonRDD

[jira] [Resolved] (SPARK-791) [pyspark] operator.getattr not serialized

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-791. -- Resolution: Fixed [pyspark] operator.getattr not serialized -

[jira] [Updated] (SPARK-791) [pyspark] operator.getattr not serialized

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-791: - Affects Version/s: 1.0.0 Fix Version/s: 1.0.3 0.9.3

[jira] [Updated] (SPARK-791) [pyspark] operator.getattr not serialized

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-791: - Target Version/s: 1.0.2 [pyspark] operator.getattr not serialized

[jira] [Commented] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077945#comment-14077945 ] Josh Rosen commented on SPARK-1630: --- Hi Kalpit, Thanks for sharing your use-case; it

[jira] [Created] (SPARK-2737) ClassCastExceptions when collect()ing JavaRDDs' underlying Scala RDDs

2014-07-29 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-2737: - Summary: ClassCastExceptions when collect()ing JavaRDDs' underlying Scala RDDs Key: SPARK-2737 URL: https://issues.apache.org/jira/browse/SPARK-2737 Project: Spark

[jira] [Commented] (SPARK-2712) Add a small note that mvn package must happen before test

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078536#comment-14078536 ] Josh Rosen commented on SPARK-2712: --- I think that we wouldn't need this if we modified

[jira] [Resolved] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-30 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1630. --- Resolution: Won't Fix Based on some discussion in https://github.com/apache/spark/pull/1551, we've

[jira] [Updated] (SPARK-2736) Ceeate Pyspark RDD from Apache Avro File

2014-07-30 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2736: -- Assignee: Kan Zhang (was: Josh Rosen) Ceeate Pyspark RDD from Apache Avro File

[jira] [Assigned] (SPARK-2736) Ceeate Pyspark RDD from Apache Avro File

2014-07-30 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2736: - Assignee: Josh Rosen Ceeate Pyspark RDD from Apache Avro File

[jira] [Resolved] (SPARK-2024) Add saveAsSequenceFile to PySpark

2014-07-30 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2024. --- Resolution: Fixed Fix Version/s: 1.1.0 Add saveAsSequenceFile to PySpark

[jira] [Created] (SPARK-2764) Simplify process structure of PySpark daemon / worker launching process

2014-07-30 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-2764: - Summary: Simplify process structure of PySpark daemon / worker launching process Key: SPARK-2764 URL: https://issues.apache.org/jira/browse/SPARK-2764 Project: Spark

[jira] [Assigned] (SPARK-2764) Simplify process structure of PySpark daemon / worker launching process

2014-07-30 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2764: - Assignee: Josh Rosen Simplify process structure of PySpark daemon / worker launching process

[jira] [Resolved] (SPARK-2737) ClassCastExceptions when collect()ing JavaRDDs' underlying Scala RDDs

2014-07-30 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2737. --- Resolution: Fixed Fix Version/s: 1.1.0 Target Version/s: (was: 1.1.0)

[jira] [Resolved] (SPARK-2740) In JavaPairRdd, allow user to specify ascending and numPartitions for sortByKey

2014-07-31 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2740. --- Resolution: Fixed Fix Version/s: 1.1.0 Assignee: Rui Li In JavaPairRdd, allow user

[jira] [Comment Edited] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-31 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081616#comment-14081616 ] Josh Rosen edited comment on SPARK-2282 at 7/31/14 10:37 PM: -

[jira] [Updated] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-31 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2282: -- Fix Version/s: 1.1.0 PySpark crashes if too many tasks complete quickly

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-31 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081616#comment-14081616 ] Josh Rosen commented on SPARK-2282: --- Merged the improved fix from

[jira] [Created] (SPARK-2790) PySpark zip() doesn't work properly if RDDs have different serializers

2014-08-01 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-2790: - Summary: PySpark zip() doesn't work properly if RDDs have different serializers Key: SPARK-2790 URL: https://issues.apache.org/jira/browse/SPARK-2790 Project: Spark

[jira] [Resolved] (SPARK-2012) PySpark StatCounter with numpy arrays

2014-08-02 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2012. --- Resolution: Fixed Fix Version/s: 1.1.0 Assignee: Jeremy Freeman PySpark StatCounter

[jira] [Resolved] (SPARK-1740) Pyspark cancellation kills unrelated pyspark workers

2014-08-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1740. --- Resolution: Fixed Fix Version/s: 1.1.0 Pyspark cancellation kills unrelated pyspark workers

[jira] [Resolved] (SPARK-1687) Support NamedTuples in RDDs

2014-08-04 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1687. --- Resolution: Fixed Fix Version/s: 1.1.0 Support NamedTuples in RDDs

[jira] [Assigned] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not

2014-08-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2583: - Assignee: Josh Rosen (was: Kousuke Saruta) ConnectionManager cannot distinguish whether error

[jira] [Resolved] (SPARK-2764) Simplify process structure of PySpark daemon / worker launching process

2014-08-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2764. --- Resolution: Fixed Fix Version/s: 1.1.0 Simplify process structure of PySpark daemon / worker

[jira] [Assigned] (SPARK-2101) Python unit tests fail on Python 2.6 because of lack of unittest.skipIf()

2014-08-07 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2101: - Assignee: Josh Rosen Python unit tests fail on Python 2.6 because of lack of unittest.skipIf()

[jira] [Created] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-08 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-2931: - Summary: getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException Key: SPARK-2931 URL: https://issues.apache.org/jira/browse/SPARK-2931 Project: Spark

[jira] [Commented] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-08 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091252#comment-14091252 ] Josh Rosen commented on SPARK-2931: --- It's pretty quick to set up a local spark-perf that

[jira] [Commented] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-08 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091581#comment-14091581 ] Josh Rosen commented on SPARK-2931: --- This isn't the easiest bug to reproduce. I tried

[jira] [Updated] (SPARK-2931) getAllowedLocalityLevel() throws ArrayIndexOutOfBoundsException

2014-08-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2931: -- Attachment: scala-sort-by-key.err @ [~kayousterhout]: I can see how that code in {{executorLost()}}

[jira] [Updated] (SPARK-2948) PySpark doesn't work on Python 2.6

2014-08-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2948: -- Affects Version/s: (was: 1.0.2) 1.1.0 PySpark doesn't work on Python 2.6

[jira] [Created] (SPARK-2951) SerDeUtils.pythonToPairRDD fails on RDDs of pickled array.arrays in Python 2.6

2014-08-09 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-2951: - Summary: SerDeUtils.pythonToPairRDD fails on RDDs of pickled array.arrays in Python 2.6 Key: SPARK-2951 URL: https://issues.apache.org/jira/browse/SPARK-2951 Project:

[jira] [Commented] (SPARK-2871) Missing API in PySpark

2014-08-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091986#comment-14091986 ] Josh Rosen commented on SPARK-2871: --- There's actually an open PR for this that's

[jira] [Created] (SPARK-2954) PySpark MLlib serialization tests fail on Python 2.6

2014-08-10 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-2954: - Summary: PySpark MLlib serialization tests fail on Python 2.6 Key: SPARK-2954 URL: https://issues.apache.org/jira/browse/SPARK-2954 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-2954) PySpark MLlib serialization tests fail on Python 2.6

2014-08-10 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2954: - Assignee: Josh Rosen PySpark MLlib serialization tests fail on Python 2.6

[jira] [Updated] (SPARK-2954) PySpark MLlib serialization tests fail on Python 2.6

2014-08-10 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2954: -- Component/s: PySpark PySpark MLlib serialization tests fail on Python 2.6

[jira] [Assigned] (SPARK-2948) PySpark doesn't work on Python 2.6

2014-08-10 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2948: - Assignee: Josh Rosen PySpark doesn't work on Python 2.6 --

[jira] [Assigned] (SPARK-2910) Test with Python 2.6 on Jenkins

2014-08-10 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2910: - Assignee: Josh Rosen Test with Python 2.6 on Jenkins ---

[jira] [Resolved] (SPARK-2898) Failed to connect to daemon

2014-08-10 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2898. --- Resolution: Fixed Fix Version/s: 1.1.0 Failed to connect to daemon

[jira] [Created] (SPARK-2974) Utils.getLocalDir() may return non-existent spark.local.dir directory

2014-08-11 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-2974: - Summary: Utils.getLocalDir() may return non-existent spark.local.dir directory Key: SPARK-2974 URL: https://issues.apache.org/jira/browse/SPARK-2974 Project: Spark

[jira] [Resolved] (SPARK-2948) PySpark doesn't work on Python 2.6

2014-08-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2948. --- Resolution: Fixed Fix Version/s: 1.1.0 PySpark doesn't work on Python 2.6

[jira] [Resolved] (SPARK-2954) PySpark MLlib serialization tests fail on Python 2.6

2014-08-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2954. --- Resolution: Fixed Fix Version/s: 1.1.0 PySpark MLlib serialization tests fail on Python 2.6

[jira] [Created] (SPARK-2977) Fix handling of short shuffle manager names in ShuffleBlockManager

2014-08-11 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-2977: - Summary: Fix handling of short shuffle manager names in ShuffleBlockManager Key: SPARK-2977 URL: https://issues.apache.org/jira/browse/SPARK-2977 Project: Spark

[jira] [Resolved] (SPARK-2101) Python unit tests fail on Python 2.6 because of lack of unittest.skipIf()

2014-08-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2101. --- Resolution: Fixed Fix Version/s: 1.1.0 Python unit tests fail on Python 2.6 because of lack

[jira] [Updated] (SPARK-2975) SPARK_LOCAL_DIRS may cause problems when running in local mode

2014-08-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2975: -- Priority: Critical (was: Minor) I'm raising the priority of this issue to 'critical', since it causes

[jira] [Updated] (SPARK-2977) Fix handling of short shuffle manager names in ShuffleBlockManager

2014-08-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2977: -- Assignee: (was: Josh Rosen) Fix handling of short shuffle manager names in ShuffleBlockManager

[jira] [Assigned] (SPARK-2977) Fix handling of short shuffle manager names in ShuffleBlockManager

2014-08-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2977: - Assignee: Josh Rosen Fix handling of short shuffle manager names in ShuffleBlockManager

[jira] [Commented] (SPARK-922) Update Spark AMI to Python 2.7

2014-08-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098824#comment-14098824 ] Josh Rosen commented on SPARK-922: -- Updated script, which also updates numpy: {code} yum

[jira] [Comment Edited] (SPARK-922) Update Spark AMI to Python 2.7

2014-08-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098824#comment-14098824 ] Josh Rosen edited comment on SPARK-922 at 8/15/14 6:05 PM: ---

[jira] [Comment Edited] (SPARK-922) Update Spark AMI to Python 2.7

2014-08-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098824#comment-14098824 ] Josh Rosen edited comment on SPARK-922 at 8/15/14 6:10 PM: ---

[jira] [Resolved] (SPARK-2110) Misleading help displayed for interactive mode pyspark --help

2014-08-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2110. --- Resolution: Fixed Fix Version/s: 1.1.0 I think this was fixed by SPARK-2678: these options

[jira] [Resolved] (SPARK-2911) provide rdd.parent[T](j) to obtain jth parent of rdd

2014-08-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2911. --- Resolution: Fixed Fix Version/s: 1.2.0 Assignee: Erik Erlandson Marking as 'fixed'

[jira] [Resolved] (SPARK-2717) BasicBlockFetchIterator#next should log when it gets stuck

2014-08-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2717. --- Resolution: Won't Fix This is subsumed by the patch that adds timeouts to BasicBlockFetchIterator.

[jira] [Updated] (SPARK-1477) Add the lifecycle interface

2014-08-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1477: -- Target Version/s: 1.2.0 (was: 1.1.0) Retargeting this to 1.2.0. Add the lifecycle interface

[jira] [Commented] (SPARK-922) Update Spark AMI to Python 2.7

2014-08-15 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099089#comment-14099089 ] Josh Rosen commented on SPARK-922: -- Yeah, you still need to set PYSPARK_PYTHON since this

[jira] [Resolved] (SPARK-2677) BasicBlockFetchIterator#next can wait forever

2014-08-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2677. --- Resolution: Fixed Fix Version/s: 1.1.0 Assignee: Kousuke Saruta (was: Josh Rosen)

[jira] [Resolved] (SPARK-3035) Wrong example with SparkContext.addFile

2014-08-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3035. --- Resolution: Fixed Fix Version/s: (was: 1.0.2) 1.1.0 Wrong example

[jira] [Resolved] (SPARK-2325) Utils.getLocalDir had better check the directory and choose a good one instead of choosing the first one directly

2014-08-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2325. --- Resolution: Duplicate Resolving as duplicate of SPARK-2974, which I'm working on fixing.

[jira] [Commented] (SPARK-2975) SPARK_LOCAL_DIRS may cause problems when running in local mode

2014-08-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099841#comment-14099841 ] Josh Rosen commented on SPARK-2975: --- The driver's configuration properties seem to be

[jira] [Comment Edited] (SPARK-2975) SPARK_LOCAL_DIRS may cause problems when running in local mode

2014-08-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099841#comment-14099841 ] Josh Rosen edited comment on SPARK-2975 at 8/17/14 12:31 AM: -

[jira] [Created] (SPARK-3102) Add tests for yarn-client mode

2014-08-18 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3102: - Summary: Add tests for yarn-client mode Key: SPARK-3102 URL: https://issues.apache.org/jira/browse/SPARK-3102 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-3103) Fix UTF8 encoding in PySpark saveAsTextFile().

2014-08-18 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3103: - Summary: Fix UTF8 encoding in PySpark saveAsTextFile(). Key: SPARK-3103 URL: https://issues.apache.org/jira/browse/SPARK-3103 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-3104) Jenkins failing to test some PRs when asked to

2014-08-18 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100930#comment-14100930 ] Josh Rosen commented on SPARK-3104: --- Jenkins only listens to commands from users in the

[jira] [Created] (SPARK-3105) Calling cache() after RDDs are pipelined has no effect in PySpark

2014-08-18 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3105: - Summary: Calling cache() after RDDs are pipelined has no effect in PySpark Key: SPARK-3105 URL: https://issues.apache.org/jira/browse/SPARK-3105 Project: Spark

[jira] [Resolved] (SPARK-3103) Fix UTF8 encoding in PySpark saveAsTextFile().

2014-08-18 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3103. --- Resolution: Fixed Fix Version/s: 1.1.0 Assignee: Davies Liu Fix UTF8 encoding in

[jira] [Created] (SPARK-3114) Python UDFS broken in Spark SQL

2014-08-18 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-3114: - Summary: Python UDFS broken in Spark SQL Key: SPARK-3114 URL: https://issues.apache.org/jira/browse/SPARK-3114 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-3114) Python UDFS broken in Spark SQL

2014-08-18 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3114. --- Resolution: Fixed Fix Version/s: 1.1.0 Python UDFS broken in Spark SQL
