[jira] [Created] (SPARK-5857) pyspark PYTHONPATH not properly set up?

2015-02-16 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-5857: -- Summary: pyspark PYTHONPATH not properly set up? Key: SPARK-5857 URL: https://issues.apache.org/jira/browse/SPARK-5857 Project: Spark Issue Type: Bug C

[jira] [Commented] (SPARK-5856) In Maven build script, launch Zinc with more memory

2015-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323780#comment-14323780 ] Apache Spark commented on SPARK-5856: - User 'pwendell' has created a pull request for

[jira] [Created] (SPARK-5856) In Maven build script, launch Zinc with more memory

2015-02-16 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-5856: -- Summary: In Maven build script, launch Zinc with more memory Key: SPARK-5856 URL: https://issues.apache.org/jira/browse/SPARK-5856 Project: Spark Issue T

[jira] [Commented] (SPARK-1823) ExternalAppendOnlyMap can still OOM if one key is very large

2015-02-16 Thread Mingyu Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323768#comment-14323768 ] Mingyu Kim commented on SPARK-1823: --- [~andrewor14], is anyone working on this actively?

[jira] [Commented] (SPARK-5854) Implement Personalized PageRank with GraphX

2015-02-16 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323762#comment-14323762 ] Takeshi Yamamuro commented on SPARK-5854: - Looks good, and your PR is welcome :) > Im

[jira] [Created] (SPARK-5855) [Spark SQL] 'explain' command in SparkSQL don't support to analyze the DDL 'VIEW'

2015-02-16 Thread Yi Zhou (JIRA)
Yi Zhou created SPARK-5855: -- Summary: [Spark SQL] 'explain' command in SparkSQL don't support to analyze the DDL 'VIEW' Key: SPARK-5855 URL: https://issues.apache.org/jira/browse/SPARK-5855 Project: Spark

[jira] [Resolved] (SPARK-5802) Cache scaled data in GLM

2015-02-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5802. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4593 [https://githu

[jira] [Commented] (SPARK-5005) Failed to start spark-shell when using yarn-client mode with the Spark1.2.0

2015-02-16 Thread anuj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323684#comment-14323684 ] anuj commented on SPARK-5005: - I am using spark-1.2.1 and running in yarn mode > Failed to s

[jira] [Commented] (SPARK-5005) Failed to start spark-shell when using yarn-client mode with the Spark1.2.0

2015-02-16 Thread anuj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323683#comment-14323683 ] anuj commented on SPARK-5005: - I am having the same issue. @yangping wu what is the resolution fo

[jira] [Commented] (SPARK-5166) Stabilize Spark SQL APIs

2015-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323642#comment-14323642 ] Apache Spark commented on SPARK-5166: - User 'marmbrus' has created a pull request for

[jira] [Commented] (SPARK-5247) Enable javadoc/scaladoc for public classes in catalyst project

2015-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323643#comment-14323643 ] Apache Spark commented on SPARK-5247: - User 'marmbrus' has created a pull request for

[jira] [Commented] (SPARK-5258) Clean up exposed classes in sql.hive package

2015-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323644#comment-14323644 ] Apache Spark commented on SPARK-5258: - User 'marmbrus' has created a pull request for

[jira] [Commented] (SPARK-5722) Infer_schema_type incorrect for Integers in pyspark

2015-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323641#comment-14323641 ] Apache Spark commented on SPARK-5722: - User 'dondrake' has created a pull request for

[jira] [Resolved] (SPARK-5853) Schema support in Row

2015-02-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-5853. Resolution: Fixed Fix Version/s: 1.3.0 > Schema support in Row > - > >

[jira] [Resolved] (SPARK-5363) Spark 1.2 freeze without error notification

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-5363. --- Resolution: Fixed Fix Version/s: 1.2.2 1.3.0 I've merged Davies' patch for 1

[jira] [Resolved] (SPARK-5395) Large number of Python workers causing resource depletion

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-5395. --- Resolution: Fixed Fix Version/s: 1.2.2 I've merged this into `branch-1.2` (1.2.2), so I'm marki

[jira] [Updated] (SPARK-5395) Large number of Python workers causing resource depletion

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5395: -- Labels: (was: backport-needed) > Large number of Python workers causing resource depletion > -

[jira] [Updated] (SPARK-5081) Shuffle write increases

2015-02-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5081: --- Priority: Critical (was: Major) > Shuffle write increases > --- > >

[jira] [Commented] (SPARK-5081) Shuffle write increases

2015-02-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323629#comment-14323629 ] Patrick Wendell commented on SPARK-5081: Hey [~cb_betz], can you verify a few thin

[jira] [Commented] (SPARK-5829) JavaStreamingContext.fileStream run task loop repeated empty when no more new files found

2015-02-16 Thread Littlestar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323619#comment-14323619 ] Littlestar commented on SPARK-5829: --- But it depends on "org.apache.spark.streaming.Time"

[jira] [Updated] (SPARK-5854) Implement Personalized PageRank with GraphX

2015-02-16 Thread Baoxu Shi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Baoxu Shi updated SPARK-5854: - Remaining Estimate: 48h (was: 24h) Original Estimate: 48h (was: 24h) > Implement Personalized PageR

[jira] [Created] (SPARK-5854) Implement Personalized PageRank with GraphX

2015-02-16 Thread Baoxu Shi (JIRA)
Baoxu Shi created SPARK-5854: Summary: Implement Personalized PageRank with GraphX Key: SPARK-5854 URL: https://issues.apache.org/jira/browse/SPARK-5854 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-5852) CTAS fails when converting Hive write path to data source parquet write path

2015-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323596#comment-14323596 ] Apache Spark commented on SPARK-5852: - User 'chenghao-intel' has created a pull reques

[jira] [Commented] (SPARK-5853) Schema support in Row

2015-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323595#comment-14323595 ] Apache Spark commented on SPARK-5853: - User 'rxin' has created a pull request for this

[jira] [Updated] (SPARK-5853) Schema support in Row

2015-02-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5853: --- Summary: Schema support in Row (was: Optional schema support in Row) > Schema support in Row > --

[jira] [Updated] (SPARK-5853) Optional schema support in Row

2015-02-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5853: --- Description: Would be great to add some optional schema support in row, with a schema function that re

[jira] [Created] (SPARK-5853) Optional schema support in Row

2015-02-16 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-5853: -- Summary: Optional schema support in Row Key: SPARK-5853 URL: https://issues.apache.org/jira/browse/SPARK-5853 Project: Spark Issue Type: Sub-task Compo

[jira] [Updated] (SPARK-5850) Remove experimental label for Scala 2.11 and FlumePollingStream

2015-02-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5850: --- Priority: Blocker (was: Major) > Remove experimental label for Scala 2.11 and FlumePollingStr

[jira] [Closed] (SPARK-3340) Deprecate ADD_JARS and ADD_FILES

2015-02-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-3340. Resolution: Fixed Fix Version/s: 1.3.0 > Deprecate ADD_JARS and ADD_FILES > -

[jira] [Closed] (SPARK-5849) Handle more types of invalid JSON requests in SubmitRestProtocolMessage.parseAction

2015-02-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-5849. Resolution: Fixed Fix Version/s: 1.3.0 > Handle more types of invalid JSON requests in > SubmitRestP

[jira] [Updated] (SPARK-5788) Capture exceptions in Python write thread

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5788: -- Assignee: Davies Liu > Capture exceptions in Python write thread >

[jira] [Resolved] (SPARK-5788) Capture exceptions in Python write thread

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-5788. --- Resolution: Fixed Fix Version/s: 1.2.2 1.3.0 Issue resolved by pull request

[jira] [Updated] (SPARK-5250) EOFException in when reading gzipped files from S3 with wholeTextFiles

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5250: -- Priority: Critical (was: Major) > EOFException in when reading gzipped files from S3 with wholeTextFile

[jira] [Updated] (SPARK-4414) SparkContext.wholeTextFiles Doesn't work with S3 Buckets

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4414: -- Priority: Critical (was: Major) > SparkContext.wholeTextFiles Doesn't work with S3 Buckets > --

[jira] [Updated] (SPARK-5250) EOFException in when reading gzipped files from S3 with wholeTextFiles

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5250: -- Component/s: Spark Core > EOFException in when reading gzipped files from S3 with wholeTextFiles > -

[jira] [Updated] (SPARK-5594) SparkException: Failed to get broadcast (TorrentBroadcast)

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5594: -- Affects Version/s: 1.3.0 > SparkException: Failed to get broadcast (TorrentBroadcast) >

[jira] [Commented] (SPARK-4603) EOF when broadcasting a dict with an empty string value.

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323534#comment-14323534 ] Josh Rosen commented on SPARK-4603: --- I can't reproduce this issue in Spark 1.3.0. Have

[jira] [Commented] (SPARK-5801) Shuffle creates too many nested directories

2015-02-16 Thread Weizhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323531#comment-14323531 ] Weizhong commented on SPARK-5801: - This is because in standalone, worker will create temp

[jira] [Commented] (SPARK-4395) Running a Spark SQL SELECT command from PySpark causes a hang for ~ 1 hour

2015-02-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323532#comment-14323532 ] Davies Liu commented on SPARK-4395: --- [~sameerf] Could you help [~lian cheng] to reproduc

[jira] [Commented] (SPARK-5829) JavaStreamingContext.fileStream run task loop repeated empty when no more new files found

2015-02-16 Thread Littlestar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323529#comment-14323529 ] Littlestar commented on SPARK-5829: --- The following code is the same as saveAsNewAPIHadoopFi

[jira] [Updated] (SPARK-4118) Create python bindings for Streaming KMeans

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4118: -- Issue Type: Sub-task (was: Improvement) Parent: SPARK-3258 > Create python bindings for Streami

[jira] [Updated] (SPARK-4127) Streaming Linear Regression- Python bindings

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4127: -- Issue Type: Sub-task (was: Improvement) Parent: SPARK-3258 > Streaming Linear Regression- Pytho

[jira] [Commented] (SPARK-4395) Running a Spark SQL SELECT command from PySpark causes a hang for ~ 1 hour

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323523#comment-14323523 ] Josh Rosen commented on SPARK-4395: --- Should we resolve this as "Can't Reproduce"? > Run

[jira] [Commented] (SPARK-3524) remove workaround to pickle array of float for Pyrolite

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323520#comment-14323520 ] Josh Rosen commented on SPARK-3524: --- Actually, it turns out that Pyrolite isn't publishe

[jira] [Updated] (SPARK-3524) remove workaround to pickle array of float for Pyrolite

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3524: -- Assignee: (was: Josh Rosen) > remove workaround to pickle array of float for Pyrolite >

[jira] [Updated] (SPARK-3524) remove workaround to pickle array of float for Pyrolite

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3524: -- Assignee: Josh Rosen (was: Davies Liu) It looks like an updated version of Pyrolite has been released,

[jira] [Resolved] (SPARK-1670) PySpark Fails to Create SparkContext Due To Debugging Options in conf/java-opts

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1670. --- Resolution: Fixed Fix Version/s: 1.3.0 The root cause, SPARK-2313, was fixed for 1.3, so I'm go

[jira] [Resolved] (SPARK-5848) ConsoleProgressBar timer thread leaks SparkContext

2015-02-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5848. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4635 [https://github.com/ap

[jira] [Updated] (SPARK-5848) ConsoleProgressBar timer thread leaks SparkContext

2015-02-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5848: - Assignee: Matt Whelan > ConsoleProgressBar timer thread leaks SparkContext > -

[jira] [Commented] (SPARK-3842) Remove the hacks for Python callback server in py4j

2015-02-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323513#comment-14323513 ] Davies Liu commented on SPARK-3842: --- Only part of the patches were accepted by upstream,

[jira] [Commented] (SPARK-3842) Remove the hacks for Python callback server in py4j

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323512#comment-14323512 ] Josh Rosen commented on SPARK-3842: --- It looks like this issue is still relevant as of Fe

[jira] [Resolved] (SPARK-5361) Multiple Java RDD <-> Python RDD conversions not working correctly

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-5361. --- Resolution: Fixed Fix Version/s: 1.2.2 I've cherry-picked the fix into `branch-1.2` (1.2.2), so

[jira] [Resolved] (SPARK-2697) Source Scala and Python shell banners from a single place

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2697. --- Resolution: Won't Fix Now that we source the version from SparkContext / the Spark package object, I

[jira] [Resolved] (SPARK-5047) Add Kafka to the Python Streaming API

2015-02-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-5047. --- Resolution: Fixed Fix Version/s: 1.3.0 Target Version/s: 1.3.0 (was: 1.3.0, 1.2.1) h

[jira] [Resolved] (SPARK-2876) RDD.partitionBy loads entire partition into memory

2015-02-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-2876. --- Resolution: Fixed Fix Version/s: 1.1.0 > RDD.partitionBy loads entire partition into memory > -

[jira] [Commented] (SPARK-2876) RDD.partitionBy loads entire partition into memory

2015-02-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323492#comment-14323492 ] Davies Liu commented on SPARK-2876: --- [~joshrosen] BatchedSerializer is still needed in s

[jira] [Resolved] (SPARK-5441) SerDeUtil Pair RDD to python conversion doesn't accept empty RDDs

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-5441. --- Resolution: Fixed Fix Version/s: 1.2.2 I've cherry-picked this to `branch-1.2` (1.2.2), so I'm

[jira] [Commented] (SPARK-3316) Spark driver will not exit after python program finished

2015-02-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323487#comment-14323487 ] Davies Liu commented on SPARK-3316: --- I remembered that this is fixed in SPARK-4415, we c

[jira] [Resolved] (SPARK-3316) Spark driver will not exit after python program finished

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3316. --- Resolution: Cannot Reproduce I'm going to resolve this as "Cannot Reproduce" because it's really old a

[jira] [Commented] (SPARK-5723) Change the default file format to Parquet for CTAS statements.

2015-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323484#comment-14323484 ] Apache Spark commented on SPARK-5723: - User 'yhuai' has created a pull request for thi

[jira] [Commented] (SPARK-2876) RDD.partitionBy loads entire partition into memory

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323481#comment-14323481 ] Josh Rosen commented on SPARK-2876: --- [~davies], I'm going through old PySpark issues and

[jira] [Commented] (SPARK-5810) Maven Coordinate Inclusion failing in pySpark

2015-02-16 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323474#comment-14323474 ] Burak Yavuz commented on SPARK-5810: Makes sense to add a regression test. I'll add it

[jira] [Commented] (SPARK-1418) Python MLlib's _get_unmangled_rdd should uncache RDDs when training is done

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323473#comment-14323473 ] Josh Rosen commented on SPARK-1418: --- This issue's description is now a little confusing,

[jira] [Commented] (SPARK-5852) CTAS fails when converting Hive write path to data source parquet write path

2015-02-16 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323471#comment-14323471 ] Yin Huai commented on SPARK-5852: - The following CTAS statement directly using the data so

[jira] [Created] (SPARK-5852) CTAS fails when converting Hive write path to data source parquet write path

2015-02-16 Thread Yin Huai (JIRA)
Yin Huai created SPARK-5852: --- Summary: CTAS fails when converting Hive write path to data source parquet write path Key: SPARK-5852 URL: https://issues.apache.org/jira/browse/SPARK-5852 Project: Spark

[jira] [Resolved] (SPARK-5026) PySpark rdd.randomSpit() is not documented

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-5026. --- Resolution: Fixed Fix Version/s: 1.2.1 It looks like this is no longer an issue in 1.2.1, so I'

[jira] [Resolved] (SPARK-655) Implement co-partitioning aware joins in PySpark

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-655. -- Resolution: Duplicate > Implement co-partitioning aware joins in PySpark > --

[jira] [Resolved] (SPARK-5296) Predicate Pushdown (BaseRelation) to have an interface that will accept more filters

2015-02-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5296. - Resolution: Fixed Fix Version/s: 1.3.0 > Predicate Pushdown (BaseRelation) to have

[jira] [Commented] (SPARK-5810) Maven Coordinate Inclusion failing in pySpark

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323465#comment-14323465 ] Josh Rosen commented on SPARK-5810: --- I think that this should be fixed now that my patch

[jira] [Resolved] (SPARK-5839) HiveMetastoreCatalog does not recognize table names and aliases of data source tables.

2015-02-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5839. - Resolution: Fixed Fix Version/s: 1.3.0 > HiveMetastoreCatalog does not recognize ta

[jira] [Updated] (SPARK-5839) HiveMetastoreCatalog does not recognize table names and aliases of data source tables.

2015-02-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5839: Assignee: Yin Huai > HiveMetastoreCatalog does not recognize table names and aliases of data

[jira] [Updated] (SPARK-5519) Add user guide for FP-Growth

2015-02-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5519: - Assignee: Xiangrui Meng (was: Jacky Li) > Add user guide for FP-Growth >

[jira] [Updated] (SPARK-5283) ML sharedParams should be public

2015-02-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5283: - Target Version/s: 1.4.0 (was: 1.3.0) > ML sharedParams should be public > ---

[jira] [Updated] (SPARK-5114) Should Evaluator be a PipelineStage

2015-02-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5114: - Target Version/s: 1.4.0 (was: 1.3.0) > Should Evaluator be a PipelineStage >

[jira] [Resolved] (SPARK-4865) Include temporary tables in SHOW TABLES

2015-02-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4865. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4618 [https:/

[jira] [Commented] (SPARK-5016) GaussianMixtureEM should distribute matrix inverse for large numFeatures, k

2015-02-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323460#comment-14323460 ] Joseph K. Bradley commented on SPARK-5016: -- Your back-of-the-envelope calculation

[jira] [Resolved] (SPARK-5746) Check invalid cases for the write path of data source API

2015-02-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5746. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4617 [https:/

[jira] [Updated] (SPARK-1600) flaky "recovery with file input stream" test in streaming.CheckpointSuite

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1600: -- Target Version/s: 1.3.0, 1.2.2 (was: 1.3.0) I've also merged this into branch-1.2. > flaky "recovery w

[jira] [Updated] (SPARK-1600) flaky "recovery with file input stream" test in streaming.CheckpointSuite

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1600: -- Fix Version/s: 1.2.2 > flaky "recovery with file input stream" test in streaming.CheckpointSuite > -

[jira] [Assigned] (SPARK-1600) flaky "recovery with file input stream" test in streaming.CheckpointSuite

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-1600: - Assignee: Josh Rosen > flaky "recovery with file input stream" test in streaming.CheckpointSuite

[jira] [Resolved] (SPARK-2313) PySpark should accept port via a command line argument rather than STDIN

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2313. --- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4603 [https://github.com/

[jira] [Assigned] (SPARK-2313) PySpark should accept port via a command line argument rather than STDIN

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2313: - Assignee: Josh Rosen > PySpark should accept port via a command line argument rather than STDIN >

[jira] [Created] (SPARK-5851) spark_ec2.py ssh failure retry handling not always appropriate

2015-02-16 Thread Florian Verhein (JIRA)
Florian Verhein created SPARK-5851: -- Summary: spark_ec2.py ssh failure retry handling not always appropriate Key: SPARK-5851 URL: https://issues.apache.org/jira/browse/SPARK-5851 Project: Spark

[jira] [Updated] (SPARK-5695) Check GBT caching logic

2015-02-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5695: - Assignee: Joseph K. Bradley (was: Xiangrui Meng) > Check GBT caching logic >

[jira] [Resolved] (SPARK-5358) spark.files.userClassPathFirst doesn't work correctly

2015-02-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5358. -- Resolution: Duplicate OK as I understand this was covered by SPARK-2996 > spark.files.userClassPathFirs

[jira] [Updated] (SPARK-5357) Upgrade from commons-codec 1.5

2015-02-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5357: - Priority: Minor (was: Major) Assignee: Matt Whelan > Upgrade from commons-codec 1.5 > ---

[jira] [Resolved] (SPARK-5357) Upgrade from commons-codec 1.5

2015-02-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5357. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4153 [https://github.com/ap

[jira] [Commented] (SPARK-5850) Remove experimental label for Scala 2.11 and FlumePollingStream

2015-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323430#comment-14323430 ] Apache Spark commented on SPARK-5850: - User 'pwendell' has created a pull request for

[jira] [Created] (SPARK-5850) Clean up experimental label for Scala 2.11 and FlumePollingStream

2015-02-16 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-5850: -- Summary: Clean up experimental label for Scala 2.11 and FlumePollingStream Key: SPARK-5850 URL: https://issues.apache.org/jira/browse/SPARK-5850 Project: Spark

[jira] [Updated] (SPARK-5850) Remove experimental label for Scala 2.11 and FlumePollingStream

2015-02-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5850: --- Summary: Remove experimental label for Scala 2.11 and FlumePollingStream (was: Clean up exper

[jira] [Updated] (SPARK-5849) Handle more types of invalid JSON requests in SubmitRestProtocolMessage.parseAction

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5849: -- Summary: Handle more types of invalid JSON requests in SubmitRestProtocolMessage.parseAction (was: Hand

[jira] [Updated] (SPARK-5849) Handle more types of invalid JSON in SubmitRestProtocolMessage.parseAction

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5849: -- Summary: Handle more types of invalid JSON in SubmitRestProtocolMessage.parseAction (was: SubmitRestPro

[jira] [Commented] (SPARK-5849) Handle more types of invalid JSON in SubmitRestProtocolMessage.parseAction

2015-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323421#comment-14323421 ] Apache Spark commented on SPARK-5849: - User 'JoshRosen' has created a pull request for

[jira] [Updated] (SPARK-5841) Memory leak in DiskBlockManager

2015-02-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5841: - Assignee: Matt Whelan > Memory leak in DiskBlockManager > --- > >

[jira] [Resolved] (SPARK-5841) Memory leak in DiskBlockManager

2015-02-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5841. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4627 [https://github.com/ap

[jira] [Created] (SPARK-5849) SubmitRestProtocolMessage should handle invalid requests that parse as JSON strings

2015-02-16 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-5849: - Summary: SubmitRestProtocolMessage should handle invalid requests that parse as JSON strings Key: SPARK-5849 URL: https://issues.apache.org/jira/browse/SPARK-5849 Project:

[jira] [Updated] (SPARK-5849) SubmitRestProtocolMessage should handle invalid requests that parse as JSON strings

2015-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5849: -- Affects Version/s: (was: 1.30) 1.3.0 > SubmitRestProtocolMessage should handl

[jira] [Commented] (SPARK-5016) GaussianMixtureEM should distribute matrix inverse for large numFeatures, k

2015-02-16 Thread Travis Galoppo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323412#comment-14323412 ] Travis Galoppo commented on SPARK-5016: --- Realistically, I think it will be very diff

[jira] [Updated] (SPARK-5846) Spark SQL does not correctly set job description and scheduler pool

2015-02-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5846: --- Priority: Critical (was: Major) > Spark SQL does not correctly set job description and schedu

[jira] [Updated] (SPARK-5848) ConsoleProgressBar timer thread leaks SparkContext

2015-02-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5848: --- Component/s: (was: Web UI) Spark Shell > ConsoleProgressBar timer thread
