[jira] [Created] (SPARK-5977) PySpark SPARK_CLASSPATH doesn't distribute jars to executors

2015-02-24 Thread Michael Nazario (JIRA)
Michael Nazario created SPARK-5977: -- Summary: PySpark SPARK_CLASSPATH doesn't distribute jars to executors Key: SPARK-5977 URL: https://issues.apache.org/jira/browse/SPARK-5977 Project: Spark

[jira] [Commented] (SPARK-5976) Factors returned by ALS do not have partitioners associated.

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335464#comment-14335464 ] Apache Spark commented on SPARK-5976: - User 'mengxr' has created a pull request for

[jira] [Commented] (SPARK-5952) Failure to lock metastore client in tableExists()

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335316#comment-14335316 ] Apache Spark commented on SPARK-5952: - User 'marmbrus' has created a pull request for

[jira] [Created] (SPARK-5981) pyspark ML models fail during predict/transform on vector within map

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5981: Summary: pyspark ML models fail during predict/transform on vector within map Key: SPARK-5981 URL: https://issues.apache.org/jira/browse/SPARK-5981 Project:

[jira] [Updated] (SPARK-5975) SparkSubmit --jars not present on driver

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5975: - Target Version/s: 1.4.0 (was: 1.3.0) SparkSubmit --jars not present on driver

[jira] [Created] (SPARK-5979) `--packages` should not exclude spark streaming assembly jars for kafka and flume

2015-02-24 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-5979: -- Summary: `--packages` should not exclude spark streaming assembly jars for kafka and flume Key: SPARK-5979 URL: https://issues.apache.org/jira/browse/SPARK-5979

[jira] [Comment Edited] (SPARK-5140) Two RDDs which are scheduled concurrently should be able to wait on parent in all cases

2015-02-24 Thread Corey J. Nolet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335518#comment-14335518 ] Corey J. Nolet edited comment on SPARK-5140 at 2/24/15 9:50 PM:

[jira] [Updated] (SPARK-5973) zip two rdd with AutoBatchedSerializer will fail

2015-02-24 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-5973: -- Description: zip two rdd with AutoBatchedSerializer will fail, this bug was introduced by SPARK-4841

[jira] [Updated] (SPARK-3665) Java API for GraphX

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3665: - Affects Version/s: 1.0.0 Java API for GraphX --- Key: SPARK-3665

[jira] [Resolved] (SPARK-5952) Failure to lock metastore client in tableExists()

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5952. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4746

[jira] [Created] (SPARK-5978) Spark examples cannot compile with Hadoop 2

2015-02-24 Thread Michael Nazario (JIRA)
Michael Nazario created SPARK-5978: -- Summary: Spark examples cannot compile with Hadoop 2 Key: SPARK-5978 URL: https://issues.apache.org/jira/browse/SPARK-5978 Project: Spark Issue Type:

[jira] [Commented] (SPARK-5973) zip two rdd with AutoBatchedSerializer will fail

2015-02-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335473#comment-14335473 ] Josh Rosen commented on SPARK-5973: --- Just for searchability's sake, what error message

[jira] [Created] (SPARK-5974) Add save/load to examples in ML guide

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5974: Summary: Add save/load to examples in ML guide Key: SPARK-5974 URL: https://issues.apache.org/jira/browse/SPARK-5974 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-5708) Add Slf4jSink to Spark Metrics Sink

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5708. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4644

[jira] [Updated] (SPARK-3665) Java API for GraphX

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3665: - Target Version/s: 1.4.0 (was: 1.3.0) Java API for GraphX --- Key:

[jira] [Commented] (SPARK-5978) Spark examples cannot compile with Hadoop 2

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335461#comment-14335461 ] Sean Owen commented on SPARK-5978: -- Pretty similar to

[jira] [Created] (SPARK-5980) Add GradientBoostedTrees Python examples to ML guide

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5980: Summary: Add GradientBoostedTrees Python examples to ML guide Key: SPARK-5980 URL: https://issues.apache.org/jira/browse/SPARK-5980 Project: Spark

[jira] [Updated] (SPARK-5971) Add Mesos support to spark-ec2

2015-02-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5971: Description: Right now, spark-ec2 can only launch Spark clusters that use the standalone

[jira] [Created] (SPARK-5982) Remove Local Read Time

2015-02-24 Thread Kay Ousterhout (JIRA)
Kay Ousterhout created SPARK-5982: - Summary: Remove Local Read Time Key: SPARK-5982 URL: https://issues.apache.org/jira/browse/SPARK-5982 Project: Spark Issue Type: Bug Reporter:

[jira] [Created] (SPARK-5976) Factors returned by ALS do not have partitioners associated.

2015-02-24 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5976: Summary: Factors returned by ALS do not have partitioners associated. Key: SPARK-5976 URL: https://issues.apache.org/jira/browse/SPARK-5976 Project: Spark

[jira] [Commented] (SPARK-5801) Shuffle creates too many nested directories

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335328#comment-14335328 ] Apache Spark commented on SPARK-5801: - User 'vanzin' has created a pull request for

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335352#comment-14335352 ] Reynold Xin commented on SPARK-5124: BTW great points about the life cycle and thread

[jira] [Created] (SPARK-5975) SparkSubmit --jars not present on driver

2015-02-24 Thread Andrew Or (JIRA)
Andrew Or created SPARK-5975: Summary: SparkSubmit --jars not present on driver Key: SPARK-5975 URL: https://issues.apache.org/jira/browse/SPARK-5975 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-5708) Add Slf4jSink to Spark Metrics Sink

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5708: - Assignee: Judy Nash Add Slf4jSink to Spark Metrics Sink ---

[jira] [Commented] (SPARK-2336) Approximate k-NN Models for MLLib

2015-02-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335476#comment-14335476 ] Xiangrui Meng commented on SPARK-2336: -- [~Rusty] Could you provide a summary of your

[jira] [Commented] (SPARK-5140) Two RDDs which are scheduled concurrently should be able to wait on parent in all cases

2015-02-24 Thread Corey J. Nolet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335518#comment-14335518 ] Corey J. Nolet commented on SPARK-5140: --- I have a framework (similar to cascading)

[jira] [Created] (SPARK-5969) The pyspark.rdd.sortByKey always fills only two partitions when ascending=False.

2015-02-24 Thread Milan Straka (JIRA)
Milan Straka created SPARK-5969: --- Summary: The pyspark.rdd.sortByKey always fills only two partitions when ascending=False. Key: SPARK-5969 URL: https://issues.apache.org/jira/browse/SPARK-5969

[jira] [Updated] (SPARK-5914) Enable spark-submit to run requiring only user permission on windows

2015-02-24 Thread Judy Nash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Judy Nash updated SPARK-5914: - Summary: Enable spark-submit to run requiring only user permission on windows (was: Enable spark-submit

[jira] [Commented] (SPARK-1867) Spark Documentation Error causes java.lang.IllegalStateException: unread block data

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334714#comment-14334714 ] Sean Owen commented on SPARK-1867: -- I think the status is: for CDH users, I think this is

[jira] [Updated] (SPARK-5952) Failure to lock metastore client in tableExists()

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5952: Component/s: SQL Failure to lock metastore client in tableExists()

[jira] [Commented] (SPARK-5967) JobProgressListener.stageIdToActiveJobIds never cleared

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334729#comment-14334729 ] Sean Owen commented on SPARK-5967: -- This might be a duplicate of this issue:

[jira] [Commented] (SPARK-1867) Spark Documentation Error causes java.lang.IllegalStateException: unread block data

2015-02-24 Thread Philippe Girolami (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334727#comment-14334727 ] Philippe Girolami commented on SPARK-1867: -- [~srowen] I'm only getting this issue

[jira] [Commented] (SPARK-3850) Scala style: disallow trailing spaces

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334737#comment-14334737 ] Sean Owen commented on SPARK-3850: -- Yeah, does anyone know how to do that? this one seems

[jira] [Commented] (SPARK-5914) Enable spark-submit to run requiring only user permission on windows

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334599#comment-14334599 ] Apache Spark commented on SPARK-5914: - User 'judynash' has created a pull request for

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334755#comment-14334755 ] Sean Owen commented on SPARK-5837: -- That should be the YARN Application Master for your

[jira] [Commented] (SPARK-794) Remove sleep() in ClusterScheduler.stop

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334765#comment-14334765 ] Sean Owen commented on SPARK-794: - [~joshrosen] would it be OK if I tried my hand at

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-24 Thread Rok Roskar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334760#comment-14334760 ] Rok Roskar commented on SPARK-5837: --- yes exactly -- that is the original issue: it

[jira] [Commented] (SPARK-5830) Don't create unnecessary directory for local root dir

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334773#comment-14334773 ] Sean Owen commented on SPARK-5830: -- So, the 'too many nested dirs' issue is already

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-24 Thread Rok Roskar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334752#comment-14334752 ] Rok Roskar commented on SPARK-5837: --- still trying to debug this: in the application

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-24 Thread Rok Roskar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334766#comment-14334766 ] Rok Roskar commented on SPARK-5837: --- The spark UI definitely does start, at least

[jira] [Commented] (SPARK-5970) Temporary directories are not removed (but their content is)

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334775#comment-14334775 ] Sean Owen commented on SPARK-5970: -- I believe that's right. Would you like to open a PR?

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334764#comment-14334764 ] Sean Owen commented on SPARK-5837: -- OK good, that's clear. I think it's narrowed down

[jira] [Assigned] (SPARK-5967) JobProgressListener.stageIdToActiveJobIds never cleared

2015-02-24 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das reassigned SPARK-5967: Assignee: Tathagata Das (was: Josh Rosen) JobProgressListener.stageIdToActiveJobIds

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334771#comment-14334771 ] Sean Owen commented on SPARK-5837: -- Ah, it looks like it has to do with running on

[jira] [Commented] (SPARK-5967) JobProgressListener.stageIdToActiveJobIds never cleared

2015-02-24 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334839#comment-14334839 ] Tathagata Das commented on SPARK-5967: -- They are similar but not the same. Those were

[jira] [Commented] (SPARK-5967) JobProgressListener.stageIdToActiveJobIds never cleared

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334843#comment-14334843 ] Sean Owen commented on SPARK-5967: -- [~tdas] can one or both of those be resolved as

[jira] [Commented] (SPARK-2336) Approximate k-NN Models for MLLib

2015-02-24 Thread Ashutosh Trivedi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334880#comment-14334880 ] Ashutosh Trivedi commented on SPARK-2336: - Hi, [~mengxr] We are going through the

[jira] [Updated] (SPARK-5968) Parquet warning in spark-shell

2015-02-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-5968: -- Description: This may happen in the case of schema evolving, namely appending new Parquet data with

[jira] [Commented] (SPARK-2335) k-Nearest Neighbor classification and regression for MLLib

2015-02-24 Thread Ashutosh Trivedi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334889#comment-14334889 ] Ashutosh Trivedi commented on SPARK-2335: - Hi [~mengxr] do you have any link for

[jira] [Commented] (SPARK-5281) Registering table on RDD is giving MissingRequirementError

2015-02-24 Thread Alberto (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334942#comment-14334942 ] Alberto commented on SPARK-5281: Having this problem as well trying to migrate to 1.2.1.

[jira] [Comment Edited] (SPARK-5281) Registering table on RDD is giving MissingRequirementError

2015-02-24 Thread Alberto (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334942#comment-14334942 ] Alberto edited comment on SPARK-5281 at 2/24/15 2:45 PM: - Having

[jira] [Commented] (SPARK-3850) Scala style: disallow trailing spaces

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335067#comment-14335067 ] Sean Owen commented on SPARK-3850: -- PS the error triggered by whitespace was a data file

[jira] [Commented] (SPARK-3850) Scala style: disallow trailing spaces

2015-02-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335074#comment-14335074 ] Nicholas Chammas commented on SPARK-3850: - Ah I see. I'm fine with closing this

[jira] [Updated] (SPARK-3850) Scala style: disallow trailing spaces

2015-02-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-3850: Description: Background discussions: * https://github.com/apache/spark/pull/2619 *

[jira] [Commented] (SPARK-3850) Scala style: disallow trailing spaces

2015-02-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335045#comment-14335045 ] Nicholas Chammas commented on SPARK-3850: - I guess the root is the [Style

[jira] [Updated] (SPARK-5971) Add Mesos support to spark-ec2

2015-02-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5971: Summary: Add Mesos support to spark-ec2 (was: Add support for launching Spark-on-Mesos

[jira] [Resolved] (SPARK-5968) Parquet warning in spark-shell

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5968. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4744

[jira] [Updated] (SPARK-4808) Spark fails to spill with small number of large objects

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4808: - Target Version/s: 1.4.0 (was: 1.2.2, 1.4.0, 1.3.1) Spark fails to spill with small number of large

[jira] [Resolved] (SPARK-5910) DataFrame.selectExpr(col as newName) does not work

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5910. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4736

[jira] [Updated] (SPARK-4808) Spark fails to spill with small number of large objects

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4808: - Labels: (was: backport-needed) Spark fails to spill with small number of large objects

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335234#comment-14335234 ] Reynold Xin commented on SPARK-5124: [~vanzin] I agree - it would be better to have a

[jira] [Resolved] (SPARK-5532) Repartitioning DataFrame causes saveAsParquetFile to fail with VectorUDT

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5532. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4738

[jira] [Created] (SPARK-5972) Cache residuals for GradientBoostedTrees during training

2015-02-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5972: Summary: Cache residuals for GradientBoostedTrees during training Key: SPARK-5972 URL: https://issues.apache.org/jira/browse/SPARK-5972 Project: Spark

[jira] [Closed] (SPARK-5695) Check GBT caching logic

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-5695. Resolution: Not a Problem Fix Version/s: 1.3.0 Target Version/s: 1.3.0

[jira] [Created] (SPARK-5973) zip two rdd with AutoBatchedSerializer will fail

2015-02-24 Thread Davies Liu (JIRA)
Davies Liu created SPARK-5973: - Summary: zip two rdd with AutoBatchedSerializer will fail Key: SPARK-5973 URL: https://issues.apache.org/jira/browse/SPARK-5973 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-5695) Check GBT caching logic

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335239#comment-14335239 ] Joseph K. Bradley commented on SPARK-5695: -- This is likely from re-computing the

[jira] [Assigned] (SPARK-5910) DataFrame.selectExpr(col as newName) does not work

2015-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reassigned SPARK-5910: --- Assignee: Michael Armbrust DataFrame.selectExpr(col as newName) does not work

[jira] [Updated] (SPARK-5436) Validate GradientBoostedTrees during training

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5436: - Assignee: Manoj Kumar Validate GradientBoostedTrees during training

[jira] [Updated] (SPARK-5967) JobProgressListener.stageIdToActiveJobIds never cleared

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5967: - Fix Version/s: (was: 1.4.0) JobProgressListener.stageIdToActiveJobIds never cleared

[jira] [Updated] (SPARK-5967) JobProgressListener.stageIdToActiveJobIds never cleared

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5967: - Target Version/s: 1.3.0, 1.2.2 (was: 1.3.0) JobProgressListener.stageIdToActiveJobIds never cleared

[jira] [Closed] (SPARK-5967) JobProgressListener.stageIdToActiveJobIds never cleared

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-5967. Resolution: Fixed Fix Version/s: 1.4.0 1.2.2 1.3.0

[jira] [Comment Edited] (SPARK-5124) Standardize internal RPC interface

2015-02-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335252#comment-14335252 ] Marcelo Vanzin edited comment on SPARK-5124 at 2/24/15 7:03 PM:

[jira] [Commented] (SPARK-3850) Scala style: disallow trailing spaces

2015-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335077#comment-14335077 ] Sean Owen commented on SPARK-3850: -- Well for golden data files, all bets are off I think.

[jira] [Commented] (SPARK-5968) Parquet warning in spark-shell

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335098#comment-14335098 ] Apache Spark commented on SPARK-5968: - User 'liancheng' has created a pull request for

[jira] [Commented] (SPARK-3850) Scala style: disallow trailing spaces

2015-02-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335082#comment-14335082 ] Cheng Lian commented on SPARK-3850: --- Actually that was a Scala source file, where we put

[jira] [Created] (SPARK-5971) Add support for launching Spark-on-Mesos clusters to spark-ec2

2015-02-24 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-5971: --- Summary: Add support for launching Spark-on-Mesos clusters to spark-ec2 Key: SPARK-5971 URL: https://issues.apache.org/jira/browse/SPARK-5971 Project: Spark

[jira] [Commented] (SPARK-3674) Add support for launching YARN clusters in spark-ec2

2015-02-24 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335199#comment-14335199 ] Nicholas Chammas commented on SPARK-3674: - There is an open PR for this here:

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335204#comment-14335204 ] Marcelo Vanzin commented on SPARK-5124: --- Ah, also, another comment before I forget.

[jira] [Closed] (SPARK-3882) JobProgressListener gets permanently out of sync with long running job

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-3882. Resolution: Duplicate JobProgressListener gets permanently out of sync with long running job

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335252#comment-14335252 ] Marcelo Vanzin commented on SPARK-5124: --- Hi @rxin, Yeah, the current {{receive()}}

[jira] [Commented] (SPARK-5973) zip two rdd with AutoBatchedSerializer will fail

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335260#comment-14335260 ] Apache Spark commented on SPARK-5973: - User 'davies' has created a pull request for

[jira] [Closed] (SPARK-5965) Spark UI does not show main class when running app in standalone cluster mode

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-5965. Resolution: Fixed Fix Version/s: 1.3.0 Spark UI does not show main class when running app in

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335264#comment-14335264 ] Reynold Xin commented on SPARK-5124: It seems to me the 2nd would be useful in some

[jira] [Updated] (SPARK-5981) pyspark ML models fail during predict/transform on vector within map

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5981: - Issue Type: Improvement (was: Bug) pyspark ML models fail during predict/transform on

[jira] [Updated] (SPARK-5981) pyspark ML models fail during predict/transform on vector within map

2015-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5981: - Target Version/s: 1.4.0 (was: 1.3.0) pyspark ML models fail during predict/transform on

[jira] [Commented] (SPARK-5845) Time to cleanup intermediate shuffle files not included in shuffle write time

2015-02-24 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335646#comment-14335646 ] Ilya Ganelin commented on SPARK-5845: - If I understand correctly, the file cleanup

[jira] [Comment Edited] (SPARK-5845) Time to cleanup intermediate shuffle files not included in shuffle write time

2015-02-24 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335646#comment-14335646 ] Ilya Ganelin edited comment on SPARK-5845 at 2/24/15 11:19 PM:

[jira] [Commented] (SPARK-5904) DataFrame methods with varargs do not work in Java

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335674#comment-14335674 ] Apache Spark commented on SPARK-5904: - User 'rxin' has created a pull request for this

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-24 Thread Mukesh Jha (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336027#comment-14336027 ] Mukesh Jha commented on SPARK-5837: --- My Hadoop version is Hadoop 2.5.0-cdh5.3.0 HTTP

[jira] [Commented] (SPARK-5999) Remove duplicate Literal matching block

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336120#comment-14336120 ] Apache Spark commented on SPARK-5999: - User 'viirya' has created a pull request for

[jira] [Commented] (SPARK-5970) Temporary directories are not removed (but their content is)

2015-02-24 Thread Milan Straka (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336070#comment-14336070 ] Milan Straka commented on SPARK-5970: - I found a _Contributing to Spark_ guide, will

[jira] [Created] (SPARK-5999) Remove duplicate Literal matching block

2015-02-24 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-5999: -- Summary: Remove duplicate Literal matching block Key: SPARK-5999 URL: https://issues.apache.org/jira/browse/SPARK-5999 Project: Spark Issue Type:

[jira] [Commented] (SPARK-5969) The pyspark.rdd.sortByKey always fills only two partitions when ascending=False.

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336128#comment-14336128 ] Apache Spark commented on SPARK-5969: - User 'foxik' has created a pull request for

[jira] [Commented] (SPARK-5775) GenericRow cannot be cast to SpecificMutableRow when nested data and partitioned table

2015-02-24 Thread Anselme Vignon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336152#comment-14336152 ] Anselme Vignon commented on SPARK-5775: --- [~marmbrus][~lian cheng] Hi, I'm quite new

[jira] [Commented] (SPARK-4705) Driver retries in cluster mode always fail if event logging is enabled

2015-02-24 Thread Twinkle Sachdeva (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336034#comment-14336034 ] Twinkle Sachdeva commented on SPARK-4705: - Hi [~vanzin] Working on it. Thanks,

[jira] [Commented] (SPARK-5970) Temporary directories are not removed (but their content is)

2015-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336113#comment-14336113 ] Apache Spark commented on SPARK-5970: - User 'foxik' has created a pull request for

[jira] [Closed] (SPARK-5816) Add huge backward compatibility warning in DriverWrapper

2015-02-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-5816. Resolution: Fixed Fix Version/s: 1.3.0 Target Version/s: 1.3.0 (was: 1.3.0, 1.4.0) Add

[jira] [Reopened] (SPARK-5286) Fail to drop an invalid table when using the data source API

2015-02-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai reopened SPARK-5286: - I am reopening this issue because we need to catch all Throwables instead of just Exceptions. Fail to drop an

[jira] [Commented] (SPARK-5978) Spark examples cannot compile with Hadoop 2

2015-02-24 Thread Michael Nazario (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14335827#comment-14335827 ] Michael Nazario commented on SPARK-5978: If I get the chance I'll look into it.

[jira] [Created] (SPARK-5997) Increase partition count without performing a shuffle

2015-02-24 Thread Andrew Ash (JIRA)
Andrew Ash created SPARK-5997: - Summary: Increase partition count without performing a shuffle Key: SPARK-5997 URL: https://issues.apache.org/jira/browse/SPARK-5997 Project: Spark Issue Type:
