[jira] [Created] (SPARK-2517) Remove as many compilation warning messages as possible

2014-07-16 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2517: -- Summary: Remove as many compilation warning messages as possible Key: SPARK-2517 URL: https://issues.apache.org/jira/browse/SPARK-2517 Project: Spark Issue

[jira] [Updated] (SPARK-2517) Remove as many compilation warning messages as possible

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2517: --- Assignee: Yin Huai Remove as many compilation warning messages as possible

[jira] [Created] (SPARK-2518) Fix foldability of Substring expression.

2014-07-16 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-2518: Summary: Fix foldability of Substring expression. Key: SPARK-2518 URL: https://issues.apache.org/jira/browse/SPARK-2518 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2518) Fix foldability of Substring expression.

2014-07-16 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063212#comment-14063212 ] Takuya Ueshin commented on SPARK-2518: -- PRed:

[jira] [Updated] (SPARK-2518) Fix foldability of Substring expression.

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2518: --- Assignee: Takuya Ueshin Fix foldability of Substring expression.

[jira] [Updated] (SPARK-2517) Remove as many compilation warning messages as possible

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2517: --- Description: We should probably treat warnings as failures in Jenkins. Some examples: {code}

[jira] [Commented] (SPARK-1981) Add AWS Kinesis streaming support

2014-07-16 Thread Chris Fregly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063247#comment-14063247 ] Chris Fregly commented on SPARK-1981: - PR: https://github.com/apache/spark/pull/1434

[jira] [Commented] (SPARK-1761) Add broadcast information on SparkUI storage tab

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063250#comment-14063250 ] Reynold Xin commented on SPARK-1761: This would be very useful actually. Add

[jira] [Updated] (SPARK-2274) spark SQL query hang up sometimes

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2274: --- Component/s: SQL spark SQL query hang up sometimes -

[jira] [Updated] (SPARK-2274) spark SQL query hang up sometimes

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2274: --- Assignee: Michael Armbrust spark SQL query hang up sometimes -

[jira] [Commented] (SPARK-2519) Eliminate pattern-matching on Tuple2 in performance-critical aggregation code

2014-07-16 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063251#comment-14063251 ] Sandy Ryza commented on SPARK-2519: --- https://github.com/apache/spark/pull/1435

[jira] [Updated] (SPARK-2433) In MLlib, implementation for Naive Bayes in Spark 0.9.1 is having an implementation bug.

2014-07-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2433: - Target Version/s: 0.9.2 In MLlib, implementation for Naive Bayes in Spark 0.9.1 is having an

[jira] [Commented] (SPARK-2521) Broadcast RDD object once per TaskSet (instead of sending it for every task)

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063256#comment-14063256 ] Reynold Xin commented on SPARK-2521: cc [~matei] Broadcast RDD object once per

[jira] [Updated] (SPARK-2521) Broadcast RDD object once per TaskSet (instead of sending it for every task)

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2521: --- Component/s: Spark Core Broadcast RDD object once per TaskSet (instead of sending it for every

[jira] [Updated] (SPARK-2521) Broadcast RDD object once per TaskSet (instead of sending it for every task)

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2521: --- Description: This can substantially reduce task size, as well as being able to support very large

[jira] [Commented] (SPARK-2433) In MLlib, implementation for Naive Bayes in Spark 0.9.1 is having an implementation bug.

2014-07-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063268#comment-14063268 ] Xiangrui Meng commented on SPARK-2433: -- [~rahul1993] Thanks for reporting this bug!

[jira] [Created] (SPARK-2522) Use TorrentBroadcastFactory as the default broadcast factory

2014-07-16 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2522: Summary: Use TorrentBroadcastFactory as the default broadcast factory Key: SPARK-2522 URL: https://issues.apache.org/jira/browse/SPARK-2522 Project: Spark

[jira] [Created] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-16 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2523: Summary: Potential Bugs if SerDe is not the identical among partitions and table Key: SPARK-2523 URL: https://issues.apache.org/jira/browse/SPARK-2523 Project: Spark

[jira] [Commented] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-16 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063288#comment-14063288 ] Cheng Hao commented on SPARK-2523: -- This is the follow up for

[jira] [Comment Edited] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-16 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063288#comment-14063288 ] Cheng Hao edited comment on SPARK-2523 at 7/16/14 8:42 AM: --- This

[jira] [Commented] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-16 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063289#comment-14063289 ] Cheng Hao commented on SPARK-2523: -- [~yhuai] Can you review the code for me? Potential

[jira] [Updated] (SPARK-2520) the executor is thrown java.io.StreamCorruptedException

2014-07-16 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-2520: --- Description: This issue occurs with a very small probability. I can not reproduce it. The executor

[jira] [Commented] (SPARK-2465) Use long as user / item ID for ALS

2014-07-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063344#comment-14063344 ] Sean Owen commented on SPARK-2465: -- Yeah that's a good separate point. My hunch is that

[jira] [Commented] (SPARK-2356) Exception: Could not locate executable null\bin\winutils.exe in the Hadoop

2014-07-16 Thread Kostiantyn Kudriavtsev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063350#comment-14063350 ] Kostiantyn Kudriavtsev commented on SPARK-2356: --- and the use case when I got

[jira] [Updated] (SPARK-2356) Exception: Could not locate executable null\bin\winutils.exe in the Hadoop

2014-07-16 Thread Kostiantyn Kudriavtsev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostiantyn Kudriavtsev updated SPARK-2356: -- Priority: Critical (was: Major) Exception: Could not locate executable

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063380#comment-14063380 ] Sean Owen commented on SPARK-2420: -- I think Jetty is the only actual issue here. Guava

[jira] [Commented] (SPARK-2454) Separate driver spark home from executor spark home

2014-07-16 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063402#comment-14063402 ] Nan Zhu commented on SPARK-2454: this will make sparkHome as an application-specific

[jira] [Commented] (SPARK-2341) loadLibSVMFile doesn't handle regression datasets

2014-07-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063419#comment-14063419 ] Sean Owen commented on SPARK-2341: -- OK is it worth a pull request for changing the

[jira] [Commented] (SPARK-2190) Specialized ColumnType for Timestamp

2014-07-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063431#comment-14063431 ] Cheng Lian commented on SPARK-2190: --- PR https://github.com/apache/spark/pull/1440

[jira] [Commented] (SPARK-953) Latent Dirichlet Association (LDA model)

2014-07-16 Thread Masaki Rikitoku (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063447#comment-14063447 ] Masaki Rikitoku commented on SPARK-953: --- parallel gibbs sampling for lda (plda) may

[jira] [Commented] (SPARK-2313) PySpark should accept port via a command line argument rather than STDIN

2014-07-16 Thread Matthew Farrellee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063454#comment-14063454 ] Matthew Farrellee commented on SPARK-2313: -- as this stands, having another

[jira] [Commented] (SPARK-2443) Reading from Partitioned Tables is Slow

2014-07-16 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063455#comment-14063455 ] Teng Qiu commented on SPARK-2443: - Hi, how can you access parquet table using HiveContext

[jira] [Commented] (SPARK-2111) pyspark errors when SPARK_PRINT_LAUNCH_COMMAND=1

2014-07-16 Thread Matthew Farrellee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063457#comment-14063457 ] Matthew Farrellee commented on SPARK-2111: -- this was resolved by

[jira] [Commented] (SPARK-2111) pyspark errors when SPARK_PRINT_LAUNCH_COMMAND=1

2014-07-16 Thread Matthew Farrellee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063462#comment-14063462 ] Matthew Farrellee commented on SPARK-2111: -- [~pwendell] please close this issue

[jira] [Resolved] (SPARK-2111) pyspark errors when SPARK_PRINT_LAUNCH_COMMAND=1

2014-07-16 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-2111. -- Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 pyspark errors when

[jira] [Updated] (SPARK-2111) pyspark errors when SPARK_PRINT_LAUNCH_COMMAND=1

2014-07-16 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-2111: - Assignee: Prashant Sharma pyspark errors when SPARK_PRINT_LAUNCH_COMMAND=1

[jira] [Commented] (SPARK-2443) Reading from Partitioned Tables is Slow

2014-07-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063530#comment-14063530 ] Michael Armbrust commented on SPARK-2443: - [~chutium], I would recommend using the

[jira] [Updated] (SPARK-2308) Add KMeans MiniBatch clustering algorithm to MLlib

2014-07-16 Thread RJ Nowling (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] RJ Nowling updated SPARK-2308: -- Attachment: uneven_centers.pdf many_small_centers.pdf Add KMeans MiniBatch clustering

[jira] [Commented] (SPARK-2308) Add KMeans MiniBatch clustering algorithm to MLlib

2014-07-16 Thread RJ Nowling (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063601#comment-14063601 ] RJ Nowling commented on SPARK-2308: --- I tested kmeans vs minibatch kmeans under 2

[jira] [Created] (SPARK-2525) Remove as many compilation warning messages as possible in Spark SQL

2014-07-16 Thread Yin Huai (JIRA)
Yin Huai created SPARK-2525: --- Summary: Remove as many compilation warning messages as possible in Spark SQL Key: SPARK-2525 URL: https://issues.apache.org/jira/browse/SPARK-2525 Project: Spark

[jira] [Commented] (SPARK-2525) Remove as many compilation warning messages as possible in Spark SQL

2014-07-16 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063701#comment-14063701 ] Yin Huai commented on SPARK-2525: - Those deprecation warnings in Spark SQL are caused by

[jira] [Reopened] (SPARK-2314) RDD actions are only overridden in Scala, not java or python

2014-07-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reopened SPARK-2314: - Yeah I think we might need to do this in python too as at least take is implemented

[jira] [Created] (SPARK-2526) Simplify make-distribution.sh

2014-07-16 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-2526: -- Summary: Simplify make-distribution.sh Key: SPARK-2526 URL: https://issues.apache.org/jira/browse/SPARK-2526 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-2526) Simplify make-distribution.sh to just pass through Maven options

2014-07-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2526: --- Summary: Simplify make-distribution.sh to just pass through Maven options (was: Simplify

[jira] [Commented] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-16 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063753#comment-14063753 ] Yin Huai commented on SPARK-2523: - Yeah, no problem. Can you add a case which can trigger

[jira] [Commented] (SPARK-2495) Ability to re-create ML models

2014-07-16 Thread Alexander Albul (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063762#comment-14063762 ] Alexander Albul commented on SPARK-2495: Yes, i can work on it, but first i need

[jira] [Comment Edited] (SPARK-2495) Ability to re-create ML models

2014-07-16 Thread Alexander Albul (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063762#comment-14063762 ] Alexander Albul edited comment on SPARK-2495 at 7/16/14 5:34 PM:

[jira] [Created] (SPARK-2527) incorrect persistence level shown in Spark UI after repersisting

2014-07-16 Thread Diana Carroll (JIRA)
Diana Carroll created SPARK-2527: Summary: incorrect persistence level shown in Spark UI after repersisting Key: SPARK-2527 URL: https://issues.apache.org/jira/browse/SPARK-2527 Project: Spark

[jira] [Created] (SPARK-2528) spark-ec2 security group permissions are too open

2014-07-16 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-2528: --- Summary: spark-ec2 security group permissions are too open Key: SPARK-2528 URL: https://issues.apache.org/jira/browse/SPARK-2528 Project: Spark Issue

[jira] [Updated] (SPARK-2527) incorrect persistence level shown in Spark UI after repersisting

2014-07-16 Thread Diana Carroll (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Diana Carroll updated SPARK-2527: - Description: If I persist an RDD at one level, unpersist it, then repersist it at another

[jira] [Comment Edited] (SPARK-2495) Ability to re-create ML models

2014-07-16 Thread Alexander Albul (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063762#comment-14063762 ] Alexander Albul edited comment on SPARK-2495 at 7/16/14 5:33 PM:

[jira] [Commented] (SPARK-2519) Eliminate pattern-matching on Tuple2 in performance-critical aggregation code

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063833#comment-14063833 ] Reynold Xin commented on SPARK-2519: Does this pull request fix all of them?

[jira] [Updated] (SPARK-2269) Clean up and add unit tests for resourceOffers in MesosSchedulerBackend

2014-07-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2269: --- Assignee: Tim Chen Clean up and add unit tests for resourceOffers in MesosSchedulerBackend

[jira] [Resolved] (SPARK-2522) Use TorrentBroadcastFactory as the default broadcast factory

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-2522. Resolution: Fixed Fix Version/s: 1.1.0 Use TorrentBroadcastFactory as the default

[jira] [Resolved] (SPARK-2317) Improve task logging

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-2317. Resolution: Fixed Fix Version/s: 1.1.0 Improve task logging

[jira] [Commented] (SPARK-2463) Creating multiple StreamingContexts from shell generates duplicate Streaming tabs in UI

2014-07-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063944#comment-14063944 ] Andrew Or commented on SPARK-2463: -- It probably won't be too much work, because the

[jira] [Created] (SPARK-2530) Relax incorrect assumption of one ExternalAppendOnlyMap per thread

2014-07-16 Thread Andrew Or (JIRA)
Andrew Or created SPARK-2530: Summary: Relax incorrect assumption of one ExternalAppendOnlyMap per thread Key: SPARK-2530 URL: https://issues.apache.org/jira/browse/SPARK-2530 Project: Spark

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-16 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063998#comment-14063998 ] Xuefu Zhang commented on SPARK-2420: Thanks for your comments, [~srowen]. I mostly

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064000#comment-14064000 ] Reynold Xin commented on SPARK-2420: Thanks for looking into this. Change Spark

[jira] [Issue Comment Deleted] (SPARK-1215) Clustering: Index out of bounds error

2014-07-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-1215: - Comment: was deleted (was: Just to let you know, I'll give the go-ahead for this

[jira] [Resolved] (SPARK-2504) Fix nullability of Substring expression.

2014-07-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2504. - Resolution: Fixed Fix Version/s: 1.0.2 1.1.0 Assignee:

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064019#comment-14064019 ] Sean Owen commented on SPARK-2420: -- It'd be best to say what problem you are seeing with

[jira] [Commented] (SPARK-2454) Separate driver spark home from executor spark home

2014-07-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064036#comment-14064036 ] Andrew Or commented on SPARK-2454: -- There may be multiple installations of Spark on the

[jira] [Updated] (SPARK-2454) Separate driver spark home from executor spark home

2014-07-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2454: - Description: The driver may not always share the same directory structure as the executors. It makes

[jira] [Closed] (SPARK-2465) Use long as user / item ID for ALS

2014-07-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen closed SPARK-2465. Resolution: Won't Fix Will possibly revisit this in the long term, or look at creating a parallel LongALS

[jira] [Created] (SPARK-2531) Make BroadcastNestedLoopJoin take into account a BuildSide

2014-07-16 Thread Zongheng Yang (JIRA)
Zongheng Yang created SPARK-2531: Summary: Make BroadcastNestedLoopJoin take into account a BuildSide Key: SPARK-2531 URL: https://issues.apache.org/jira/browse/SPARK-2531 Project: Spark

[jira] [Commented] (SPARK-2454) Separate driver spark home from executor spark home

2014-07-16 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064085#comment-14064085 ] Nan Zhu commented on SPARK-2454: I see, it makes sense to me... Separate driver spark

[jira] [Updated] (SPARK-2411) Standalone Master - direct users to turn on event logs

2014-07-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2411: - Attachment: (was: Master event logs.png) Standalone Master - direct users to turn on event logs

[jira] [Commented] (SPARK-2519) Eliminate pattern-matching on Tuple2 in performance-critical aggregation code

2014-07-16 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064102#comment-14064102 ] Sandy Ryza commented on SPARK-2519: --- I looked in ShuffledRDD, ExternalAppendOnlyMap,

[jira] [Commented] (SPARK-2531) Make BroadcastNestedLoopJoin take into account a BuildSide

2014-07-16 Thread Zongheng Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064106#comment-14064106 ] Zongheng Yang commented on SPARK-2531: -- Github PR:

[jira] [Updated] (SPARK-2154) Worker goes down.

2014-07-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2154: --- Assignee: Aaron Davidson Worker goes down. - Key:

[jira] [Resolved] (SPARK-2154) Worker goes down.

2014-07-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2154. Resolution: Fixed Fix Version/s: 1.1.0 1.0.2 Issue resolved by

[jira] [Closed] (SPARK-2154) Worker goes down.

2014-07-16 Thread siva venkat gogineni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] siva venkat gogineni closed SPARK-2154. --- Fixed in the future releases Worker goes down. -

[jira] [Created] (SPARK-2532) Fix issues with consolidated shuffle

2014-07-16 Thread Mridul Muralidharan (JIRA)
Mridul Muralidharan created SPARK-2532: -- Summary: Fix issues with consolidated shuffle Key: SPARK-2532 URL: https://issues.apache.org/jira/browse/SPARK-2532 Project: Spark Issue Type:

[jira] [Updated] (SPARK-2434) Generate runtime warnings for naive implementations

2014-07-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2434: - Assignee: Burak Yavuz Generate runtime warnings for naive implementations

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-16 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064168#comment-14064168 ] Xuefu Zhang commented on SPARK-2420: As to guava conflict, HIVE-7387 has more details

[jira] [Commented] (SPARK-2154) Worker goes down.

2014-07-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064172#comment-14064172 ] Patrick Wendell commented on SPARK-2154: [~talk2siva8] Yes, that's correct.

[jira] [Updated] (SPARK-2495) Ability to re-create ML models

2014-07-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2495: - Assignee: Alexander Albul Ability to re-create ML models --

[jira] [Updated] (SPARK-1997) Update breeze to version 0.8.1

2014-07-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1997: - Target Version/s: 1.1.0 Update breeze to version 0.8.1 --

[jira] [Updated] (SPARK-2533) Show summary of locality level of completed tasks in the each stage page of web UI

2014-07-16 Thread Masayoshi TSUZUKI (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masayoshi TSUZUKI updated SPARK-2533: - Summary: Show summary of locality level of completed tasks in the each stage page of web

[jira] [Created] (SPARK-2533) Show summary of locality level of completed tasks in the each stage page of web UI

2014-07-16 Thread Masayoshi TSUZUKI (JIRA)
Masayoshi TSUZUKI created SPARK-2533: Summary: Show summary of locality level of completed tasks in the each stage page of web UI Key: SPARK-2533 URL: https://issues.apache.org/jira/browse/SPARK-2533

[jira] [Created] (SPARK-2534) Avoid pulling in the entire RDD in groupByKey

2014-07-16 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2534: -- Summary: Avoid pulling in the entire RDD in groupByKey Key: SPARK-2534 URL: https://issues.apache.org/jira/browse/SPARK-2534 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2534) Avoid pulling in the entire RDD in groupByKey

2014-07-16 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064333#comment-14064333 ] Sandy Ryza commented on SPARK-2534: --- Yowza Avoid pulling in the entire RDD in

[jira] [Commented] (SPARK-2501) Handle stage re-submissions properly in the UI

2014-07-16 Thread Masayoshi TSUZUKI (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064350#comment-14064350 ] Masayoshi TSUZUKI commented on SPARK-2501: -- [SPARK-2299] seems to include the

[jira] [Created] (SPARK-2535) Add StringComparison case to NullPropagation.

2014-07-16 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-2535: Summary: Add StringComparison case to NullPropagation. Key: SPARK-2535 URL: https://issues.apache.org/jira/browse/SPARK-2535 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-2501) Handle stage re-submissions properly in the UI

2014-07-16 Thread Masayoshi TSUZUKI (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064350#comment-14064350 ] Masayoshi TSUZUKI edited comment on SPARK-2501 at 7/16/14 11:58 PM:

[jira] [Created] (SPARK-2536) Update the MLlib page of Spark website

2014-07-16 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2536: Summary: Update the MLlib page of Spark website Key: SPARK-2536 URL: https://issues.apache.org/jira/browse/SPARK-2536 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-2536) Update the MLlib page of Spark website

2014-07-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2536: - Description: It still shows v0.9. (was: It stills shows v0.9.) Update the MLlib page of Spark

[jira] [Created] (SPARK-2538) External aggregation in Python

2014-07-16 Thread Davies Liu (JIRA)
Davies Liu created SPARK-2538: - Summary: External aggregation in Python Key: SPARK-2538 URL: https://issues.apache.org/jira/browse/SPARK-2538 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-2495) Ability to re-create ML models

2014-07-16 Thread Alexander Albul (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064374#comment-14064374 ] Alexander Albul commented on SPARK-2495: Hi Meng, Here is the list of models that

[jira] [Updated] (SPARK-2537) Workaround Timezone specific Hive tests

2014-07-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-2537: -- Description: Several Hive tests in {{HiveCompatibilitySuite}} are timezone sensitive: -

[jira] [Commented] (SPARK-2535) Add StringComparison case to NullPropagation.

2014-07-16 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064381#comment-14064381 ] Takuya Ueshin commented on SPARK-2535: -- PR: https://github.com/apache/spark/pull/1451

[jira] [Created] (SPARK-2539) ConnectionManager should handle Uncaught Exception

2014-07-16 Thread Kousuke Saruta (JIRA)
Kousuke Saruta created SPARK-2539: - Summary: ConnectionManager should handle Uncaught Exception Key: SPARK-2539 URL: https://issues.apache.org/jira/browse/SPARK-2539 Project: Spark Issue

[jira] [Created] (SPARK-2540) Add More Types Support for unwarpData of HiveUDF

2014-07-16 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2540: Summary: Add More Types Support for unwarpData of HiveUDF Key: SPARK-2540 URL: https://issues.apache.org/jira/browse/SPARK-2540 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2433) In MLlib, implementation for Naive Bayes in Spark 0.9.1 is having an implementation bug.

2014-07-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064421#comment-14064421 ] Xiangrui Meng commented on SPARK-2433: -- PR for branch-0.9:

[jira] [Updated] (SPARK-2433) In MLlib, implementation for Naive Bayes in Spark 0.9.1 is having an implementation bug.

2014-07-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2433: - Fix Version/s: 1.0.0 In MLlib, implementation for Naive Bayes in Spark 0.9.1 is having an

[jira] [Updated] (SPARK-2438) Streaming + MLLib

2014-07-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2438: - Assignee: Jeremy Freeman Streaming + MLLib - Key: SPARK-2438

[jira] [Created] (SPARK-2541) Standalone mode can't access secure HDFS anymore

2014-07-16 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-2541: Summary: Standalone mode can't access secure HDFS anymore Key: SPARK-2541 URL: https://issues.apache.org/jira/browse/SPARK-2541 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2406) Partitioned Parquet Support

2014-07-16 Thread Pat McDonough (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064450#comment-14064450 ] Pat McDonough commented on SPARK-2406: -- Okay, so it sounds like these are completely

[jira] [Commented] (SPARK-2481) The environment variables SPARK_HISTORY_OPTS is covered in start-history-server.sh

2014-07-16 Thread Masayoshi TSUZUKI (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064471#comment-14064471 ] Masayoshi TSUZUKI commented on SPARK-2481: -- Ah, I understand what you meant and I
