[jira] [Commented] (SPARK-4395) Running a Spark SQL SELECT command from PySpark causes a hang for ~ 1 hour

2014-11-25 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224183#comment-14224183 ] Cheng Lian commented on SPARK-4395: --- Couldn't reproduce the long pause locally, and

[jira] [Commented] (SPARK-4592) Worker registration failed: Duplicate worker ID error during Master failover

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224184#comment-14224184 ] Apache Spark commented on SPARK-4592: - User 'andrewor14' has created a pull request

[jira] [Commented] (SPARK-4584) 2x Performance regression for Spark-on-YARN

2014-11-25 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224185#comment-14224185 ] Sandy Ryza commented on SPARK-4584: --- I took a look at the jobs Nishkam ran before and

[jira] [Updated] (SPARK-4584) 2x Performance regression for Spark-on-YARN

2014-11-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4584: -- Target Version/s: 1.2.0 2x Performance regression for Spark-on-YARN

[jira] [Commented] (SPARK-3588) Gaussian Mixture Model clustering

2014-11-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224188#comment-14224188 ] Xiangrui Meng commented on SPARK-3588: -- Since [~tgaloppo] already submitted a PR, we

[jira] [Commented] (SPARK-4592) Worker registration failed: Duplicate worker ID error during Master failover

2014-11-25 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224194#comment-14224194 ] Andrew Or commented on SPARK-4592: -- I submitted a fix at

[jira] [Commented] (SPARK-4594) Improvement the broadcast for HiveConf

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224207#comment-14224207 ] Apache Spark commented on SPARK-4594: - User 'Leolh' has created a pull request for

[jira] [Commented] (SPARK-4597) Use proper exception and reset variable

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224215#comment-14224215 ] Apache Spark commented on SPARK-4597: - User 'viirya' has created a pull request for

[jira] [Created] (SPARK-4598) java.lang.OutOfMemoryError occurs when opening stage page of an application has 100000 tasks,

2014-11-25 Thread meiyoula (JIRA)
meiyoula created SPARK-4598: --- Summary: java.lang.OutOfMemoryError occurs when opening stage page of an application has 100000 tasks, Key: SPARK-4598 URL: https://issues.apache.org/jira/browse/SPARK-4598

[jira] [Commented] (SPARK-4585) Spark dynamic scaling executors use upper limit value as default.

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224244#comment-14224244 ] Sean Owen commented on SPARK-4585: -- Given the discussion in SPARK-3174, it seems like

[jira] [Created] (SPARK-4599) add hive profile to root pom

2014-11-25 Thread Adrian Wang (JIRA)
Adrian Wang created SPARK-4599: -- Summary: add hive profile to root pom Key: SPARK-4599 URL: https://issues.apache.org/jira/browse/SPARK-4599 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-4585) Spark dynamic scaling executors use upper limit value as default.

2014-11-25 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated SPARK-4585: -- Issue Type: Improvement (was: Bug) Spark dynamic scaling executors use upper limit value as default.

[jira] [Commented] (SPARK-4599) add hive profile to root pom

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224256#comment-14224256 ] Apache Spark commented on SPARK-4599: - User 'adrian-wang' has created a pull request

[jira] [Commented] (SPARK-4585) Spark dynamic scaling executors use upper limit value as default.

2014-11-25 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224257#comment-14224257 ] Sandy Ryza commented on SPARK-4585: --- I was discussing this with [~brocknoland]. The

[jira] [Resolved] (SPARK-962) debian package contains old version of executable scripts

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-962. - Resolution: Fixed This appears fixed. The referenced PR was apparently merged into 0.9, and the outdated

[jira] [Assigned] (SPARK-4509) Revert EC2 tag-based cluster membership patch in branch-1.2

2014-11-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-4509: Assignee: Xiangrui Meng Revert EC2 tag-based cluster membership patch in branch-1.2

[jira] [Commented] (SPARK-4594) Improvement the broadcast for HiveConf

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224284#comment-14224284 ] Apache Spark commented on SPARK-4594: - User 'Leolh' has created a pull request for

[jira] [Commented] (SPARK-3588) Gaussian Mixture Model clustering

2014-11-25 Thread Travis Galoppo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224287#comment-14224287 ] Travis Galoppo commented on SPARK-3588: --- Sorry about the duplicate effort; I did a

[jira] [Assigned] (SPARK-4592) Worker registration failed: Duplicate worker ID error during Master failover

2014-11-25 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or reassigned SPARK-4592: Assignee: Andrew Or (was: Josh Rosen) Worker registration failed: Duplicate worker ID error

[jira] [Resolved] (SPARK-910) hadoopFile creates RecordReader key and value at the wrong scope

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-910. - Resolution: Not a Problem Given the PR discussion, it looks like this was resolved as NotAProblem. Either

[jira] [Resolved] (SPARK-981) Seemingly spurious Duplicate worker ID error messages

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-981. - Resolution: Duplicate It sounds like it may be the same issue reported, and being worked on, in

[jira] [Commented] (SPARK-1148) Suggestions for exception handling (avoid potential bugs)

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224305#comment-14224305 ] Sean Owen commented on SPARK-1148: -- [~d.yuan] Several problems like this have been fixed

[jira] [Updated] (SPARK-4596) Refactorize Normalizer to make code cleaner

2014-11-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4596: - Assignee: DB Tsai Refactorize Normalizer to make code cleaner

[jira] [Resolved] (SPARK-4596) Refactorize Normalizer to make code cleaner

2014-11-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-4596. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3446

[jira] [Updated] (SPARK-4530) GradientDescent get a wrong gradient value according to the gradient formula, which is caused by the miniBatchSize parameter.

2014-11-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4530: - Fix Version/s: 1.2.0 GradientDescent get a wrong gradient value according to the gradient

[jira] [Commented] (SPARK-4530) GradientDescent get a wrong gradient value according to the gradient formula, which is caused by the miniBatchSize parameter.

2014-11-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224312#comment-14224312 ] Xiangrui Meng commented on SPARK-4530: -- PR: https://github.com/apache/spark/pull/3399

[jira] [Updated] (SPARK-1182) Sort the configuration parameters in configuration.md

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-1182: - Fix Version/s: (was: 1.0.0) The current PR seems to be https://github.com/apache/spark/pull/2312 but

[jira] [Commented] (SPARK-1016) When running examples jar (compiled with maven) logs don't initialize properly

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224336#comment-14224336 ] Sean Owen commented on SPARK-1016: -- Is this resolved now? examples should be logging via

[jira] [Commented] (SPARK-4530) GradientDescent get a wrong gradient value according to the gradient formula, which is caused by the miniBatchSize parameter.

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224378#comment-14224378 ] Apache Spark commented on SPARK-4530: - User 'witgo' has created a pull request for

[jira] [Commented] (SPARK-1301) Add UI elements to collapse Aggregated Metrics by Executor pane on stage page

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224387#comment-14224387 ] Sean Owen commented on SPARK-1301: -- Is this still relevant now that this info is on a

[jira] [Resolved] (SPARK-2223) Building and running tests with maven is extremely slow

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2223. -- Resolution: Duplicate Can I suggest this be folded into SPARK-3431? there's no action here, and the

[jira] [Created] (SPARK-4600) org.apache.spark.graphx.VertexRDD.diff does not work

2014-11-25 Thread Teppei Tosa (JIRA)
Teppei Tosa created SPARK-4600: -- Summary: org.apache.spark.graphx.VertexRDD.diff does not work Key: SPARK-4600 URL: https://issues.apache.org/jira/browse/SPARK-4600 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2192) Examples Data Not in Binary Distribution

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224436#comment-14224436 ] Sean Owen commented on SPARK-2192: -- Data files are now consolidated under data/, and they

[jira] [Resolved] (SPARK-2556) Multiple SparkContexts can coexist in one process

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2556. -- Resolution: Duplicate Multiple SparkContexts can coexist in one process

[jira] [Resolved] (SPARK-1623) SPARK-1623. Broadcast cleaner should use getCanonicalPath when deleting files by name

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-1623. -- Resolution: Fixed Yes, the original PR was clearly superseded by

[jira] [Resolved] (SPARK-2404) spark-submit and spark-class may overwrite the already defined SPARK_HOME

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2404. -- Resolution: Won't Fix Fix Version/s: (was: 1.0.1) According to the PR discussion, this

[jira] [Resolved] (SPARK-4535) Fix the error in comments

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-4535. -- Resolution: Fixed Fix Version/s: 1.2.0 Fix the error in comments

[jira] [Resolved] (SPARK-2290) Do not send SPARK_HOME from driver to executors

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2290. -- Resolution: Duplicate According to PR discussion, this duplicates SPARK-2454, which was resolved in

[jira] [Resolved] (SPARK-3009) ApplicationInfo doesn't get initialised after deserialisation during recovery

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-3009. -- Resolution: Fixed Looks like https://github.com/apache/spark/pull/1947 was merged and resolved this.

[jira] [Commented] (SPARK-4036) Add Conditional Random Fields (CRF) algorithm to Spark MLlib

2014-11-25 Thread Kai Sasaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224467#comment-14224467 ] Kai Sasaki commented on SPARK-4036: --- Hi, I want to work on this ticket. I'll write a

[jira] [Commented] (SPARK-3171) Don't print meaningless information of SelectionKey

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224469#comment-14224469 ] Sean Owen commented on SPARK-3171: -- Looks like this was ready to commit but the PR was

[jira] [Resolved] (SPARK-2419) Misc updates to streaming programming guide

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2419. -- Resolution: Fixed Looks like both of the PRs were merged, so this is resolved. Misc updates to

[jira] [Resolved] (SPARK-2027) spark-ec2 puts Hadoop's log4j ahead of Spark's in classpath

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2027. -- Resolution: Duplicate spark-ec2 puts Hadoop's log4j ahead of Spark's in classpath

[jira] [Resolved] (SPARK-2007) Spark on YARN picks up hadoop log4j.properties even if SPARK_LOG4J_CONF is set

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2007. -- Resolution: Fixed Looks like this is resolved per the method Marcelo alludes to in the PR. That seems

[jira] [Commented] (SPARK-2132) Color GC time red when over a percentage of task time

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224490#comment-14224490 ] Sean Owen commented on SPARK-2132: -- This might be best as part of the larger

[jira] [Created] (SPARK-4601) Call site of jobs generated by streaming incorrect in Spark UI

2014-11-25 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-4601: Summary: Call site of jobs generated by streaming incorrect in Spark UI Key: SPARK-4601 URL: https://issues.apache.org/jira/browse/SPARK-4601 Project: Spark

[jira] [Updated] (SPARK-4601) Call site of jobs generated by streaming incorrect in Spark UI

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-4601: - Description: When running the NetworkWordCount, the description of the word count jobs are set as

[jira] [Commented] (SPARK-4601) Call site of jobs generated by streaming incorrect in Spark UI

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224523#comment-14224523 ] Apache Spark commented on SPARK-4601: - User 'tdas' has created a pull request for this

[jira] [Commented] (SPARK-4598) java.lang.OutOfMemoryError occurs when opening stage page of an application has 100000 tasks,

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224529#comment-14224529 ] Apache Spark commented on SPARK-4598: - User 'XuTingjun' has created a pull request for

[jira] [Commented] (SPARK-2475) Check whether #cores > #receivers in local mode

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224537#comment-14224537 ] Tathagata Das commented on SPARK-2475: -- SPARK-4381 partially solved for the most

[jira] [Updated] (SPARK-4381) User should get warned when set spark.master with local in Spark Streaming

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-4381: - Priority: Major (was: Minor) User should get warned when set spark.master with local in Spark

[jira] [Resolved] (SPARK-3024) CLI interface to Driver

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-3024. -- Resolution: Duplicate If the gist of this is to expose the UI data via JSON, can I suggest this be

[jira] [Resolved] (SPARK-4381) User should get warned when set spark.master with local in Spark Streaming

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-4381. -- Resolution: Fixed Fix Version/s: 1.3.0 1.2.0 User should get warned

[jira] [Updated] (SPARK-4381) User should get warned when set spark.master with local in Spark Streaming

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-4381: - Affects Version/s: (was: 1.2.0) 1.1.1 1.0.2

[jira] [Commented] (SPARK-4133) PARSING_ERROR(2) when upgrading issues from 1.0.2 to 1.1.0

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224539#comment-14224539 ] Tathagata Das commented on SPARK-4133: -- [~joshrosen] Any more comments regarding this

[jira] [Commented] (SPARK-2985) Buffered data in BlockGenerator gets lost when receiver crashes

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224555#comment-14224555 ] Tathagata Das commented on SPARK-2985: -- This is by design. If you are using the

[jira] [Commented] (SPARK-4314) Exception when textFileStream attempts to read deleted _COPYING_ file

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224559#comment-14224559 ] Tathagata Das commented on SPARK-4314: -- Let me try to understand the scenario when

[jira] [Commented] (SPARK-4462) flume-sink build broken in SBT

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224564#comment-14224564 ] Tathagata Das commented on SPARK-4462: -- [~marmbrus] Are you still seeing this issue?

[jira] [Commented] (SPARK-4276) Spark streaming requires at least two working thread

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224565#comment-14224565 ] Tathagata Das commented on SPARK-4276: -- This issue is resolved by this JIRA, the user

[jira] [Commented] (SPARK-4276) Spark streaming requires at least two working thread

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224566#comment-14224566 ] Tathagata Das commented on SPARK-4276: -- Since this is not an issue, I am closing this

[jira] [Closed] (SPARK-4276) Spark streaming requires at least two working thread

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das closed SPARK-4276. Resolution: Duplicate Spark streaming requires at least two working thread

[jira] [Commented] (SPARK-2383) With auto.offset.reset, KafkaReceiver potentially deletes Consumer nodes from Zookeeper

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224571#comment-14224571 ] Tathagata Das commented on SPARK-2383: -- This issue has been resolved in Spark 1.2.0.

[jira] [Closed] (SPARK-2383) With auto.offset.reset, KafkaReceiver potentially deletes Consumer nodes from Zookeeper

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das closed SPARK-2383. Resolution: Duplicate With auto.offset.reset, KafkaReceiver potentially deletes Consumer nodes

[jira] [Commented] (SPARK-2401) AdaBoost.MH, a multi-class multi-label classifier

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224575#comment-14224575 ] Sean Owen commented on SPARK-2401: -- This looks like a duplicate of SPARK-1546. At least

[jira] [Commented] (SPARK-4537) Add 'processing delay' and 'totalDelay' to the metrics reported by the Spark Streaming subsystem

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224580#comment-14224580 ] Tathagata Das commented on SPARK-4537: -- Would be cool to solve this issue! Add

[jira] [Commented] (SPARK-4196) Streaming + checkpointing + saveAsNewAPIHadoopFiles = NotSerializableException for Hadoop Configuration

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224582#comment-14224582 ] Tathagata Das commented on SPARK-4196: -- Let me try to take a pass on this.

[jira] [Resolved] (SPARK-4344) spark.yarn.user.classpath.first is undocumented

2014-11-25 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-4344. -- Resolution: Fixed Fix Version/s: 1.3.0 1.2.0

[jira] [Created] (SPARK-4602) saveAsNewAPIHadoopFiles by default does not use SparkContext's hadoop configuration

2014-11-25 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-4602: Summary: saveAsNewAPIHadoopFiles by default does not use SparkContext's hadoop configuration Key: SPARK-4602 URL: https://issues.apache.org/jira/browse/SPARK-4602

[jira] [Created] (SPARK-4603) EOF when broadcasting a dict with an empty string value.

2014-11-25 Thread Alex Angelini (JIRA)
Alex Angelini created SPARK-4603: Summary: EOF when broadcasting a dict with an empty string value. Key: SPARK-4603 URL: https://issues.apache.org/jira/browse/SPARK-4603 Project: Spark Issue

[jira] [Commented] (SPARK-4196) Streaming + checkpointing + saveAsNewAPIHadoopFiles = NotSerializableException for Hadoop Configuration

2014-11-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224649#comment-14224649 ] Sean Owen commented on SPARK-4196: -- That didn't work for me, IIRC. The problem is that

[jira] [Commented] (SPARK-4602) saveAsNewAPIHadoopFiles by default does not use SparkContext's hadoop configuration

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224652#comment-14224652 ] Apache Spark commented on SPARK-4602: - User 'tdas' has created a pull request for this

[jira] [Commented] (SPARK-4002) KafkaStreamSuite Kafka input stream case fails on OSX

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224656#comment-14224656 ] Tathagata Das commented on SPARK-4002: -- [~jerryshao] have you figured out this

[jira] [Commented] (SPARK-4196) Streaming + checkpointing + saveAsNewAPIHadoopFiles = NotSerializableException for Hadoop Configuration

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224651#comment-14224651 ] Apache Spark commented on SPARK-4196: - User 'tdas' has created a pull request for this

[jira] [Commented] (SPARK-2072) Streaming not processing a file with particular number of entries

2014-11-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224660#comment-14224660 ] Tathagata Das commented on SPARK-2072: -- Does this issue still exist? If not, I am

[jira] [Updated] (SPARK-4598) Paginate stage page to avoid OOM with 100,000 tasks

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4598: --- Summary: Paginate stage page to avoid OOM with 100,000 tasks (was:

[jira] [Updated] (SPARK-4598) java.lang.OutOfMemoryError occurs when opening stage page of an application has 100000 tasks,

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4598: --- Priority: Critical (was: Major) java.lang.OutOfMemoryError occurs when opening stage page

[jira] [Commented] (SPARK-907) Add JSON endpoints to SparkUI

2014-11-25 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224811#comment-14224811 ] Andrew Ash commented on SPARK-907: -- [~dmccauley] this report looks like a duplicate of

[jira] [Commented] (SPARK-4598) Paginate stage page to avoid OOM with 100,000 tasks

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224810#comment-14224810 ] Patrick Wendell commented on SPARK-4598: It is a good idea to paginate this page.

[jira] [Commented] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped

2014-11-25 Thread Gino Bustelo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224818#comment-14224818 ] Gino Bustelo commented on SPARK-2892: - Update? Socket Receiver does not stop when

[jira] [Commented] (SPARK-907) Add JSON endpoints to SparkUI

2014-11-25 Thread David McCauley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224820#comment-14224820 ] David McCauley commented on SPARK-907: -- [~aash] yes, SPARK-3644 seems to be a much

[jira] [Closed] (SPARK-907) Add JSON endpoints to SparkUI

2014-11-25 Thread David McCauley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David McCauley closed SPARK-907. Resolution: Duplicate Add JSON endpoints to SparkUI -

[jira] [Commented] (SPARK-2495) Ability to re-create ML models

2014-11-25 Thread Tamas Jambor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224894#comment-14224894 ] Tamas Jambor commented on SPARK-2495: - hi all, what's the reason

[jira] [Created] (SPARK-4604) Make MatrixFactorizationModel constructor public

2014-11-25 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-4604: Summary: Make MatrixFactorizationModel constructor public Key: SPARK-4604 URL: https://issues.apache.org/jira/browse/SPARK-4604 Project: Spark Issue Type:

[jira] [Commented] (SPARK-4604) Make MatrixFactorizationModel constructor public

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224973#comment-14224973 ] Apache Spark commented on SPARK-4604: - User 'mengxr' has created a pull request for

[jira] [Commented] (SPARK-2495) Ability to re-create ML models

2014-11-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224970#comment-14224970 ] Xiangrui Meng commented on SPARK-2495: -- I created SPARK-4604 for

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-25 Thread Tianshuo Deng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224996#comment-14224996 ] Tianshuo Deng commented on SPARK-4452: -- [~sandyr]: Thanks for the feedback! For

[jira] [Created] (SPARK-4605) Spark Kernel to enable interactive Spark applications

2014-11-25 Thread Chip Senkbeil (JIRA)
Chip Senkbeil created SPARK-4605: Summary: Spark Kernel to enable interactive Spark applications Key: SPARK-4605 URL: https://issues.apache.org/jira/browse/SPARK-4605 Project: Spark Issue

[jira] [Updated] (SPARK-4605) Spark Kernel to enable interactive Spark applications

2014-11-25 Thread Chip Senkbeil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chip Senkbeil updated SPARK-4605: - Attachment: Kernel Architecture.pdf PDF outlining the key values of the kernel and its general

[jira] [Updated] (SPARK-4605) Spark Kernel to enable interactive Spark applications

2014-11-25 Thread Chip Senkbeil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chip Senkbeil updated SPARK-4605: - Description: Enables applications to interact with a Spark cluster using Scala in several ways:

[jira] [Updated] (SPARK-4605) Spark Kernel to enable interactive Spark applications

2014-11-25 Thread Chip Senkbeil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chip Senkbeil updated SPARK-4605: - Attachment: Kernel Architecture Widescreen.pdf Kernel Architecture.pdf Spark

[jira] [Updated] (SPARK-4605) Spark Kernel to enable interactive Spark applications

2014-11-25 Thread Chip Senkbeil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chip Senkbeil updated SPARK-4605: - Attachment: (was: Kernel Architecture.pdf) Spark Kernel to enable interactive Spark

[jira] [Commented] (SPARK-2192) Examples Data Not in Binary Distribution

2014-11-25 Thread Pat McDonough (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225036#comment-14225036 ] Pat McDonough commented on SPARK-2192: -- [~srowen] - I fully support that and agree

[jira] [Comment Edited] (SPARK-2192) Examples Data Not in Binary Distribution

2014-11-25 Thread Pat McDonough (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225036#comment-14225036 ] Pat McDonough edited comment on SPARK-2192 at 11/25/14 7:11 PM:

[jira] [Commented] (SPARK-4605) Spark Kernel to enable interactive Spark applications

2014-11-25 Thread Chip Senkbeil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225058#comment-14225058 ] Chip Senkbeil commented on SPARK-4605: -- [~ilikerps], referencing you in case you did

[jira] [Commented] (SPARK-4462) flume-sink build broken in SBT

2014-11-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225115#comment-14225115 ] Michael Armbrust commented on SPARK-4462: - Yeah, it's probably been fixed. Thanks

[jira] [Created] (SPARK-4606) SparkSubmitDriverBootstrapper does not propagate EOF to child JVM

2014-11-25 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-4606: - Summary: SparkSubmitDriverBootstrapper does not propagate EOF to child JVM Key: SPARK-4606 URL: https://issues.apache.org/jira/browse/SPARK-4606 Project: Spark

[jira] [Commented] (SPARK-4606) SparkSubmitDriverBootstrapper does not propagate EOF to child JVM

2014-11-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225136#comment-14225136 ] Apache Spark commented on SPARK-4606: - User 'vanzin' has created a pull request for

[jira] [Updated] (SPARK-4605) Proposal: Spark Kernel to enable interactive Spark applications

2014-11-25 Thread Chip Senkbeil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chip Senkbeil updated SPARK-4605: - Summary: Proposal: Spark Kernel to enable interactive Spark applications (was: Spark Kernel to

[jira] [Updated] (SPARK-4605) Proposal: Spark Kernel to enable interactive Spark applications

2014-11-25 Thread Chip Senkbeil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chip Senkbeil updated SPARK-4605: - Description: This architecture is describing running code that was demonstrated at the

[jira] [Updated] (SPARK-4605) Proposed Contribution: Spark Kernel to enable interactive Spark applications

2014-11-25 Thread Chip Senkbeil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chip Senkbeil updated SPARK-4605: - Summary: Proposed Contribution: Spark Kernel to enable interactive Spark applications (was:
