[jira] [Commented] (SPARK-3019) Pluggable block transfer (data plane communication) interface

2014-08-17 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099888#comment-14099888 ] Sandy Ryza commented on SPARK-3019: --- I agree that it's not typically a problem, but I

[jira] [Resolved] (SPARK-3042) DecisionTree filtering is very inefficient

2014-08-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3042. -- Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1975

[jira] [Commented] (SPARK-3019) Pluggable block transfer (data plane communication) interface

2014-08-17 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099909#comment-14099909 ] Mridul Muralidharan commented on SPARK-3019: I am yet to go through the

[jira] [Commented] (SPARK-3019) Pluggable block transfer (data plane communication) interface

2014-08-17 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099910#comment-14099910 ] Mridul Muralidharan commented on SPARK-3019: Btw, can we do something about

[jira] [Created] (SPARK-3088) fix test noisy message

2014-08-17 Thread wangfei (JIRA)
wangfei created SPARK-3088: -- Summary: fix test noisy message Key: SPARK-3088 URL: https://issues.apache.org/jira/browse/SPARK-3088 Project: Spark Issue Type: Improvement Components: Build

[jira] [Updated] (SPARK-3088) noisy messages when run tests

2014-08-17 Thread wangfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangfei updated SPARK-3088: --- Summary: noisy messages when run tests (was: fix test noisy message) noisy messages when run tests

[jira] [Commented] (SPARK-3088) noisy messages when run tests

2014-08-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099922#comment-14099922 ] Apache Spark commented on SPARK-3088: - User 'scwf' has created a pull request for this

[jira] [Updated] (SPARK-732) Recomputation of RDDs may result in duplicated accumulator updates

2014-08-17 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-732: -- Priority: Blocker (was: Major) Recomputation of RDDs may result in duplicated accumulator updates

[jira] [Updated] (SPARK-732) Recomputation of RDDs may result in duplicated accumulator updates

2014-08-17 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-732: -- Component/s: Spark Core Recomputation of RDDs may result in duplicated accumulator updates

[jira] [Updated] (SPARK-732) Recomputation of RDDs may result in duplicated accumulator updates

2014-08-17 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-732: -- Affects Version/s: 1.1.0 1.0.1 Recomputation of RDDs may result in duplicated

[jira] [Commented] (SPARK-3019) Pluggable block transfer (data plane communication) interface

2014-08-17 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099983#comment-14099983 ] Sandy Ryza commented on SPARK-3019: --- Thanks for the info Mridul. A few extra

[jira] [Commented] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp

2014-08-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1406#comment-1406 ] Apache Spark commented on SPARK-2881: - User 'pwendell' has created a pull request for

[jira] [Updated] (SPARK-2608) scheduler backend create executor launch command not correctly

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2608: --- Fix Version/s: (was: 1.1.0) scheduler backend create executor launch command not

[jira] [Updated] (SPARK-2608) scheduler backend create executor launch command not correctly

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2608: --- Component/s: (was: Spark Core) Mesos scheduler backend create executor

[jira] [Updated] (SPARK-2608) scheduler backend create executor launch command not correctly

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2608: --- Target Version/s: 1.1.0 scheduler backend create executor launch command not correctly

[jira] [Updated] (SPARK-2608) Mesos scheduler backend create executor launch command not correctly

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2608: --- Summary: Mesos scheduler backend create executor launch command not correctly (was:

[jira] [Created] (SPARK-3089) Make error message in ConnectionManager more meaningful

2014-08-17 Thread Kousuke Saruta (JIRA)
Kousuke Saruta created SPARK-3089: - Summary: Make error message in ConnectionManager more meaningful Key: SPARK-3089 URL: https://issues.apache.org/jira/browse/SPARK-3089 Project: Spark

[jira] [Commented] (SPARK-3089) Make error message in ConnectionManager more meaningful

2014-08-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100024#comment-14100024 ] Apache Spark commented on SPARK-3089: - User 'sarutak' has created a pull request for

[jira] [Created] (SPARK-3090) Add shutdown hook to stop SparkContext for YARN Client mode

2014-08-17 Thread Kousuke Saruta (JIRA)
Kousuke Saruta created SPARK-3090: - Summary: Add shutdown hook to stop SparkContext for YARN Client mode Key: SPARK-3090 URL: https://issues.apache.org/jira/browse/SPARK-3090 Project: Spark

[jira] [Commented] (SPARK-3090) Add shutdown hook to stop SparkContext for YARN Client mode

2014-08-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100075#comment-14100075 ] Apache Spark commented on SPARK-3090: - User 'sarutak' has created a pull request for

[jira] [Updated] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2881: --- Target Version/s: 1.2.0 (was: 1.1.0) Snappy is now default codec - could lead to conflicts

[jira] [Updated] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2881: --- Fix Version/s: 1.1.0 Snappy is now default codec - could lead to conflicts since uses /tmp

[jira] [Commented] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100164#comment-14100164 ] Patrick Wendell commented on SPARK-2881: Okay I've merged a change in branch-1.1

[jira] [Created] (SPARK-3091) Add support for caching metadata on Parquet files

2014-08-17 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-3091: Summary: Add support for caching metadata on Parquet files Key: SPARK-3091 URL: https://issues.apache.org/jira/browse/SPARK-3091 Project: Spark Issue Type:

[jira] [Updated] (SPARK-3085) Use compact data structures in SQL joins

2014-08-17 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3085: - Target Version/s: 1.1.0 Use compact data structures in SQL joins

[jira] [Commented] (SPARK-3091) Add support for caching metadata on Parquet files

2014-08-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100181#comment-14100181 ] Apache Spark commented on SPARK-3091: - User 'mateiz' has created a pull request for

[jira] [Updated] (SPARK-3092) Always include the thriftserver when -Phive is enabled.

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3092: --- Target Version/s: 1.1.0 Always include the thriftserver when -Phive is enabled.

[jira] [Created] (SPARK-3092) Always include the thriftserver when -Phive is enabled.

2014-08-17 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-3092: -- Summary: Always include the thriftserver when -Phive is enabled. Key: SPARK-3092 URL: https://issues.apache.org/jira/browse/SPARK-3092 Project: Spark

[jira] [Updated] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2881: --- Fix Version/s: 1.2.0 Snappy is now default codec - could lead to conflicts since uses /tmp

[jira] [Resolved] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2881. Resolution: Fixed Snappy is now default codec - could lead to conflicts since uses /tmp

[jira] [Updated] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2881: --- Target Version/s: (was: 1.2.0) Snappy is now default codec - could lead to conflicts

[jira] [Commented] (SPARK-2881) Snappy is now default codec - could lead to conflicts since uses /tmp

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100185#comment-14100185 ] Patrick Wendell commented on SPARK-2881: Fixed in master branch via:

[jira] [Updated] (SPARK-2653) Heap size should be the sum of driver.memory and executor.memory in local mode

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2653: --- Priority: Minor (was: Major) Heap size should be the sum of driver.memory and

[jira] [Updated] (SPARK-2365) Add IndexedRDD, an efficient updatable key-value store

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2365: --- Target Version/s: 1.2.0 (was: 1.1.0) Add IndexedRDD, an efficient updatable key-value

[jira] [Updated] (SPARK-2653) Heap size should be the sum of driver.memory and executor.memory in local mode

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2653: --- Target Version/s: 1.2.0 (was: 1.1.0) Heap size should be the sum of driver.memory and

[jira] [Updated] (SPARK-2672) support compressed file in wholeFile()

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2672: --- Target Version/s: 1.2.0 (was: 1.1.0) support compressed file in wholeFile()

[jira] [Commented] (SPARK-2371) Show locally-running tasks (e.g. from take()) in web UI

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100186#comment-14100186 ] Patrick Wendell commented on SPARK-2371: SPARK-3029 disables local execution for

[jira] [Updated] (SPARK-2371) Show locally-running tasks (e.g. from take()) in web UI

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2371: --- Target Version/s: 1.2.0 (was: 1.1.0) Show locally-running tasks (e.g. from take()) in web

[jira] [Updated] (SPARK-3091) Add support for caching metadata on Parquet files

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3091: --- Priority: Blocker (was: Major) Add support for caching metadata on Parquet files

[jira] [Commented] (SPARK-3092) Always include the thriftserver when -Phive is enabled.

2014-08-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100196#comment-14100196 ] Apache Spark commented on SPARK-3092: - User 'pwendell' has created a pull request for

[jira] [Commented] (SPARK-2970) spark-sql script ends with IOException when EventLogging is enabled

2014-08-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100205#comment-14100205 ] Apache Spark commented on SPARK-2970: - User 'marmbrus' has created a pull request for

[jira] [Reopened] (SPARK-2970) spark-sql script ends with IOException when EventLogging is enabled

2014-08-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reopened SPARK-2970: - spark-sql script ends with IOException when EventLogging is enabled

[jira] [Created] (SPARK-3093) masterLock in Worker is no longer need

2014-08-17 Thread Chen Chao (JIRA)
Chen Chao created SPARK-3093: Summary: masterLock in Worker is no longer need Key: SPARK-3093 URL: https://issues.apache.org/jira/browse/SPARK-3093 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-3093) masterLock in Worker is no longer need

2014-08-17 Thread Chen Chao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100230#comment-14100230 ] Chen Chao commented on SPARK-3093: -- PR available at

[jira] [Commented] (SPARK-3093) masterLock in Worker is no longer need

2014-08-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100229#comment-14100229 ] Apache Spark commented on SPARK-3093: - User 'CrazyJvm' has created a pull request for

[jira] [Commented] (SPARK-975) Spark Replay Debugger

2014-08-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100239#comment-14100239 ] Cheng Lian commented on SPARK-975: -- Usually we just filter them out by checking

[jira] [Resolved] (SPARK-3008) PySpark fails due to zipimport not able to load the assembly jar (/usr/bin/python: No module named pyspark)

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3008. Resolution: Duplicate PySpark fails due to zipimport not able to load the assembly jar

[jira] [Commented] (SPARK-3008) PySpark fails due to zipimport not able to load the assembly jar (/usr/bin/python: No module named pyspark)

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100248#comment-14100248 ] Patrick Wendell commented on SPARK-3008: I believe this is a known issue - you

[jira] [Resolved] (SPARK-3087) ChiSqTest only stores results in the first 100 columns.

2014-08-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3087. -- Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1997

[jira] [Resolved] (SPARK-2900) inputBytes aren't aggregated for stages like other task metrics

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2900. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1826

[jira] [Updated] (SPARK-2900) inputBytes aren't aggregated for stages like other task metrics

2014-08-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2900: --- Assignee: Sandy Ryza inputBytes aren't aggregated for stages like other task metrics

[jira] [Created] (SPARK-3095) [PySpark] Speed up RDD.count()

2014-08-17 Thread Davies Liu (JIRA)
Davies Liu created SPARK-3095: - Summary: [PySpark] Speed up RDD.count() Key: SPARK-3095 URL: https://issues.apache.org/jira/browse/SPARK-3095 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-1310) Add support for cross validation to MLlib

2014-08-17 Thread zhengbing li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100306#comment-14100306 ] zhengbing li commented on SPARK-1310: - can you provide a link to your solution? Add