[jira] [Resolved] (SPARK-2137) Timestamp UDFs broken

2014-06-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-2137. Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 > Timestamp UDFs broken > --

[jira] [Updated] (SPARK-1487) Support record filtering via predicate pushdown in Parquet

2014-06-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-1487: --- Fix Version/s: 1.0.1 > Support record filtering via predicate pushdown in Parquet > -

[jira] [Updated] (SPARK-1913) Parquet table column pruning error caused by filter pushdown

2014-06-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-1913: --- Fix Version/s: 1.1.0 1.0.1 > Parquet table column pruning error caused by filter p

[jira] [Resolved] (SPARK-1487) Support record filtering via predicate pushdown in Parquet

2014-06-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-1487. Resolution: Fixed > Support record filtering via predicate pushdown in Parquet > --

[jira] [Reopened] (SPARK-1487) Support record filtering via predicate pushdown in Parquet

2014-06-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reopened SPARK-1487: > Support record filtering via predicate pushdown in Parquet >

[jira] [Commented] (SPARK-1201) Do not materialize partitions whenever possible in BlockManager

2014-06-13 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031431#comment-14031431 ] Andrew Or commented on SPARK-1201: -- It will mainly be a bug fix that doesn't change the A

[jira] [Commented] (SPARK-1201) Do not materialize partitions whenever possible in BlockManager

2014-06-13 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031426#comment-14031426 ] Mark Hamstra commented on SPARK-1201: - Okay, but my question is really whether resolut

[jira] [Commented] (SPARK-1201) Do not materialize partitions whenever possible in BlockManager

2014-06-13 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031425#comment-14031425 ] Andrew Or commented on SPARK-1201: -- Depends on when we release 1.0.1. I am actually worki

[jira] [Commented] (SPARK-2141) Add sc.getPersistentRDDs() to PySpark

2014-06-13 Thread Kan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031356#comment-14031356 ] Kan Zhang commented on SPARK-2141: -- https://github.com/apache/spark/pull/1082 > Add sc.g

[jira] [Commented] (SPARK-1906) spark-submit doesn't send master URL to Driver in standalone cluster mode

2014-06-13 Thread Jacob Eisinger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031336#comment-14031336 ] Jacob Eisinger commented on SPARK-1906: --- All of the spark-defaults.conf and command

[jira] [Updated] (SPARK-2137) Timestamp UDFs broken

2014-06-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2137: Assignee: Yin Huai > Timestamp UDFs broken > - > > Key:

[jira] [Updated] (SPARK-2119) Reading Parquet InputSplits dominates query execution time when reading off S3

2014-06-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2119: Assignee: Cheng Lian > Reading Parquet InputSplits dominates query execution time when read

[jira] [Commented] (SPARK-1112) When spark.akka.frameSize > 10, task results bigger than 10MiB block execution

2014-06-13 Thread Chen Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031307#comment-14031307 ] Chen Jin commented on SPARK-1112: - [~matei] Do you know which akka version we should use t

[jira] [Commented] (SPARK-1201) Do not materialize partitions whenever possible in BlockManager

2014-06-13 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031257#comment-14031257 ] Mark Hamstra commented on SPARK-1201: - What causes this to not be fixable within the s

[jira] [Comment Edited] (SPARK-2060) Querying JSON Datasets with SQL and DSL in Spark SQL

2014-06-13 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030989#comment-14030989 ] Yin Huai edited comment on SPARK-2060 at 6/13/14 10:15 PM: --- Prog

[jira] [Created] (SPARK-2144) SparkUI Executors tab displays incorrect RDD blocks

2014-06-13 Thread Andrew Or (JIRA)
Andrew Or created SPARK-2144: Summary: SparkUI Executors tab displays incorrect RDD blocks Key: SPARK-2144 URL: https://issues.apache.org/jira/browse/SPARK-2144 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2143) Display Spark version on Driver web page

2014-06-13 Thread Jeff Hammerbacher (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031204#comment-14031204 ] Jeff Hammerbacher commented on SPARK-2143: -- Perhaps under the "Environment" tab?

[jira] [Created] (SPARK-2143) Display Spark version on Driver web page

2014-06-13 Thread Jeff Hammerbacher (JIRA)
Jeff Hammerbacher created SPARK-2143: Summary: Display Spark version on Driver web page Key: SPARK-2143 URL: https://issues.apache.org/jira/browse/SPARK-2143 Project: Spark Issue Type: Im

[jira] [Updated] (SPARK-2081) Undefine output() from the abstract class Command and implement it in concrete subclasses

2014-06-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2081: Fix Version/s: 1.1.0 1.0.1 > Undefine output() from the abstract class C

[jira] [Created] (SPARK-2142) Give better indicator of how GC cuts into task time

2014-06-13 Thread Sandy Ryza (JIRA)
Sandy Ryza created SPARK-2142: - Summary: Give better indicator of how GC cuts into task time Key: SPARK-2142 URL: https://issues.apache.org/jira/browse/SPARK-2142 Project: Spark Issue Type: Impro

[jira] [Updated] (SPARK-1201) Do not materialize partitions whenever possible in BlockManager

2014-06-13 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-1201: - Description: This is a slightly more complex version of SPARK-942 where we try to avoid unrolling iterato

[jira] [Resolved] (SPARK-2128) No plan for DESCRIBE

2014-06-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2128. - Resolution: Fixed https://github.com/apache/spark/pull/1071 > No plan for DESCRIBE > ---

[jira] [Updated] (SPARK-2128) No plan for DESCRIBE

2014-06-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2128: Fix Version/s: 1.1.0 1.0.1 > No plan for DESCRIBE >

[jira] [Resolved] (SPARK-2094) Ensure exactly once semantics for DDL / Commands

2014-06-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2094. - Resolution: Fixed https://github.com/apache/spark/pull/1071 > Ensure exactly once semant

[jira] [Resolved] (SPARK-2081) Undefine output() from the abstract class Command and implement it in concrete subclasses

2014-06-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2081. - Resolution: Fixed https://github.com/apache/spark/pull/1071 > Undefine output() from the

[jira] [Resolved] (SPARK-1852) SparkSQL Queries with Sorts run before the user asks them to

2014-06-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-1852. - Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 https://github.com

[jira] [Commented] (SPARK-1730) Make receiver store data reliably to avoid data-loss on executor failures

2014-06-13 Thread Hari Shreedharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031011#comment-14031011 ] Hari Shreedharan commented on SPARK-1730: - I am starting work on this one. I have

[jira] [Commented] (SPARK-2060) Querying JSON Datasets with SQL and DSL in Spark SQL

2014-06-13 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030989#comment-14030989 ] Yin Huai commented on SPARK-2060: - Programming guide: http://yhuai.github.io/site/sql-prog

[jira] [Commented] (SPARK-1782) svd for sparse matrix using ARPACK

2014-06-13 Thread Li Pu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030954#comment-14030954 ] Li Pu commented on SPARK-1782: -- PR: https://github.com/apache/spark/pull/964 > svd for spars

[jira] [Resolved] (SPARK-1097) ConcurrentModificationException

2014-06-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-1097. Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 Assignee: Nishkam Ra

[jira] [Comment Edited] (SPARK-1951) spark on yarn can't start

2014-06-13 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030900#comment-14030900 ] Thomas Graves edited comment on SPARK-1951 at 6/13/14 5:55 PM: -

[jira] [Resolved] (SPARK-1951) spark on yarn can't start

2014-06-13 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-1951. -- Resolution: Invalid As pointed out by Sean, spark-submit changed the default scheme to be file:/

[jira] [Updated] (SPARK-2141) Add sc.getPersistentRDDs() to PySpark

2014-06-13 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-2141: Description: PySpark does not appear to have {{sc.getPersistentRDDs()}}. (was: PySpark doe

[jira] [Created] (SPARK-2141) Add sc.getPersistentRDDs() to PySpark

2014-06-13 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-2141: --- Summary: Add sc.getPersistentRDDs() to PySpark Key: SPARK-2141 URL: https://issues.apache.org/jira/browse/SPARK-2141 Project: Spark Issue Type: New Fea

[jira] [Created] (SPARK-2140) yarn stable client doesn't properly handle MEMORY_OVERHEAD for AM

2014-06-13 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-2140: Summary: yarn stable client doesn't properly handle MEMORY_OVERHEAD for AM Key: SPARK-2140 URL: https://issues.apache.org/jira/browse/SPARK-2140 Project: Spark

[jira] [Updated] (SPARK-2140) yarn stable client doesn't properly handle MEMORY_OVERHEAD for AM

2014-06-13 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-2140: - Fix Version/s: 1.1.0 1.0.1 > yarn stable client doesn't properly handle MEMORY

[jira] [Resolved] (SPARK-2139) spark.yarn.dist.* configs are not documented

2014-06-13 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-2139. -- Resolution: Duplicate > spark.yarn.dist.* configs are not documented >

[jira] [Commented] (SPARK-2051) spark.yarn.dist.* configs are not supported in yarn-cluster mode

2014-06-13 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030612#comment-14030612 ] Thomas Graves commented on SPARK-2051: -- Note that the configs weren't originally inte

[jira] [Updated] (SPARK-2051) spark.yarn.dist.* configs are not supported in yarn-cluster mode

2014-06-13 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-2051: - Summary: spark.yarn.dist.* configs are not supported in yarn-cluster mode (was: In yarn.ClientBa

[jira] [Updated] (SPARK-2051) spark.yarn.dist.* configs are not supported in yarn-cluster mode

2014-06-13 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-2051: - Issue Type: Improvement (was: Bug) > spark.yarn.dist.* configs are not supported in yarn-cluster

[jira] [Created] (SPARK-2139) spark.yarn.dist.* configs are not documented

2014-06-13 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-2139: Summary: spark.yarn.dist.* configs are not documented Key: SPARK-2139 URL: https://issues.apache.org/jira/browse/SPARK-2139 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-2138: --- Description: When the algorithm running at certain stage, when running the reduceBykey() function, It can le

[jira] [Updated] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-2138: --- Description: When the algorithm running at certain stage, when running the reduceBykey() algorithm, It can l

[jira] [Created] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-06-13 Thread DjvuLee (JIRA)
DjvuLee created SPARK-2138: -- Summary: The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger Key: SPARK-2138 URL: https://issues.apache.org/jira/browse/SPARK-2138 Pro

[jira] [Commented] (SPARK-2134) Report metrics before application finishes

2014-06-13 Thread Rahul Singhal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030541#comment-14030541 ] Rahul Singhal commented on SPARK-2134: -- PR: https://github.com/apache/spark/pull/1076

[jira] [Commented] (SPARK-2133) FileNotFoundException in BlockObjectWriter

2014-06-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030480#comment-14030480 ] Wenchen Fan commented on SPARK-2133: The exception happened when Spark wants to write a

[jira] [Created] (SPARK-2137) Timestamp UDFs broken

2014-06-13 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-2137: --- Summary: Timestamp UDFs broken Key: SPARK-2137 URL: https://issues.apache.org/jira/browse/SPARK-2137 Project: Spark Issue Type: Bug Component

[jira] [Updated] (SPARK-1782) svd for sparse matrix using ARPACK

2014-06-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1782: - Assignee: (was: Xiangrui Meng) > svd for sparse matrix using ARPACK > ---

[jira] [Assigned] (SPARK-1782) svd for sparse matrix using ARPACK

2014-06-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-1782: Assignee: Xiangrui Meng > svd for sparse matrix using ARPACK >