[jira] [Created] (SPARK-1913) column pruning problem of Parquet File

2014-05-23 Thread Chen Chao (JIRA)
Chen Chao created SPARK-1913: Summary: column pruning problem of Parquet File Key: SPARK-1913 URL: https://issues.apache.org/jira/browse/SPARK-1913 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-1914) Simplify CountFunction not to traverse to evaluate all child expressions.

2014-05-23 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-1914: Summary: Simplify CountFunction not to traverse to evaluate all child expressions. Key: SPARK-1914 URL: https://issues.apache.org/jira/browse/SPARK-1914 Project:

[jira] [Commented] (SPARK-1913) column pruning problem of Parquet File

2014-05-23 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007026#comment-14007026 ] Cheng Lian commented on SPARK-1913: --- Attributes referenced only in those filters that

[jira] [Updated] (SPARK-1912) Compression memory issue during reduce

2014-05-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-1912: --- Summary: Compression memory issue during reduce (was: Compression memory issue during shuffle)

[jira] [Updated] (SPARK-1914) Simplify CountFunction not to traverse to evaluate all child expressions.

2014-05-23 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-1914: - Description: {{CountFunction}} should count up only if the child's evaluated value is not null.
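
The rule stated in this description (count up only when the child's evaluated value is not null) can be illustrated with a minimal sketch. This is not Spark's actual `CountFunction`, which lives in the Scala Catalyst code; the names `CountSketch` and `count_non_null`, and the use of plain dicts as rows, are hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, Iterable, Optional

Row = Dict[str, Any]  # hypothetical row representation, not Spark's Row


class CountSketch:
    """Illustrative aggregate buffer; a stand-in, not Spark's CountFunction."""

    def __init__(self, child: Callable[[Row], Optional[Any]]) -> None:
        self.child = child  # the child expression, modelled as a function of a row
        self.count = 0

    def update(self, row: Row) -> None:
        # Evaluate the child once and count only when the value is not null (None).
        if self.child(row) is not None:
            self.count += 1

    def result(self) -> int:
        return self.count


def count_non_null(rows: Iterable[Row], child: Callable[[Row], Optional[Any]]) -> int:
    """Run the sketch over all rows and return the final count."""
    agg = CountSketch(child)
    for row in rows:
        agg.update(row)
    return agg.result()
```

Under these assumptions, a row whose child evaluates to `None` leaves the counter unchanged, which mirrors the usual SQL meaning of `COUNT(expr)`.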

[jira] [Commented] (SPARK-1914) Simplify CountFunction not to traverse to evaluate all child expressions.

2014-05-23 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007033#comment-14007033 ] Takuya Ueshin commented on SPARK-1914: -- Pull-requested:

[jira] [Updated] (SPARK-1913) column pruning problem of Parquet File

2014-05-23 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-1913: -- Description: When scanning Parquet tables, attributes referenced only in predicates that are pushed

[jira] [Updated] (SPARK-1913) column pruning problem of Parquet File

2014-05-23 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-1913: -- Description: When scanning Parquet tables, attributes referenced only in predicates that are pushed

[jira] [Updated] (SPARK-1913) Column pruning for Parquet table

2014-05-23 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-1913: -- Description: When scanning Parquet tables, attributes referenced only in predicates that are pushed

[jira] [Updated] (SPARK-1913) Column pruning for Parquet table

2014-05-23 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-1913: -- Summary: Column pruning for Parquet table (was: column pruning problem of Parquet File) Column

[jira] [Commented] (SPARK-1215) Clustering: Index out of bounds error

2014-05-23 Thread Denis Serduik (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007090#comment-14007090 ] Denis Serduik commented on SPARK-1215: -- I don't think that the problem is about size

[jira] [Commented] (SPARK-1898) In deploy.yarn.Client, use YarnClient rather than YarnClientImpl

2014-05-23 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007100#comment-14007100 ] Thomas Graves commented on SPARK-1898: -- That is correct, as long as you don't modify

[jira] [Commented] (SPARK-1915) AverageFunction should not count if the evaluated value is null.

2014-05-23 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007136#comment-14007136 ] Takuya Ueshin commented on SPARK-1915: -- Pull-requested:

[jira] [Commented] (SPARK-1912) Compression memory issue during reduce

2014-05-23 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007229#comment-14007229 ] Andrew Ash commented on SPARK-1912: --- https://github.com/apache/spark/pull/860

[jira] [Commented] (SPARK-1867) Spark Documentation Error causes java.lang.IllegalStateException: unread block data

2014-05-23 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007283#comment-14007283 ] sam commented on SPARK-1867: Changing org.apache.hadoop % hadoop-client % 2.3.0-mr1-cdh5.0.0,

[jira] [Commented] (SPARK-1867) Spark Documentation Error causes java.lang.IllegalStateException: unread block data

2014-05-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007339#comment-14007339 ] Sean Owen commented on SPARK-1867: -- Yes, the 'mr1' artifacts are for when you are *not*

[jira] [Commented] (SPARK-1790) Update EC2 scripts to support r3 instance types

2014-05-23 Thread Sujeet Varakhedi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007427#comment-14007427 ] Sujeet Varakhedi commented on SPARK-1790: - I will work on this Update EC2

[jira] [Commented] (SPARK-983) Support external sorting for RDD#sortByKey()

2014-05-23 Thread Madhu Siddalingaiah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007479#comment-14007479 ] Madhu Siddalingaiah commented on SPARK-983: --- I have the beginnings of a

[jira] [Commented] (SPARK-1867) Spark Documentation Error causes java.lang.IllegalStateException: unread block data

2014-05-23 Thread Michael Malak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007565#comment-14007565 ] Michael Malak commented on SPARK-1867: -- Thank you, sam, that fixed it for me! FYI, I

[jira] [Created] (SPARK-1916) SparkFlumeEvent with body bigger than 1020 bytes are not read properly

2014-05-23 Thread David Lemieux (JIRA)
David Lemieux created SPARK-1916: Summary: SparkFlumeEvent with body bigger than 1020 bytes are not read properly Key: SPARK-1916 URL: https://issues.apache.org/jira/browse/SPARK-1916 Project: Spark

[jira] [Updated] (SPARK-1916) SparkFlumeEvent with body bigger than 1020 bytes are not read properly

2014-05-23 Thread David Lemieux (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Lemieux updated SPARK-1916: - Attachment: SPARK-1916.diff Attaching a diff for now. I'll create a pull request shortly.

[jira] [Commented] (SPARK-1902) Spark shell prints error when :4040 port already in use

2014-05-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007684#comment-14007684 ] Patrick Wendell commented on SPARK-1902: Yeah, this would be a good one to fix.

[jira] [Created] (SPARK-1917) PySpark fails to import functions from {{scipy.special}}

2014-05-23 Thread Uri Laserson (JIRA)
Uri Laserson created SPARK-1917: --- Summary: PySpark fails to import functions from {{scipy.special}} Key: SPARK-1917 URL: https://issues.apache.org/jira/browse/SPARK-1917 Project: Spark Issue

[jira] [Commented] (SPARK-1913) Parquet table column pruning error caused by filter pushdown

2014-05-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007710#comment-14007710 ] Reynold Xin commented on SPARK-1913: I added the exception. Parquet table column

[jira] [Updated] (SPARK-1913) Parquet table column pruning error caused by filter pushdown

2014-05-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-1913: --- Description: When scanning Parquet tables, attributes referenced only in predicates that are pushed

[jira] [Commented] (SPARK-1917) PySpark fails to import functions from {{scipy.special}}

2014-05-23 Thread Uri Laserson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007741#comment-14007741 ] Uri Laserson commented on SPARK-1917: - https://github.com/apache/spark/pull/866

[jira] [Commented] (SPARK-1138) Spark 0.9.0 does not work with Hadoop / HDFS

2014-05-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007837#comment-14007837 ] Reynold Xin commented on SPARK-1138: Just want to chime in that I also encountered

[jira] [Created] (SPARK-1918) PySpark shell --py-files does not work for zip files

2014-05-23 Thread Andrew Or (JIRA)
Andrew Or created SPARK-1918: Summary: PySpark shell --py-files does not work for zip files Key: SPARK-1918 URL: https://issues.apache.org/jira/browse/SPARK-1918 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-1918) PySpark shell --py-files does not work for zip files

2014-05-23 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007874#comment-14007874 ] Andrew Or commented on SPARK-1918: -- https://github.com/apache/spark/pull/853 PySpark

[jira] [Created] (SPARK-1919) In Windows, Spark shell cannot load classes in spark.jars (--jars)

2014-05-23 Thread Andrew Or (JIRA)
Andrew Or created SPARK-1919: Summary: In Windows, Spark shell cannot load classes in spark.jars (--jars) Key: SPARK-1919 URL: https://issues.apache.org/jira/browse/SPARK-1919 Project: Spark