[jira] [Updated] (SPARK-3965) Spark assembly for hadoop2 contains avro-mapred for hadoop1

2014-11-17 Thread David Jacot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Jacot updated SPARK-3965: --- Affects Version/s: 1.2.0 > Spark assembly for hadoop2 contains avro-mapred for hadoop1 > -

[jira] [Commented] (SPARK-3717) DecisionTree, RandomForest: Partition by feature

2014-11-17 Thread SUMANTH B B N (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215800#comment-14215800 ] SUMANTH B B N commented on SPARK-3717: -- [~josephkb] [~manishamde] [~codedeft] await

[jira] [Commented] (SPARK-4468) Wrong Parquet filters are created for all inequality predicates with literals on the left hand side

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215792#comment-14215792 ] Apache Spark commented on SPARK-4468: - User 'liancheng' has created a pull request for

[jira] [Commented] (SPARK-3337) Paranoid quoting in shell to allow install dirs with spaces within.

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215776#comment-14215776 ] Andrew Or commented on SPARK-3337: -- Hm [~pwendell] what do you think? > Paranoid quoting

[jira] [Issue Comment Deleted] (SPARK-3337) Paranoid quoting in shell to allow install dirs with spaces within.

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3337: - Comment: was deleted (was: Hm [~pwend...@gmail.com] what do you think?) > Paranoid quoting in shell to al

[jira] [Commented] (SPARK-3337) Paranoid quoting in shell to allow install dirs with spaces within.

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215775#comment-14215775 ] Andrew Or commented on SPARK-3337: -- Hm [~pwend...@gmail.com] what do you think? > Parano

[jira] [Commented] (SPARK-4470) SparkContext accepts local[0] as a master URL

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215741#comment-14215741 ] Apache Spark commented on SPARK-4470: - User 'kmaehashi' has created a pull request for

[jira] [Created] (SPARK-4470) SparkContext accepts local[0] as a master URL

2014-11-17 Thread Kenichi Maehashi (JIRA)
Kenichi Maehashi created SPARK-4470: --- Summary: SparkContext accepts local[0] as a master URL Key: SPARK-4470 URL: https://issues.apache.org/jira/browse/SPARK-4470 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-4466) Provide support for publishing Scala 2.11 artifacts to Maven

2014-11-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4466. Resolution: Fixed Fix Version/s: 1.2.0 > Provide support for publishing Scala 2.11 ar

[jira] [Commented] (SPARK-4469) Move the SemanticAnalyzer from Physical Execution to Analysis

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215692#comment-14215692 ] Apache Spark commented on SPARK-4469: - User 'chenghao-intel' has created a pull reques

[jira] [Created] (SPARK-4469) Move the SemanticAnalyzer from Physical Execution to Analysis

2014-11-17 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4469: Summary: Move the SemanticAnalyzer from Physical Execution to Analysis Key: SPARK-4469 URL: https://issues.apache.org/jira/browse/SPARK-4469 Project: Spark Issue Ty

[jira] [Commented] (SPARK-4468) Wrong Parquet filters are created for all inequality predicates with literals on the left hand side

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215662#comment-14215662 ] Apache Spark commented on SPARK-4468: - User 'liancheng' has created a pull request for

[jira] [Created] (SPARK-4468) Wrong Parquet filters are created for all inequality predicates with literals on the left hand side

2014-11-17 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-4468: - Summary: Wrong Parquet filters are created for all inequality predicates with literals on the left hand side Key: SPARK-4468 URL: https://issues.apache.org/jira/browse/SPARK-4468

[jira] [Commented] (SPARK-4075) Jar url validation is not enough for Jar file

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215648#comment-14215648 ] Apache Spark commented on SPARK-4075: - User 'sarutak' has created a pull request for t

[jira] [Commented] (SPARK-2883) Spark Support for ORCFile format

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215643#comment-14215643 ] Apache Spark commented on SPARK-2883: - User 'scwf' has created a pull request for this

[jira] [Commented] (SPARK-3337) Paranoid quoting in shell to allow install dirs with spaces within.

2014-11-17 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215640#comment-14215640 ] Shivaram Venkataraman commented on SPARK-3337: -- [~andrewor14] can we pull thi

[jira] [Updated] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4452: - Description: When an Aggregator is used with ExternalSorter in a task, Spark will create many small files

[jira] [Updated] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4452: - Description: When an Aggregator is used with ExternalSorter in a task, Spark will create many small files

[jira] [Updated] (SPARK-4467) Number of elements read is never reset in ExternalSorter

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4467: - Fix Version/s: 1.1.1 > Number of elements read is never reset in ExternalSorter >

[jira] [Commented] (SPARK-4467) Number of elements read is never reset in ExternalSorter

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215620#comment-14215620 ] Apache Spark commented on SPARK-4467: - User 'andrewor14' has created a pull request fo

[jira] [Commented] (SPARK-4467) Number of elements read is never reset in ExternalSorter

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215619#comment-14215619 ] Andrew Or commented on SPARK-4467: -- For master and branch-1.2: https://github.com/apache/

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215617#comment-14215617 ] Andrew Or commented on SPARK-4452: -- I have created SPARK-4467 for the `elementsRead` bug

[jira] [Updated] (SPARK-4467) Number of elements read is never reset in ExternalSorter

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4467: - Summary: Number of elements read is never reset in ExternalSorter (was: Number of elements written is nev

[jira] [Updated] (SPARK-4286) Support External Shuffle Service with Mesos integration

2014-11-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4286: --- Assignee: Timothy Chen > Support External Shuffle Service with Mesos integration > ---

[jira] [Commented] (SPARK-4213) SparkSQL - ParquetFilters - No support for LT, LTE, GT, GTE operators

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215616#comment-14215616 ] Apache Spark commented on SPARK-4213: - User 'sarutak' has created a pull request for t

[jira] [Commented] (SPARK-4453) Simplify Parquet record filter generation

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215615#comment-14215615 ] Apache Spark commented on SPARK-4453: - User 'sarutak' has created a pull request for t

[jira] [Created] (SPARK-4467) Number of elements written is never reset in ExternalSorter

2014-11-17 Thread Andrew Or (JIRA)
Andrew Or created SPARK-4467: Summary: Number of elements written is never reset in ExternalSorter Key: SPARK-4467 URL: https://issues.apache.org/jira/browse/SPARK-4467 Project: Spark Issue Type

[jira] [Commented] (SPARK-4466) Provide support for publishing Scala 2.11 artifacts to Maven

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215595#comment-14215595 ] Apache Spark commented on SPARK-4466: - User 'pwendell' has created a pull request for

[jira] [Updated] (SPARK-4466) Provide support for publishing Scala 2.11 artifacts to Maven

2014-11-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4466: --- Assignee: Patrick Wendell > Provide support for publishing Scala 2.11 artifacts to Maven > ---

[jira] [Created] (SPARK-4466) Provide support for publishing Scala 2.11 artifacts to Maven

2014-11-17 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-4466: -- Summary: Provide support for publishing Scala 2.11 artifacts to Maven Key: SPARK-4466 URL: https://issues.apache.org/jira/browse/SPARK-4466 Project: Spark

[jira] [Commented] (SPARK-4463) Add (de)select all button for additional metrics in webUI

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215577#comment-14215577 ] Apache Spark commented on SPARK-4463: - User 'kayousterhout' has created a pull request

[jira] [Updated] (SPARK-4286) Support External Shuffle Service with Mesos integration

2014-11-17 Thread Timothy Chen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Chen updated SPARK-4286: Description: With the new external shuffle service added, we need to also make the Mesos integratio

[jira] [Created] (SPARK-4465) runAsSparkUser doesn't affect TaskRunner in Mesos environment at all.

2014-11-17 Thread Jongyoul Lee (JIRA)
Jongyoul Lee created SPARK-4465: --- Summary: runAsSparkUser doesn't affect TaskRunner in Mesos environment at all. Key: SPARK-4465 URL: https://issues.apache.org/jira/browse/SPARK-4465 Project: Spark

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215557#comment-14215557 ] Matei Zaharia commented on SPARK-4452: -- BTW we may also want to create a separate JIR

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215556#comment-14215556 ] Matei Zaharia commented on SPARK-4452: -- Got it. It would be fine to do this if you fo

[jira] [Resolved] (SPARK-4453) Simplify Parquet record filter generation

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4453. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3317 [https:/

[jira] [Comment Edited] (SPARK-4127) Streaming Linear Regression- Python bindings

2014-11-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215489#comment-14215489 ] Xiangrui Meng edited comment on SPARK-4127 at 11/18/14 12:38 AM: ---

[jira] [Updated] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4452: - Assignee: Tianshuo Deng > Shuffle data structures can starve others on the same thread for memory > -

[jira] [Resolved] (SPARK-4448) Support ConstantObjectInspector for unwrapping data

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4448. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3308 [https:/

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215491#comment-14215491 ] Apache Spark commented on SPARK-4452: - User 'andrewor14' has created a pull request fo

[jira] [Commented] (SPARK-4127) Streaming Linear Regression- Python bindings

2014-11-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215489#comment-14215489 ] Xiangrui Meng commented on SPARK-4127: -- [~slcclimber] I think you need to call `_conv

[jira] [Resolved] (SPARK-4443) Statistics bug for external table in spark sql hive

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4443. - Resolution: Fixed Issue resolved by pull request 3304 [https://github.com/apache/spark/pul

[jira] [Commented] (SPARK-4464) Description about configuration options need to be modified in docs.

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215487#comment-14215487 ] Apache Spark commented on SPARK-4464: - User 'tsudukim' has created a pull request for

[jira] [Resolved] (SPARK-4425) Handle NaN or Infinity cast to Timestamp correctly

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4425. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3283 [https:/

[jira] [Resolved] (SPARK-4420) Change nullability of Cast from DoubleType/FloatType to DecimalType.

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4420. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3278 [https:/

[jira] [Commented] (SPARK-4288) Add Sparse Autoencoder algorithm to MLlib

2014-11-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215479#comment-14215479 ] Xiangrui Meng commented on SPARK-4288: -- The implementation of neural network requires

[jira] [Updated] (SPARK-4405) Matrices.* construction methods should check for rows x cols overflow

2014-11-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4405: - Target Version/s: 1.2.0 > Matrices.* construction methods should check for rows x cols overflow >

[jira] [Created] (SPARK-4464) Description about configuration options need to be modified in docs.

2014-11-17 Thread Masayoshi TSUZUKI (JIRA)
Masayoshi TSUZUKI created SPARK-4464: Summary: Description about configuration options need to be modified in docs. Key: SPARK-4464 URL: https://issues.apache.org/jira/browse/SPARK-4464 Project: S

[jira] [Updated] (SPARK-4431) Implement efficient activeIterator for dense and sparse vector

2014-11-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4431: - Target Version/s: 1.2.0 > Implement efficient activeIterator for dense and sparse vector > ---

[jira] [Updated] (SPARK-4431) Implement efficient activeIterator for dense and sparse vector

2014-11-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4431: - Assignee: DB Tsai > Implement efficient activeIterator for dense and sparse vector > -

[jira] [Updated] (SPARK-4406) SVD should check for k < 1

2014-11-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4406: - Target Version/s: 1.2.0 > SVD should check for k < 1 > -- > >

[jira] [Updated] (SPARK-4306) LogisticRegressionWithLBFGS support for PySpark MLlib

2014-11-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4306: - Assignee: Davies Liu (was: Varadharajan) > LogisticRegressionWithLBFGS support for PySpark MLlib

[jira] [Updated] (SPARK-4439) Expose RandomForest in Python

2014-11-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4439: - Assignee: Davies Liu > Expose RandomForest in Python > - > >

[jira] [Updated] (SPARK-4435) Add setThreshold in Python LogisticRegressionModel and SVMModel

2014-11-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4435: - Assignee: Davies Liu > Add setThreshold in Python LogisticRegressionModel and SVMModel > -

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Tianshuo Deng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215449#comment-14215449 ] Tianshuo Deng commented on SPARK-4452: -- [~matei]: You are right, it does add more com

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215448#comment-14215448 ] Sandy Ryza commented on SPARK-4452: --- Ah, true. > Shuffle data structures can starve oth

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215446#comment-14215446 ] Andrew Or commented on SPARK-4452: -- [~sandyr] hash-based shuffle can still use two Extern

[jira] [Updated] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4452: - Target Version/s: 1.1.1, 1.2.0 > Shuffle data structures can starve others on the same thread for memory

[jira] [Updated] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4452: - Priority: Blocker (was: Major) > Shuffle data structures can starve others on the same thread for memory

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215436#comment-14215436 ] Sandy Ryza commented on SPARK-4452: --- [~andrewor14], IIUC, (2) shouldn't happen in hash-b

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Tianshuo Deng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215434#comment-14215434 ] Tianshuo Deng commented on SPARK-4452: -- Hi, [~andrewor14], Yeah exactly. Actually thi

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215427#comment-14215427 ] Andrew Or commented on SPARK-4452: -- I see, in other words, there are two separate issues

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215425#comment-14215425 ] Matei Zaharia commented on SPARK-4452: -- How much of this gets fixed if you fix the el

[jira] [Commented] (SPARK-3633) Fetches failure observed after SPARK-2711

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215403#comment-14215403 ] Andrew Or commented on SPARK-3633: -- Hey [~nravi] [~arahuja] were you using sort-based or

[jira] [Commented] (SPARK-3633) Fetches failure observed after SPARK-2711

2014-11-17 Thread Arun Ahuja (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215417#comment-14215417 ] Arun Ahuja commented on SPARK-3633: --- [~andrewor14] We were using Hash-Based shuffle when

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Tianshuo Deng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215418#comment-14215418 ] Tianshuo Deng commented on SPARK-4452: -- Hi, [~andrewor14]: The elementsRead bug that

[jira] [Commented] (SPARK-3633) Fetches failure observed after SPARK-2711

2014-11-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215415#comment-14215415 ] Marcelo Vanzin commented on SPARK-3633: --- Nishkam was using hash-based shuffle (defau

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Tianshuo Deng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215411#comment-14215411 ] Tianshuo Deng commented on SPARK-4452: -- Hi, [~andrewor14]: Actually hash-based shuffl

[jira] [Closed] (SPARK-4460) RandomForest classification uses wrong threshold

2014-11-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-4460. Resolution: Invalid Realized this was invalid. Current implementation is fine, except for c

[jira] [Created] (SPARK-4463) Add (de)select all button for additional metrics in webUI

2014-11-17 Thread Kay Ousterhout (JIRA)
Kay Ousterhout created SPARK-4463: - Summary: Add (de)select all button for additional metrics in webUI Key: SPARK-4463 URL: https://issues.apache.org/jira/browse/SPARK-4463 Project: Spark Iss

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215395#comment-14215395 ] Andrew Or commented on SPARK-4452: -- Hey [~tianshuo] do you see this issue only for sort-b

[jira] [Commented] (SPARK-4266) Avoid expensive JavaScript for StagePages with huge numbers of tasks

2014-11-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215391#comment-14215391 ] Apache Spark commented on SPARK-4266: - User 'kayousterhout' has created a pull request

[jira] [Updated] (SPARK-2178) createSchemaRDD is not thread safe

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2178: Target Version/s: 1.3.0 (was: 1.2.0) > createSchemaRDD is not thread safe > ---

[jira] [Updated] (SPARK-4453) Simplify Parquet record filter generation

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4453: Priority: Critical (was: Major) > Simplify Parquet record filter generation > -

[jira] [Updated] (SPARK-4453) Simplify Parquet record filter generation

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4453: Assignee: Cheng Lian > Simplify Parquet record filter generation > -

[jira] [Updated] (SPARK-2551) Cleanup FilteringParquetRowInputFormat

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2551: Target Version/s: 1.3.0 (was: 1.2.0) > Cleanup FilteringParquetRowInputFormat > ---

[jira] [Updated] (SPARK-2449) Spark sql reflection code requires a constructor taking all the fields for the table

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2449: Target Version/s: 1.3.0 (was: 1.2.0) > Spark sql reflection code requires a constructor tak

[jira] [Updated] (SPARK-2686) Add Length support to Spark SQL and HQL and Strlen support to SQL

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2686: Target Version/s: 1.3.0 (was: 1.2.0) > Add Length support to Spark SQL and HQL and Strlen s

[jira] [Updated] (SPARK-3955) Different versions between jackson-mapper-asl and jackson-core-asl

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3955: Target Version/s: 1.3.0 (was: 1.1.1, 1.2.0) > Different versions between jackson-mapper-asl

[jira] [Updated] (SPARK-3955) Different versions between jackson-mapper-asl and jackson-core-asl

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3955: Component/s: Build > Different versions between jackson-mapper-asl and jackson-core-asl > --

[jira] [Updated] (SPARK-4269) Make wait time in BroadcastHashJoin configurable

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4269: Target Version/s: 1.3.0 (was: 1.2.0) > Make wait time in BroadcastHashJoin configurable > -

[jira] [Updated] (SPARK-3379) Implement 'POWER' for sql

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3379: Target Version/s: 1.3.0 (was: 1.2.0) > Implement 'POWER' for sql >

[jira] [Updated] (SPARK-2554) CountDistinct and SumDistinct should do partial aggregation

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2554: Target Version/s: 1.3.0 (was: 1.2.0) > CountDistinct and SumDistinct should do partial aggr

[jira] [Updated] (SPARK-3298) [SQL] registerAsTable / registerTempTable overwrites old tables

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3298: Target Version/s: 1.3.0 (was: 1.2.0) > [SQL] registerAsTable / registerTempTable overwrites

[jira] [Updated] (SPARK-2472) Spark SQL Thrift server sometimes assigns wrong job group name

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2472: Target Version/s: 1.3.0 (was: 1.2.0) > Spark SQL Thrift server sometimes assigns wrong job

[jira] [Updated] (SPARK-3298) [SQL] registerAsTable / registerTempTable overwrites old tables

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3298: Assignee: (was: Michael Armbrust) > [SQL] registerAsTable / registerTempTable overwrites

[jira] [Resolved] (SPARK-3720) support ORC in spark sql

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-3720. - Resolution: Duplicate > support ORC in spark sql > > >

[jira] [Updated] (SPARK-4074) No exception for drop nonexistent table

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4074: Target Version/s: 1.3.0 (was: 1.2.0) > No exception for drop nonexistent table > --

[jira] [Updated] (SPARK-4443) Statistics bug for external table in spark sql hive

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4443: Priority: Critical (was: Major) > Statistics bug for external table in spark sql hive > ---

[jira] [Updated] (SPARK-2873) OOM happens when group by and join operation with big data

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2873: Target Version/s: 1.3.0 (was: 1.2.0) > OOM happens when group by and join operation with bi

[jira] [Updated] (SPARK-3184) Allow user to specify num tasks to use for a table

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3184: Target Version/s: 1.3.0 (was: 1.2.0) > Allow user to specify num tasks to use for a table >

[jira] [Created] (SPARK-4462) flume-sink build broken in SBT

2014-11-17 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-4462: --- Summary: flume-sink build broken in SBT Key: SPARK-4462 URL: https://issues.apache.org/jira/browse/SPARK-4462 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-4338) Remove yarn-alpha support

2014-11-17 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4338: - Assignee: Sandy Ryza > Remove yarn-alpha support > - > > Key: SPAR

[jira] [Updated] (SPARK-4180) SparkContext constructor should throw exception if another SparkContext is already running

2014-11-17 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4180: -- Target Version/s: 1.2.0, 1.0.3, 1.1.2 (was: 1.1.1, 1.2.0, 1.0.3) > SparkContext constructor should thro

[jira] [Updated] (SPARK-2087) Clean Multi-user semantics for thrift JDBC/ODBC server.

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2087: Target Version/s: 1.3.0 (was: 1.2.0) > Clean Multi-user semantics for thrift JDBC/ODBC serv

[jira] [Updated] (SPARK-2087) Clean Multi-user semantics for thrift JDBC/ODBC server.

2014-11-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2087: Assignee: (was: Zongheng Yang) > Clean Multi-user semantics for thrift JDBC/ODBC server.

[jira] [Updated] (SPARK-4461) Spark should not relies on mapred-site.xml for classpath

2014-11-17 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-4461: -- Description: Currently spark read mapred-site.xml to get the class path. From hadoop 2.6, the library i

[jira] [Comment Edited] (SPARK-1867) Spark Documentation Error causes java.lang.IllegalStateException: unread block data

2014-11-17 Thread Anson Abraham (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14212852#comment-14212852 ] Anson Abraham edited comment on SPARK-1867 at 11/17/14 10:40 PM: ---

[jira] [Commented] (SPARK-4395) Running a Spark SQL SELECT command from PySpark causes a hang for ~ 1 hour

2014-11-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215327#comment-14215327 ] Davies Liu commented on SPARK-4395: --- Workaround: remove cache() or cache() after inferS

[jira] [Created] (SPARK-4461) Spark should not relies on mapred-site.xml for classpath

2014-11-17 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-4461: - Summary: Spark should not relies on mapred-site.xml for classpath Key: SPARK-4461 URL: https://issues.apache.org/jira/browse/SPARK-4461 Project: Spark Issue Type:
