[jira] [Updated] (SPARK-3581) RDD API(distinct/subtract) does not work for RDD of Dictionaries

2014-09-18 Thread Shawn Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Guo updated SPARK-3581: - Affects Version/s: 1.0.0 1.0.2 RDD API(distinct/subtract) does not work for RDD

[jira] [Commented] (SPARK-3321) Defining a class within python main script

2014-09-18 Thread Shawn Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138569#comment-14138569 ] Shawn Guo commented on SPARK-3321: -- No idea yet, I use --py-files Null.py instead. it

[jira] [Created] (SPARK-3583) Spark run slow after unexpected repartition

2014-09-18 Thread ShiShu (JIRA)
ShiShu created SPARK-3583: - Summary: Spark run slow after unexpected repartition Key: SPARK-3583 URL: https://issues.apache.org/jira/browse/SPARK-3583 Project: Spark Issue Type: Bug Affects

[jira] [Updated] (SPARK-3583) Spark run slow after unexpected repartition

2014-09-18 Thread ShiShu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShiShu updated SPARK-3583: -- Attachment: spark_q_006.jpg spark_q_005.jpg spark_q_004.jpg

[jira] [Commented] (SPARK-3578) GraphGenerators.sampleLogNormal sometimes returns too-large result

2014-09-18 Thread Ankur Dave (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138638#comment-14138638 ] Ankur Dave commented on SPARK-3578: --- [~pwendell] Sorry, I forgot to do that this time.

[jira] [Resolved] (SPARK-1353) IllegalArgumentException when writing to disk

2014-09-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-1353. Resolution: Duplicate IllegalArgumentException when writing to disk

[jira] [Commented] (SPARK-3525) Gradient boosting in MLLib

2014-09-18 Thread Egor Pakhomov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138674#comment-14138674 ] Egor Pakhomov commented on SPARK-3525: -- https://github.com/apache/spark/pull/2394

[jira] [Created] (SPARK-3584) sbin/slaves doesn't work when we use password authentication for SSH

2014-09-18 Thread Kousuke Saruta (JIRA)
Kousuke Saruta created SPARK-3584: - Summary: sbin/slaves doesn't work when we use password authentication for SSH Key: SPARK-3584 URL: https://issues.apache.org/jira/browse/SPARK-3584 Project: Spark

[jira] [Commented] (SPARK-3584) sbin/slaves doesn't work when we use password authentication for SSH

2014-09-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138716#comment-14138716 ] Apache Spark commented on SPARK-3584: - User 'sarutak' has created a pull request for

[jira] [Created] (SPARK-3585) Probability Values

2014-09-18 Thread Tamilselvan Palani (JIRA)
Tamilselvan Palani created SPARK-3585: - Summary: Probability Values Key: SPARK-3585 URL: https://issues.apache.org/jira/browse/SPARK-3585 Project: Spark Issue Type: Question

[jira] [Updated] (SPARK-3585) Probability Values in Logistic Regression/Decision Tree output

2014-09-18 Thread Tamilselvan Palani (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tamilselvan Palani updated SPARK-3585: -- Summary: Probability Values in Logistic Regression/Decision Tree output (was:

[jira] [Created] (SPARK-3586) spark streaming

2014-09-18 Thread wangxj (JIRA)
wangxj created SPARK-3586: - Summary: spark streaming Key: SPARK-3586 URL: https://issues.apache.org/jira/browse/SPARK-3586 Project: Spark Issue Type: Bug Components: Streaming Affects

[jira] [Created] (SPARK-3587) Spark SQL can't support lead() over() window function

2014-09-18 Thread caoli (JIRA)
caoli created SPARK-3587: Summary: Spark SQL can't support lead() over() window function Key: SPARK-3587 URL: https://issues.apache.org/jira/browse/SPARK-3587 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-3588) Gaussian Mixture Model clustering

2014-09-18 Thread Meethu Mathew (JIRA)
Meethu Mathew created SPARK-3588: Summary: Gaussian Mixture Model clustering Key: SPARK-3588 URL: https://issues.apache.org/jira/browse/SPARK-3588 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-3588) Gaussian Mixture Model clustering

2014-09-18 Thread Meethu Mathew (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Meethu Mathew updated SPARK-3588: - Description: Gaussian Mixture Models (GMM) is a popular technique for soft clustering. GMM

[jira] [Updated] (SPARK-3588) Gaussian Mixture Model clustering

2014-09-18 Thread Meethu Mathew (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Meethu Mathew updated SPARK-3588: - Attachment: GMMSpark.py Gaussian Mixture Model clustering -

[jira] [Commented] (SPARK-3588) Gaussian Mixture Model clustering

2014-09-18 Thread Meethu Mathew (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138782#comment-14138782 ] Meethu Mathew commented on SPARK-3588: -- We are interested in contributing this

[jira] [Commented] (SPARK-2175) Null values when using App trait.

2014-09-18 Thread Philip Wills (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138803#comment-14138803 ] Philip Wills commented on SPARK-2175: - Whilst the workaround for this is trivial,

[jira] [Created] (SPARK-3589) [Minor]Remove redundant code in deploy module

2014-09-18 Thread WangTaoTheTonic (JIRA)
WangTaoTheTonic created SPARK-3589: -- Summary: [Minor]Remove redundant code in deploy module Key: SPARK-3589 URL: https://issues.apache.org/jira/browse/SPARK-3589 Project: Spark Issue Type:

[jira] [Commented] (SPARK-3589) [Minor]Remove redundant code in deploy module

2014-09-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138821#comment-14138821 ] Apache Spark commented on SPARK-3589: - User 'WangTaoTheTonic' has created a pull

[jira] [Commented] (SPARK-3403) NaiveBayes crashes with blas/lapack native libraries for breeze (netlib-java)

2014-09-18 Thread Alexander Ulanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138829#comment-14138829 ] Alexander Ulanov commented on SPARK-3403: - Thank you, your answers are really

[jira] [Commented] (SPARK-3321) Defining a class within python main script

2014-09-18 Thread Matthew Farrellee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138867#comment-14138867 ] Matthew Farrellee commented on SPARK-3321: -- [~guoxu1231] i think so too. ok if i

[jira] [Commented] (SPARK-3447) Kryo NPE when serializing JListWrapper

2014-09-18 Thread mohan gaddam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138873#comment-14138873 ] mohan gaddam commented on SPARK-3447: - I am also facing the same issue with spark

[jira] [Commented] (SPARK-3447) Kryo NPE when serializing JListWrapper

2014-09-18 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138884#comment-14138884 ] Yin Huai commented on SPARK-3447: - [~mohan.gadm] From the trace, seems the NPE was caused

[jira] [Commented] (SPARK-3447) Kryo NPE when serializing JListWrapper

2014-09-18 Thread mohan gaddam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138899#comment-14138899 ] mohan gaddam commented on SPARK-3447: - sorry for the mistake, those are the project

[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-09-18 Thread Helena Edelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138906#comment-14138906 ] Helena Edelson commented on SPARK-2593: --- [~matei] +1 for spark streaming, that is a

[jira] [Commented] (SPARK-3447) Kryo NPE when serializing JListWrapper

2014-09-18 Thread mohan gaddam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138911#comment-14138911 ] mohan gaddam commented on SPARK-3447: - record KeyValueObject {

[jira] [Comment Edited] (SPARK-3447) Kryo NPE when serializing JListWrapper

2014-09-18 Thread mohan gaddam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138873#comment-14138873 ] mohan gaddam edited comment on SPARK-3447 at 9/18/14 1:21 PM: --

[jira] [Commented] (SPARK-1987) More memory-efficient graph construction

2014-09-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138970#comment-14138970 ] Apache Spark commented on SPARK-1987: - User 'larryxiao' has created a pull request for

[jira] [Resolved] (SPARK-3557) Yarn client config prioritization is backwards

2014-09-18 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-3557. -- Resolution: Duplicate Yarn client config prioritization is backwards

[jira] [Commented] (SPARK-3557) Yarn client config prioritization is backwards

2014-09-18 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138993#comment-14138993 ] Thomas Graves commented on SPARK-3557: -- This is a dup of SPARK-2872, although this

[jira] [Commented] (SPARK-2872) Fix conflict between code and doc in YarnClientSchedulerBackend

2014-09-18 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138996#comment-14138996 ] Thomas Graves commented on SPARK-2872: -- adding description from spark-3557 as it

[jira] [Commented] (SPARK-3389) Add converter class to make reading Parquet files easy with PySpark

2014-09-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139010#comment-14139010 ] Apache Spark commented on SPARK-3389: - User 'patmcdonough' has created a pull request

[jira] [Commented] (SPARK-3580) Add Consistent Method To Get Number of RDD Partitions Across Different Languages

2014-09-18 Thread Matthew Farrellee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139043#comment-14139043 ] Matthew Farrellee commented on SPARK-3580: -- what do you think about going the

[jira] [Commented] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped

2014-09-18 Thread Gino Bustelo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139061#comment-14139061 ] Gino Bustelo commented on SPARK-2892: - Any update on this? Will it get fixed for 1.0.3

[jira] [Comment Edited] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped

2014-09-18 Thread Gino Bustelo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139061#comment-14139061 ] Gino Bustelo edited comment on SPARK-2892 at 9/18/14 3:30 PM: --

[jira] [Updated] (SPARK-3270) Spark API for Application Extensions

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3270: - Issue Type: New Feature (was: Improvement) Spark API for Application Extensions

[jira] [Resolved] (SPARK-1576) Passing of JAVA_OPTS to YARN on command line

2014-09-18 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-1576. --- Resolution: Not a Problem spark-submit already supports this with existing options. Passing

[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-09-18 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139163#comment-14139163 ] Matei Zaharia commented on SPARK-2593: -- Sure, it would be great to do this for

[jira] [Resolved] (SPARK-3547) Maybe we should not simply make return code 1 equal to CLASS_NOT_FOUND

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3547. Resolution: Fixed Fix Version/s: 1.2.0 Assignee: WangTaoTheTonic Resolved

[jira] [Resolved] (SPARK-3579) Jekyll doc generation is different across environments

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3579. Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2443

[jira] [Resolved] (SPARK-1477) Add the lifecycle interface

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1477. Resolution: Won't Fix Unless we are planning to interact with these components in a generic

[jira] [Created] (SPARK-3593) Support Sorting of Binary Type Data

2014-09-18 Thread Paul Magid (JIRA)
Paul Magid created SPARK-3593: - Summary: Support Sorting of Binary Type Data Key: SPARK-3593 URL: https://issues.apache.org/jira/browse/SPARK-3593 Project: Spark Issue Type: New Feature

[jira] [Comment Edited] (SPARK-1477) Add the lifecycle interface

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139218#comment-14139218 ] Patrick Wendell edited comment on SPARK-1477 at 9/18/14 5:34 PM:

[jira] [Commented] (SPARK-3530) Pipeline and Parameters

2014-09-18 Thread Li Pu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139233#comment-14139233 ] Li Pu commented on SPARK-3530: -- Nice design doc! I had some experiences on the parameter

[jira] [Commented] (SPARK-3592) applySchema to an RDD of Row

2014-09-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139299#comment-14139299 ] Apache Spark commented on SPARK-3592: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-3560) In yarn-cluster mode, jars are distributed through multiple mechanisms.

2014-09-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139339#comment-14139339 ] Apache Spark commented on SPARK-3560: - User 'Victsm' has created a pull request for

[jira] [Resolved] (SPARK-3566) .gitignore and .rat-excludes should consider Windows cmd file and Emacs' backup files

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3566. Resolution: Fixed Fix Version/s: 1.2.0 Assignee: Kousuke Saruta

[jira] [Resolved] (SPARK-3589) [Minor]Remove redundant code in deploy module

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3589. Resolution: Fixed Fix Version/s: 1.2.0 1.1.1 Assignee:

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-09-18 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139396#comment-14139396 ] Zhan Zhang commented on SPARK-1537: --- Do you have any update on this, or any schedule in

[jira] [Updated] (SPARK-3560) In yarn-cluster mode, the same jars are distributed through multiple mechanisms.

2014-09-18 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated SPARK-3560: -- Summary: In yarn-cluster mode, the same jars are distributed through multiple mechanisms. (was: In

[jira] [Created] (SPARK-3595) Spark should respect configured OutputCommitter when using saveAsHadoopFile

2014-09-18 Thread Ian Hummel (JIRA)
Ian Hummel created SPARK-3595: - Summary: Spark should respect configured OutputCommitter when using saveAsHadoopFile Key: SPARK-3595 URL: https://issues.apache.org/jira/browse/SPARK-3595 Project: Spark

[jira] [Commented] (SPARK-3403) NaiveBayes crashes with blas/lapack native libraries for breeze (netlib-java)

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139461#comment-14139461 ] Xiangrui Meng commented on SPARK-3403: -- Sorry, it should be netlib-java, but the real

[jira] [Created] (SPARK-3596) Support changing the yarn client monitor interval

2014-09-18 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-3596: Summary: Support changing the yarn client monitor interval Key: SPARK-3596 URL: https://issues.apache.org/jira/browse/SPARK-3596 Project: Spark Issue Type:

[jira] [Commented] (SPARK-3595) Spark should respect configured OutputCommitter when using saveAsHadoopFile

2014-09-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139509#comment-14139509 ] Apache Spark commented on SPARK-3595: - User 'themodernlife' has created a pull request

[jira] [Commented] (SPARK-1486) Support multi-model training in MLlib

2014-09-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139549#comment-14139549 ] Apache Spark commented on SPARK-1486: - User 'brkyvz' has created a pull request for

[jira] [Updated] (SPARK-3340) Deprecate ADD_JARS and ADD_FILES

2014-09-18 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3340: - Assignee: (was: Andrew Or) Deprecate ADD_JARS and ADD_FILES

[jira] [Commented] (SPARK-3530) Pipeline and Parameters

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139600#comment-14139600 ] Xiangrui Meng commented on SPARK-3530: -- [~eustache] The default implementation of

[jira] [Comment Edited] (SPARK-3530) Pipeline and Parameters

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139600#comment-14139600 ] Xiangrui Meng edited comment on SPARK-3530 at 9/18/14 10:06 PM:

[jira] [Updated] (SPARK-3573) Dataset

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3573: - Description: This JIRA is for discussion of ML dataset, essentially a SchemaRDD with extra

[jira] [Closed] (SPARK-3560) In yarn-cluster mode, the same jars are distributed through multiple mechanisms.

2014-09-18 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-3560. Resolution: Fixed Fix Version/s: 1.2.0 1.1.1 Fixed by

[jira] [Reopened] (SPARK-3560) In yarn-cluster mode, the same jars are distributed through multiple mechanisms.

2014-09-18 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or reopened SPARK-3560: -- Assignee: Min Shen Reopening just to reassign. Closing right afterwards, please disregard. In

[jira] [Closed] (SPARK-3560) In yarn-cluster mode, the same jars are distributed through multiple mechanisms.

2014-09-18 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-3560. Resolution: Fixed In yarn-cluster mode, the same jars are distributed through multiple mechanisms.

[jira] [Updated] (SPARK-3587) Spark SQL can't support lead() over() window function

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3587: --- Labels: (was: features) Spark SQL can't support lead() over() window function

[jira] [Updated] (SPARK-3574) Shuffle finish time always reported as -1

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3574: --- Component/s: Spark Core Shuffle finish time always reported as -1

[jira] [Updated] (SPARK-2672) Support compression in wholeFile()

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2672: --- Summary: Support compression in wholeFile() (was: support compressed file in wholeFile())

[jira] [Updated] (SPARK-2761) Merge similar code paths in ExternalSorter and EAOM

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2761: --- Component/s: Spark Core Merge similar code paths in ExternalSorter and EAOM

[jira] [Updated] (SPARK-3535) Spark on Mesos not correctly setting heap overhead

2014-09-18 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3535: - Target Version/s: 1.1.1, 1.2.0 (was: 1.1.1) Spark on Mesos not correctly setting heap overhead

[jira] [Comment Edited] (SPARK-3535) Spark on Mesos not correctly setting heap overhead

2014-09-18 Thread Brenden Matthews (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139763#comment-14139763 ] Brenden Matthews edited comment on SPARK-3535 at 9/18/14 11:58 PM:

[jira] [Commented] (SPARK-3535) Spark on Mesos not correctly setting heap overhead

2014-09-18 Thread Brenden Matthews (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139763#comment-14139763 ] Brenden Matthews commented on SPARK-3535: - After some even further digging, I

[jira] [Commented] (SPARK-3562) Periodic cleanup event logs

2014-09-18 Thread Matthew Farrellee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139786#comment-14139786 ] Matthew Farrellee commented on SPARK-3562: -- is logrotate an option for you?

[jira] [Resolved] (SPARK-3554) handle large dataset in closure of PySpark

2014-09-18 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3554. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2417

[jira] [Commented] (SPARK-3535) Spark on Mesos not correctly setting heap overhead

2014-09-18 Thread Vinod Kone (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139845#comment-14139845 ] Vinod Kone commented on SPARK-3535: --- This can happen if the spark executor doesn't use

[jira] [Created] (SPARK-3597) MesosSchedulerBackend does not implement `killTask`

2014-09-18 Thread Brenden Matthews (JIRA)
Brenden Matthews created SPARK-3597: --- Summary: MesosSchedulerBackend does not implement `killTask` Key: SPARK-3597 URL: https://issues.apache.org/jira/browse/SPARK-3597 Project: Spark

[jira] [Commented] (SPARK-3581) RDD API(distinct/subtract) does not work for RDD of Dictionaries

2014-09-18 Thread Shawn Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139902#comment-14139902 ] Shawn Guo commented on SPARK-3581: -- Yes, please. Thanks for clarification. RDD

[jira] [Created] (SPARK-3598) cast to timestamp should be the same as hive

2014-09-18 Thread Adrian Wang (JIRA)
Adrian Wang created SPARK-3598: -- Summary: cast to timestamp should be the same as hive Key: SPARK-3598 URL: https://issues.apache.org/jira/browse/SPARK-3598 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-3321) Defining a class within python main script

2014-09-18 Thread Shawn Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139903#comment-14139903 ] Shawn Guo commented on SPARK-3321: -- Yes please, thanks for clarification. Defining a

[jira] [Created] (SPARK-3599) Avoid loading and printing properties file content frequently

2014-09-18 Thread WangTaoTheTonic (JIRA)
WangTaoTheTonic created SPARK-3599: -- Summary: Avoid loading and printing properties file content frequently Key: SPARK-3599 URL: https://issues.apache.org/jira/browse/SPARK-3599 Project: Spark

[jira] [Closed] (SPARK-3581) RDD API(distinct/subtract) does not work for RDD of Dictionaries

2014-09-18 Thread Matthew Farrellee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Farrellee closed SPARK-3581. Resolution: Not a Problem RDD API(distinct/subtract) does not work for RDD of Dictionaries

[jira] [Closed] (SPARK-3321) Defining a class within python main script

2014-09-18 Thread Matthew Farrellee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Farrellee closed SPARK-3321. Resolution: Not a Problem Defining a class within python main script

[jira] [Commented] (SPARK-3573) Dataset

2014-09-18 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139958#comment-14139958 ] Sandy Ryza commented on SPARK-3573: --- Currently SchemaRDD lives inside SQL. Would we

[jira] [Commented] (SPARK-3250) More Efficient Sampling

2014-09-18 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139965#comment-14139965 ] Erik Erlandson commented on SPARK-3250: --- PR:

[jira] [Commented] (SPARK-3573) Dataset

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140017#comment-14140017 ] Patrick Wendell commented on SPARK-3573: [~sandyr] This is a good question I'm not

[jira] [Commented] (SPARK-2058) SPARK_CONF_DIR should override all present configs

2014-09-18 Thread David Rosenstrauch (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140023#comment-14140023 ] David Rosenstrauch commented on SPARK-2058: --- I'm wondering the same: has this

[jira] [Commented] (SPARK-3270) Spark API for Application Extensions

2014-09-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140043#comment-14140043 ] Patrick Wendell commented on SPARK-3270: Hey There, For the particular use case