[jira] [Created] (SPARK-5408) MaxPermGen is ignored by ExecutorRunner and DriverRunner

2015-01-26 Thread Jacek Lewandowski (JIRA)
Jacek Lewandowski created SPARK-5408: Summary: MaxPermGen is ignored by ExecutorRunner and DriverRunner Key: SPARK-5408 URL: https://issues.apache.org/jira/browse/SPARK-5408 Project: Spark

[jira] [Updated] (SPARK-5408) MaxPermSize is ignored by ExecutorRunner and DriverRunner

2015-01-26 Thread Jacek Lewandowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated SPARK-5408: - Summary: MaxPermSize is ignored by ExecutorRunner and DriverRunner (was: MaxPermGen is ig

[jira] [Updated] (SPARK-5408) MaxPermSize is ignored by ExecutorRunner and DriverRunner

2015-01-26 Thread Jacek Lewandowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated SPARK-5408: - Description: ExecutorRunner and DriverRunner use CommandUtils to build the command which

[jira] [Commented] (SPARK-5408) MaxPermSize is ignored by ExecutorRunner and DriverRunner

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291581#comment-14291581 ] Apache Spark commented on SPARK-5408: - User 'jacek-lewandowski' has created a pull req

[jira] [Updated] (SPARK-5408) MaxPermSize is ignored by ExecutorRunner and DriverRunner

2015-01-26 Thread Jacek Lewandowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated SPARK-5408: - Fix Version/s: 1.2.1 1.3.0 > MaxPermSize is ignored by ExecutorRunner a

[jira] [Commented] (SPARK-5408) MaxPermSize is ignored by ExecutorRunner and DriverRunner

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291583#comment-14291583 ] Apache Spark commented on SPARK-5408: - User 'jacek-lewandowski' has created a pull req

[jira] [Commented] (SPARK-5408) MaxPermSize is ignored by ExecutorRunner and DriverRunner

2015-01-26 Thread Jacek Lewandowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291585#comment-14291585 ] Jacek Lewandowski commented on SPARK-5408: -- Can anybody take a look? > MaxPermSi

[jira] [Updated] (SPARK-5406) LocalLAPACK mode in RowMatrix.computeSVD should have much smaller upper bound

2015-01-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-5406: -- Description: In RowMatrix.computeSVD, under LocalLAPACK mode, the code would invoke brzSvd. Yet breeze

[jira] [Updated] (SPARK-5406) LocalLAPACK mode in RowMatrix.computeSVD should have much smaller upper bound

2015-01-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-5406: -- Description: In RowMatrix.computeSVD, under LocalLAPACK mode, the code would invoke brzSvd. Yet breeze

[jira] [Commented] (SPARK-5300) Spark loads file partitions in inconsistent order on native filesystems

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291605#comment-14291605 ] Apache Spark commented on SPARK-5300: - User 'ehiggs' has created a pull request for th

[jira] [Commented] (SPARK-3789) Python bindings for GraphX

2015-01-26 Thread Kushal Datta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291710#comment-14291710 ] Kushal Datta commented on SPARK-3789: - Hi Ameet, I have created the first pull reques

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark "application"

2015-01-26 Thread Murat Eken (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291724#comment-14291724 ] Murat Eken commented on SPARK-2389: --- +1. We're using a Spark cluster as a real-time quer

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark "application"

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291737#comment-14291737 ] Sean Owen commented on SPARK-2389: -- Why can't N front-ends talk to a process built around

[jira] [Commented] (SPARK-5399) tree Losses strings should match loss names

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291741#comment-14291741 ] Apache Spark commented on SPARK-5399: - User 'Lewuathe' has created a pull request for

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark "application"

2015-01-26 Thread Murat Eken (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291755#comment-14291755 ] Murat Eken commented on SPARK-2389: --- Yes [~sowen], it's about HA for the driver. Our app

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark "application"

2015-01-26 Thread Robert Stupp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291764#comment-14291764 ] Robert Stupp commented on SPARK-2389: - [~srowen] that *one long-running* Spark app is

[jira] [Closed] (SPARK-5407) No 1.2 AMI available for ec2

2015-01-26 Thread Håkan Jonsson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Håkan Jonsson closed SPARK-5407. Resolution: Invalid Error on my side. > No 1.2 AMI available for ec2 >

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark "application"

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291788#comment-14291788 ] Sean Owen commented on SPARK-2389: -- Yes, the SPOF problem makes sense. It doesn't seem to

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark "application"

2015-01-26 Thread Murat Eken (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291795#comment-14291795 ] Murat Eken commented on SPARK-2389: --- [~sowen], I think Robert is talking about fault tol

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark "application"

2015-01-26 Thread Robert Stupp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291797#comment-14291797 ] Robert Stupp commented on SPARK-2389: - bq. That aside, why doesn't it scale? Simply b

[jira] [Commented] (SPARK-665) Create RPM packages for Spark

2015-01-26 Thread Christian Tzolov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291799#comment-14291799 ] Christian Tzolov commented on SPARK-665: I've looked at the JRPM maven plugin but u

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark "application"

2015-01-26 Thread Robert Stupp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291798#comment-14291798 ] Robert Stupp commented on SPARK-2389: - bq. fault tolerance when he mentions scalabilit

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark "application"

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291804#comment-14291804 ] Sean Owen commented on SPARK-2389: -- Yes, makes sense. Maxing out one driver isn't an issu

[jira] [Closed] (SPARK-5303) applySchema returns NullPointerException

2015-01-26 Thread Mauro Pirrone (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mauro Pirrone closed SPARK-5303. Resolution: Not a Problem > applySchema returns NullPointerException > -

[jira] [Created] (SPARK-5409) Broken link in documentation

2015-01-26 Thread Mauro Pirrone (JIRA)
Mauro Pirrone created SPARK-5409: Summary: Broken link in documentation Key: SPARK-5409 URL: https://issues.apache.org/jira/browse/SPARK-5409 Project: Spark Issue Type: Documentation

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark "application"

2015-01-26 Thread Robert Stupp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291818#comment-14291818 ] Robert Stupp commented on SPARK-2389: - [~srowen] yes, the problem is that drivers cann

[jira] [Commented] (SPARK-5409) Broken link in documentation

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291823#comment-14291823 ] Sean Owen commented on SPARK-5409: -- Should just be https://github.com/apache/spark/blob/

[jira] [Commented] (SPARK-5324) Results of describe can't be queried

2015-01-26 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291987#comment-14291987 ] Yanbo Liang commented on SPARK-5324: [~marmbrus] I have opened a pull request for this issue

[jira] [Resolved] (SPARK-4430) Apache RAT Checks fail spuriously on test files

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-4430. -- Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Sean Owen > Apache RAT Checks fail spur

[jira] [Commented] (SPARK-5324) Results of describe can't be queried

2015-01-26 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292008#comment-14292008 ] Yanbo Liang commented on SPARK-5324: https://github.com/apache/spark/pull/4207 > Resu

[jira] [Resolved] (SPARK-3852) Document spark.driver.extra* configs

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-3852. -- Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Sean Owen Target Version/s:

[jira] [Created] (SPARK-5410) Error parsing scientific notation in a select statement

2015-01-26 Thread Hugo Ferrira (JIRA)
Hugo Ferrira created SPARK-5410: --- Summary: Error parsing scientific notation in a select statement Key: SPARK-5410 URL: https://issues.apache.org/jira/browse/SPARK-5410 Project: Spark Issue Typ

[jira] [Commented] (SPARK-5355) SparkConf is not thread-safe

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292048#comment-14292048 ] Apache Spark commented on SPARK-5355: - User 'davies' has created a pull request for th

[jira] [Commented] (SPARK-595) Document "local-cluster" mode

2015-01-26 Thread Vladimir Grigor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292052#comment-14292052 ] Vladimir Grigor commented on SPARK-595: --- +1 for reopen > Document "local-cluster" mo

[jira] [Commented] (SPARK-5162) Python yarn-cluster mode

2015-01-26 Thread Vladimir Grigor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292069#comment-14292069 ] Vladimir Grigor commented on SPARK-5162: I second [~jared.holmb...@orchestro.com]

[jira] [Commented] (SPARK-5395) Large number of Python workers causing resource depletion

2015-01-26 Thread Mark Khaitman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292121#comment-14292121 ] Mark Khaitman commented on SPARK-5395: -- Having the same issue in standalone deploymen

[jira] [Commented] (SPARK-5400) Rename GaussianMixtureEM to GaussianMixture

2015-01-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292176#comment-14292176 ] Joseph K. Bradley commented on SPARK-5400: -- I agree this could be done either way

[jira] [Commented] (SPARK-794) Remove sleep() in ClusterScheduler.stop

2015-01-26 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292303#comment-14292303 ] Brennon York commented on SPARK-794: [~joshrosen] How is this PR holding up? I haven't

[jira] [Reopened] (SPARK-595) Document "local-cluster" mode

2015-01-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reopened SPARK-595: -- I've re-opened this issue. Folks are using the API in the wild and we're not going to break compatibility f

[jira] [Commented] (SPARK-3644) REST API for Spark application info (jobs / stages / tasks / storage info)

2015-01-26 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292370#comment-14292370 ] Imran Rashid commented on SPARK-3644: - [~joshrosen] Hi Josh, I've got time to implemen

[jira] [Updated] (SPARK-5236) java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to org.apache.spark.sql.catalyst.expressions.MutableInt

2015-01-26 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-5236: -- Description: {code} 15/01/14 05:39:27 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 18.0 (TID 2

[jira] [Commented] (SPARK-5226) Add DBSCAN Clustering Algorithm to MLlib

2015-01-26 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292399#comment-14292399 ] Dmitriy Lyubimov commented on SPARK-5226: - All attempts to parallelize dbscan in l

[jira] [Commented] (SPARK-2688) Need a way to run multiple data pipeline concurrently

2015-01-26 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292415#comment-14292415 ] Xuefu Zhang commented on SPARK-2688: #1 above is exactly what Hive needs badly. > Nee

[jira] [Resolved] (SPARK-5339) build/mvn doesn't work because of invalid URL for maven's tgz.

2015-01-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5339. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Kousuke Saruta > build/mvn

[jira] [Commented] (SPARK-2688) Need a way to run multiple data pipeline concurrently

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292431#comment-14292431 ] Sean Owen commented on SPARK-2688: -- As [~irashid] says, #1 is just syntactic sugar on wha

[jira] [Commented] (SPARK-2688) Need a way to run multiple data pipeline concurrently

2015-01-26 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292448#comment-14292448 ] Xuefu Zhang commented on SPARK-2688: Yeah. We don't need syntactic sugar, but a tran

[jira] [Commented] (SPARK-3789) Python bindings for GraphX

2015-01-26 Thread Kushal Datta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292468#comment-14292468 ] Kushal Datta commented on SPARK-3789: - Hi Ameet, Sorry for asking this question again

[jira] [Commented] (SPARK-3789) Python bindings for GraphX

2015-01-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292477#comment-14292477 ] Reynold Xin commented on SPARK-3789: Unfortunately this is not going to make it into 1

[jira] [Created] (SPARK-5411) Allow SparkListeners to be specified in SparkConf and loaded when creating SparkContext

2015-01-26 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-5411: - Summary: Allow SparkListeners to be specified in SparkConf and loaded when creating SparkContext Key: SPARK-5411 URL: https://issues.apache.org/jira/browse/SPARK-5411 Proje

[jira] [Commented] (SPARK-5411) Allow SparkListeners to be specified in SparkConf and loaded when creating SparkContext

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292494#comment-14292494 ] Apache Spark commented on SPARK-5411: - User 'JoshRosen' has created a pull request for

[jira] [Commented] (SPARK-3789) Python bindings for GraphX

2015-01-26 Thread Kushal Datta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292499#comment-14292499 ] Kushal Datta commented on SPARK-3789: - Sure, I will write up the design document. @ Am

[jira] [Updated] (SPARK-4147) Reduce log4j dependency

2015-01-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4147: --- Assignee: Sean Owen > Reduce log4j dependency > --- > > Ke

[jira] [Created] (SPARK-5412) Cannot bind Master to a specific hostname as per the documentation

2015-01-26 Thread Alexis Seigneurin (JIRA)
Alexis Seigneurin created SPARK-5412: Summary: Cannot bind Master to a specific hostname as per the documentation Key: SPARK-5412 URL: https://issues.apache.org/jira/browse/SPARK-5412 Project: Spa

[jira] [Updated] (SPARK-4147) Reduce log4j dependency

2015-01-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4147: --- Affects Version/s: 1.2.0 > Reduce log4j dependency > --- > >

[jira] [Resolved] (SPARK-4147) Reduce log4j dependency

2015-01-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4147. Resolution: Fixed > Reduce log4j dependency > --- > > Ke

[jira] [Updated] (SPARK-4147) Reduce log4j dependency

2015-01-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4147: --- Fix Version/s: 1.2.1 1.3.0 > Reduce log4j dependency >

[jira] [Resolved] (SPARK-960) JobCancellationSuite "two jobs sharing the same stage" is broken

2015-01-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-960. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4180 [https://github.com/apa

[jira] [Updated] (SPARK-960) JobCancellationSuite "two jobs sharing the same stage" is broken

2015-01-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-960: - Assignee: Sean Owen > JobCancellationSuite "two jobs sharing the same stage" is broken > --

[jira] [Commented] (SPARK-926) spark_ec2 script when ssh/scp-ing should pipe UserknowHostFile to /dev/null

2015-01-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292549#comment-14292549 ] Josh Rosen commented on SPARK-926: -- I think it is; there's now a PR to fix this, since it'

[jira] [Commented] (SPARK-3789) Python bindings for GraphX

2015-01-26 Thread Ameet Talwalkar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292602#comment-14292602 ] Ameet Talwalkar commented on SPARK-3789: Unfortunately not -- I plan to use a stan

[jira] [Commented] (SPARK-5395) Large number of Python workers causing resource depletion

2015-01-26 Thread Sven Krasser (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292630#comment-14292630 ] Sven Krasser commented on SPARK-5395: - [~mkman84], do you also see this for both spark

[jira] [Commented] (SPARK-5395) Large number of Python workers causing resource depletion

2015-01-26 Thread Mark Khaitman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292668#comment-14292668 ] Mark Khaitman commented on SPARK-5395: -- [~skrasser], I actually only managed to have

[jira] [Comment Edited] (SPARK-5395) Large number of Python workers causing resource depletion

2015-01-26 Thread Mark Khaitman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292668#comment-14292668 ] Mark Khaitman edited comment on SPARK-5395 at 1/26/15 11:57 PM:

[jira] [Updated] (SPARK-5384) Vectors.sqdist return inconsistent result for sparse/dense vectors when the vectors have different lengths

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5384: - Priority: Minor (was: Critical) > Vectors.sqdist return inconsistent result for sparse/dense vect

[jira] [Updated] (SPARK-5384) Vectors.sqdist return inconsistent result for sparse/dense vectors when the vectors have different lengths

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5384: - Affects Version/s: (was: 1.2.1) 1.3.0 > Vectors.sqdist return inconsist

[jira] [Updated] (SPARK-5384) Vectors.sqdist return inconsistent result for sparse/dense vectors when the vectors have different lengths

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5384: - Target Version/s: 1.3.0 (was: 1.3.0, 1.2.1) > Vectors.sqdist return inconsistent result for spars

[jira] [Updated] (SPARK-5384) Vectors.sqdist return inconsistent result for sparse/dense vectors when the vectors have different lengths

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5384: - Target Version/s: 1.3.0, 1.2.1 (was: 1.2.1) > Vectors.sqdist return inconsistent result for spars

[jira] [Commented] (SPARK-3439) Add Canopy Clustering Algorithm

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292688#comment-14292688 ] Xiangrui Meng commented on SPARK-3439: -- [~angellandros] The public API and the comple

[jira] [Commented] (SPARK-5400) Rename GaussianMixtureEM to GaussianMixture

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292695#comment-14292695 ] Xiangrui Meng commented on SPARK-5400: -- I like `GaussianMixture` better. I don't thin

[jira] [Updated] (SPARK-4587) Model export/import

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4587: - Assignee: Joseph K. Bradley > Model export/import > --- > > Key: S

[jira] [Updated] (SPARK-1856) Standardize MLlib interfaces

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1856: - Target Version/s: (was: 1.3.0) > Standardize MLlib interfaces > > >

[jira] [Updated] (SPARK-1856) Standardize MLlib interfaces

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1856: - Priority: Critical (was: Blocker) > Standardize MLlib interfaces > >

[jira] [Updated] (SPARK-1486) Support multi-model training in MLlib

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1486: - Target Version/s: (was: 1.3.0) > Support multi-model training in MLlib > ---

[jira] [Closed] (SPARK-4589) ML add-ons to SchemaRDD

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-4589. Resolution: Duplicate I'm closing this JIRA in favor of the DataFrame API. > ML add-ons to SchemaRD

[jira] [Updated] (SPARK-5094) Python API for gradient-boosted trees

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5094: - Assignee: Kazuki Taniguchi > Python API for gradient-boosted trees > -

[jira] [Updated] (SPARK-3717) DecisionTree, RandomForest: Partition by feature

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3717: - Target Version/s: (was: 1.3.0) > DecisionTree, RandomForest: Partition by feature >

[jira] [Updated] (SPARK-5321) Add transpose() method to Matrix

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5321: - Assignee: Burak Yavuz > Add transpose() method to Matrix > > >

[jira] [Updated] (SPARK-5114) Should Evaluator be a PipelineStage

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5114: - Summary: Should Evaluator be a PipelineStage (was: Should Evaluator by a PipelineStage) > Should

[jira] [Updated] (SPARK-5413) Upgrade "metrics" dependency to 3.1.0

2015-01-26 Thread Ryan Williams (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Williams updated SPARK-5413: - Description: Spark currently uses Coda Hale's metrics library version {{3.0.0}}. Version {{3.1.0}

[jira] [Created] (SPARK-5413) Upgrade "metrics" dependency to 3.1.0

2015-01-26 Thread Ryan Williams (JIRA)
Ryan Williams created SPARK-5413: Summary: Upgrade "metrics" dependency to 3.1.0 Key: SPARK-5413 URL: https://issues.apache.org/jira/browse/SPARK-5413 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-5413) Upgrade "metrics" dependency to 3.1.0

2015-01-26 Thread Ryan Williams (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Williams updated SPARK-5413: - Description: Spark currently uses Coda Hale's metrics library version {{3.0.0}}. Version {{3.1.0}

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield "OutOfMemoryError: Requested array size exceeds VM limit"

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292718#comment-14292718 ] Xiangrui Meng commented on SPARK-4846: -- [~josephtang] Are you working on this issue?

[jira] [Created] (SPARK-5414) Add SparkListener implementation that allows users to receive all listener events in one method

2015-01-26 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-5414: - Summary: Add SparkListener implementation that allows users to receive all listener events in one method Key: SPARK-5414 URL: https://issues.apache.org/jira/browse/SPARK-5414

[jira] [Commented] (SPARK-5413) Upgrade "metrics" dependency to 3.1.0

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292719#comment-14292719 ] Apache Spark commented on SPARK-5413: - User 'ryan-williams' has created a pull request

[jira] [Updated] (SPARK-4979) Add streaming logistic regression

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4979: - Target Version/s: (was: 1.3.0) > Add streaming logistic regression > ---

[jira] [Commented] (SPARK-5414) Add SparkListener implementation that allows users to receive all listener events in one method

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292720#comment-14292720 ] Apache Spark commented on SPARK-5414: - User 'JoshRosen' has created a pull request for

[jira] [Created] (SPARK-5415) Upgrade sbt to 0.13.7

2015-01-26 Thread Ryan Williams (JIRA)
Ryan Williams created SPARK-5415: Summary: Upgrade sbt to 0.13.7 Key: SPARK-5415 URL: https://issues.apache.org/jira/browse/SPARK-5415 Project: Spark Issue Type: Improvement Compone

[jira] [Updated] (SPARK-5414) Add SparkListener implementation that allows users to receive all listener events in one method

2015-01-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5414: -- Component/s: Spark Core > Add SparkListener implementation that allows users to receive all listener >

[jira] [Commented] (SPARK-5415) Upgrade sbt to 0.13.7

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292723#comment-14292723 ] Sean Owen commented on SPARK-5415: -- FWIW I use 0.13.7 locally and have had no problems.

[jira] [Commented] (SPARK-5415) Upgrade sbt to 0.13.7

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292722#comment-14292722 ] Apache Spark commented on SPARK-5415: - User 'ryan-williams' has created a pull request

[jira] [Resolved] (SPARK-5409) Broken link in documentation

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5409. -- Resolution: Duplicate Actually this was already fixed > Broken link in documentation >

[jira] [Commented] (SPARK-5261) In some cases ,The value of word's vector representation is too big

2015-01-26 Thread Kai Sasaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292750#comment-14292750 ] Kai Sasaki commented on SPARK-5261: --- [~gq] Can you provide us data set? I tried with som

[jira] [Commented] (SPARK-5395) Large number of Python workers causing resource depletion

2015-01-26 Thread Mark Khaitman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292769#comment-14292769 ] Mark Khaitman commented on SPARK-5395: -- This may prove to be useful... I'm watching

[jira] [Created] (SPARK-5416) Initialize Executor.threadPool before ExecutorSource

2015-01-26 Thread Ryan Williams (JIRA)
Ryan Williams created SPARK-5416: Summary: Initialize Executor.threadPool before ExecutorSource Key: SPARK-5416 URL: https://issues.apache.org/jira/browse/SPARK-5416 Project: Spark Issue Type

[jira] [Commented] (SPARK-5416) Initialize Executor.threadPool before ExecutorSource

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292784#comment-14292784 ] Apache Spark commented on SPARK-5416: - User 'ryan-williams' has created a pull request

[jira] [Created] (SPARK-5417) Remove redundant executor-ID set() call

2015-01-26 Thread Ryan Williams (JIRA)
Ryan Williams created SPARK-5417: Summary: Remove redundant executor-ID set() call Key: SPARK-5417 URL: https://issues.apache.org/jira/browse/SPARK-5417 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-5417) Remove redundant executor-ID set() call

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292790#comment-14292790 ] Apache Spark commented on SPARK-5417: - User 'ryan-williams' has created a pull request

[jira] [Updated] (SPARK-3880) HBase as data source to SparkSQL

2015-01-26 Thread Yan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan updated SPARK-3880: --- Attachment: (was: SparkSQLOnHBase_v2.docx) > HBase as data source to SparkSQL > >

[jira] [Commented] (SPARK-3562) Periodic cleanup event logs

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292818#comment-14292818 ] Apache Spark commented on SPARK-3562: - User 'viper-kun' has created a pull request for

[jira] [Updated] (SPARK-3880) HBase as data source to SparkSQL

2015-01-26 Thread Yan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan updated SPARK-3880: --- Attachment: SparkSQLOnHBase_v2.0.docx > HBase as data source to SparkSQL > > >
