[jira] [Updated] (SPARK-11567) Add Python API for corr aggregate function

2015-11-06 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-11567: - Summary: Add Python API for corr aggregate function (was: Add Python API for corr in agg) > Add

[jira] [Assigned] (SPARK-11567) Add Python API for corr in agg

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11567: Assignee: Apache Spark > Add Python API for corr in agg > -- >

[jira] [Assigned] (SPARK-11567) Add Python API for corr in agg

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11567: Assignee: (was: Apache Spark) > Add Python API for corr in agg > -

[jira] [Commented] (SPARK-11567) Add Python API for corr in agg

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995106#comment-14995106 ] Apache Spark commented on SPARK-11567: -- User 'felixcheung' has created a pull reques

[jira] [Created] (SPARK-11567) Add Python API for corr in agg

2015-11-06 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-11567: Summary: Add Python API for corr in agg Key: SPARK-11567 URL: https://issues.apache.org/jira/browse/SPARK-11567 Project: Spark Issue Type: Task Affects V

[jira] [Resolved] (SPARK-8467) Add LDAModel.describeTopics() in Python

2015-11-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-8467. --- Resolution: Fixed Fix Version/s: 1.6.0 > Add LDAModel.describeTopics() in Python >

[jira] [Resolved] (SPARK-11112) DAG visualization: display RDD callsite

2015-11-06 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-11112. --- Resolution: Fixed Fix Version/s: 1.6.0 > DAG visualization: display RDD callsite > ---

[jira] [Assigned] (SPARK-11515) QuantileDiscretizer should take random seed

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11515: Assignee: Apache Spark > QuantileDiscretizer should take random seed > ---

[jira] [Commented] (SPARK-11515) QuantileDiscretizer should take random seed

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994936#comment-14994936 ] Apache Spark commented on SPARK-11515: -- User 'yu-iskw' has created a pull request fo

[jira] [Assigned] (SPARK-11515) QuantileDiscretizer should take random seed

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11515: Assignee: (was: Apache Spark) > QuantileDiscretizer should take random seed >

[jira] [Resolved] (SPARK-11389) Add support for off-heap memory to MemoryManager

2015-11-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-11389. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9344 [https://github.c

[jira] [Commented] (SPARK-11566) Refactoring GaussianMixtureModel.gaussians in Python

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994862#comment-14994862 ] Apache Spark commented on SPARK-11566: -- User 'yu-iskw' has created a pull request fo

[jira] [Assigned] (SPARK-11566) Refactoring GaussianMixtureModel.gaussians in Python

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11566: Assignee: (was: Apache Spark) > Refactoring GaussianMixtureModel.gaussians in Python >

[jira] [Assigned] (SPARK-11566) Refactoring GaussianMixtureModel.gaussians in Python

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11566: Assignee: Apache Spark > Refactoring GaussianMixtureModel.gaussians in Python > --

[jira] [Updated] (SPARK-11566) Refactoring GaussianMixtureModel.gaussians in Python

2015-11-06 Thread Yu Ishikawa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Ishikawa updated SPARK-11566: Component/s: PySpark > Refactoring GaussianMixtureModel.gaussians in Python > -

[jira] [Commented] (SPARK-11439) Optimization of creating sparse feature without dense one

2015-11-06 Thread Nakul Jindal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994861#comment-14994861 ] Nakul Jindal commented on SPARK-11439: -- This is the piece of R code that is used as

[jira] [Created] (SPARK-11566) Refactoring GaussianMixtureModel.gaussians in Python

2015-11-06 Thread Yu Ishikawa (JIRA)
Yu Ishikawa created SPARK-11566: --- Summary: Refactoring GaussianMixtureModel.gaussians in Python Key: SPARK-11566 URL: https://issues.apache.org/jira/browse/SPARK-11566 Project: Spark Issue Type

[jira] [Updated] (SPARK-11453) append data to partitioned table will messes up the result

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11453: - Target Version/s: 1.6.0 > append data to partitioned table will messes up the result > --

[jira] [Updated] (SPARK-11453) append data to partitioned table will messes up the result

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11453: - Assignee: Wenchen Fan > append data to partitioned table will messes up the result >

[jira] [Resolved] (SPARK-11546) Thrift server makes too many logs about result schema

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11546. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9514 [http

[jira] [Updated] (SPARK-11546) Thrift server makes too many logs about result schema

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11546: - Assignee: Navis > Thrift server makes too many logs about result schema > ---

[jira] [Updated] (SPARK-11500) Not deterministic order of columns when using merging schemas.

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11500: - Target Version/s: 1.6.0 > Not deterministic order of columns when using merging schemas.

[jira] [Commented] (SPARK-11565) replace deprecated DigestUtils.shaHex call

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994770#comment-14994770 ] Apache Spark commented on SPARK-11565: -- User 'gliptak' has created a pull request fo

[jira] [Assigned] (SPARK-11565) replace deprecated DigestUtils.shaHex call

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11565: Assignee: Apache Spark > replace deprecated DigestUtils.shaHex call >

[jira] [Assigned] (SPARK-11565) replace deprecated DigestUtils.shaHex call

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11565: Assignee: (was: Apache Spark) > replace deprecated DigestUtils.shaHex call > -

[jira] [Created] (SPARK-11565) replace deprecated DigestUtils.shaHex call

2015-11-06 Thread Gabor Liptak (JIRA)
Gabor Liptak created SPARK-11565: Summary: replace deprecated DigestUtils.shaHex call Key: SPARK-11565 URL: https://issues.apache.org/jira/browse/SPARK-11565 Project: Spark Issue Type: Improv

[jira] [Updated] (SPARK-11522) input_file_name() returns "" for external tables

2015-11-06 Thread Xin Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xin Wu updated SPARK-11522: --- Description: Given an external table definition where the data consists of many CSV files, {{input_file_name

[jira] [Commented] (SPARK-11522) input_file_name() returns "" for external tables

2015-11-06 Thread Xin Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994763#comment-14994763 ] Xin Wu commented on SPARK-11522: running full test against the fix now. will submit the P

[jira] [Resolved] (SPARK-9241) Supporting multiple DISTINCT columns

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-9241. - Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9406 [https:/

[jira] [Updated] (SPARK-9241) Supporting multiple DISTINCT columns

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9241: Assignee: Herman van Hovell > Supporting multiple DISTINCT columns > ---

[jira] [Commented] (SPARK-11482) Maven repo in IsolatedClientLoader should be configurable.

2015-11-06 Thread Xiu Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994721#comment-14994721 ] Xiu Guo commented on SPARK-11482: - I am working on a PR for this, will submit shortly. >

[jira] [Assigned] (SPARK-11564) Dataset Java API

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11564: Assignee: Apache Spark (was: Reynold Xin) > Dataset Java API > > >

[jira] [Commented] (SPARK-11564) Dataset Java API

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994719#comment-14994719 ] Apache Spark commented on SPARK-11564: -- User 'rxin' has created a pull request for t

[jira] [Assigned] (SPARK-11564) Dataset Java API

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11564: Assignee: Reynold Xin (was: Apache Spark) > Dataset Java API > > >

[jira] [Created] (SPARK-11564) Dataset Java API

2015-11-06 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-11564: --- Summary: Dataset Java API Key: SPARK-11564 URL: https://issues.apache.org/jira/browse/SPARK-11564 Project: Spark Issue Type: Sub-task Components: SQL

[jira] [Created] (SPARK-11563) Use RpcEnv to transfer generated classes in spark-shell

2015-11-06 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-11563: -- Summary: Use RpcEnv to transfer generated classes in spark-shell Key: SPARK-11563 URL: https://issues.apache.org/jira/browse/SPARK-11563 Project: Spark I

[jira] [Resolved] (SPARK-11555) spark on yarn spark-class --num-workers doesn't work

2015-11-06 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-11555. Resolution: Fixed Fix Version/s: 1.6.0 1.5.3 > spark on yarn spar

[jira] [Commented] (SPARK-11140) Replace file server in driver with RPC-based alternative

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994673#comment-14994673 ] Apache Spark commented on SPARK-11140: -- User 'vanzin' has created a pull request for

[jira] [Assigned] (SPARK-11140) Replace file server in driver with RPC-based alternative

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11140: Assignee: (was: Apache Spark) > Replace file server in driver with RPC-based alternati

[jira] [Assigned] (SPARK-11140) Replace file server in driver with RPC-based alternative

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11140: Assignee: Apache Spark > Replace file server in driver with RPC-based alternative > --

[jira] [Commented] (SPARK-11562) Provide user an option to init SQLContext or HiveContext in spark shell

2015-11-06 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994661#comment-14994661 ] Zhan Zhang commented on SPARK-11562: Thanks [~jerrylam] report the issue and provide

[jira] [Created] (SPARK-11562) Provide user an option to init SQLContext or HiveContext in spark shell

2015-11-06 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-11562: -- Summary: Provide user an option to init SQLContext or HiveContext in spark shell Key: SPARK-11562 URL: https://issues.apache.org/jira/browse/SPARK-11562 Project: Spark

[jira] [Updated] (SPARK-7424) spark.ml classification, regression abstractions should add metadata to output column

2015-11-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7424: - Assignee: (was: Joseph K. Bradley) > spark.ml classification, regression abstractions

[jira] [Resolved] (SPARK-11217) Model import/export for non-meta estimators and transformers

2015-11-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-11217. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9454 [ht

[jira] [Commented] (SPARK-11509) ipython notebooks do not work on clusters created using spark-1.5.1-bin-hadoop2.6/ec2/spark-ec2 script

2015-11-06 Thread Andrew Davidson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994616#comment-14994616 ] Andrew Davidson commented on SPARK-11509: - okay after a couple of days hacking it

[jira] [Commented] (SPARK-10086) Flaky StreamingKMeans test in PySpark

2015-11-06 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994615#comment-14994615 ] Bryan Cutler commented on SPARK-10086: -- I've been able to reproduce this locally, bu

[jira] [Resolved] (SPARK-11561) Rename text data source's column name to value

2015-11-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-11561. - Resolution: Fixed Fix Version/s: 1.6.0 > Rename text data source's column name to value >

[jira] [Assigned] (SPARK-11476) Incorrect function referred to in MLib Random data generation documentation

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11476: Assignee: Apache Spark > Incorrect function referred to in MLib Random data generation doc

[jira] [Commented] (SPARK-11476) Incorrect function referred to in MLib Random data generation documentation

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994560#comment-14994560 ] Apache Spark commented on SPARK-11476: -- User 'srowen' has created a pull request for

[jira] [Assigned] (SPARK-11476) Incorrect function referred to in MLib Random data generation documentation

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11476: Assignee: (was: Apache Spark) > Incorrect function referred to in MLib Random data gen

[jira] [Commented] (SPARK-11560) Optimize KMeans implementation

2015-11-06 Thread Jun Zheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994556#comment-14994556 ] Jun Zheng commented on SPARK-11560: --- By simplification, do you mean we assume var "runs

[jira] [Commented] (SPARK-7425) spark.ml Predictor should support other numeric types for label

2015-11-06 Thread Stefano Baghino (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994527#comment-14994527 ] Stefano Baghino commented on SPARK-7425: My bad, I misread the code, thank you. >

[jira] [Commented] (SPARK-11373) Add metrics to the History Server and providers

2015-11-06 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994525#comment-14994525 ] Steve Loughran commented on SPARK-11373: I have in my head roughly how to do this

[jira] [Resolved] (SPARK-8829) Improve expression performance

2015-11-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-8829. Resolution: Fixed Fix Version/s: 1.6.0 > Improve expression performance > ---

[jira] [Reopened] (SPARK-11509) ipython notebooks do not work on clusters created using spark-1.5.1-bin-hadoop2.6/ec2/spark-ec2 script

2015-11-06 Thread Andrew Davidson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Davidson reopened SPARK-11509: - My issue is not resolved I am able to use ipython notebooks on my local mac but still can no

[jira] [Commented] (SPARK-11269) Java API support & test cases

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994375#comment-14994375 ] Apache Spark commented on SPARK-11269: -- User 'rxin' has created a pull request for t

[jira] [Resolved] (SPARK-11450) Add support for UnsafeRow to Expand

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11450. -- Resolution: Fixed Fix Version/s: 1.6.0 > Add support for UnsafeRow to Expand > -

[jira] [Updated] (SPARK-11450) Add support for UnsafeRow to Expand

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11450: - Assignee: Herman van Hovell > Add support for UnsafeRow to Expand > -

[jira] [Commented] (SPARK-9837) Provide R-like summary statistics for GLMs via iteratively reweighted least squares

2015-11-06 Thread Soila Kavulya (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994364#comment-14994364 ] Soila Kavulya commented on SPARK-9837: -- [~mengxr] It is open-sourced. We compute the

[jira] [Commented] (SPARK-11516) Spark application cannot be found from JSON API even though it exists

2015-11-06 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994356#comment-14994356 ] Matt Cheah commented on SPARK-11516: Ah event logging may be what I'm missing. Then a

[jira] [Updated] (SPARK-9162) Implement code generation for ScalaUDF

2015-11-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-9162: - Assignee: Liang-Chi Hsieh > Implement code generation for ScalaUDF > -

[jira] [Commented] (SPARK-11390) Query plan with/without filterPushdown indistinguishable

2015-11-06 Thread Zee Chen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994349#comment-14994349 ] Zee Chen commented on SPARK-11390: -- The required change to support this feature is a bit

[jira] [Updated] (SPARK-10116) XORShiftRandom should generate uniform seeds

2015-11-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10116: -- Priority: Minor (was: Trivial) > XORShiftRandom should generate uniform seeds > --

[jira] [Resolved] (SPARK-10116) XORShiftRandom should generate uniform seeds

2015-11-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-10116. --- Resolution: Fixed Fix Version/s: 1.6.0 1.7.0 Issue resolved by pull request

[jira] [Assigned] (SPARK-11561) Rename text data source's column name to value

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11561: Assignee: Reynold Xin (was: Apache Spark) > Rename text data source's column name to valu

[jira] [Commented] (SPARK-11561) Rename text data source's column name to value

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994337#comment-14994337 ] Apache Spark commented on SPARK-11561: -- User 'rxin' has created a pull request for t

[jira] [Assigned] (SPARK-11561) Rename text data source's column name to value

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11561: Assignee: Apache Spark (was: Reynold Xin) > Rename text data source's column name to valu

[jira] [Created] (SPARK-11561) Rename text data source's column name to value

2015-11-06 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-11561: --- Summary: Rename text data source's column name to value Key: SPARK-11561 URL: https://issues.apache.org/jira/browse/SPARK-11561 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-11497) PySpark RowMatrix Constructor Has Type Erasure Issue

2015-11-06 Thread Mike Dusenberry (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Dusenberry updated SPARK-11497: Affects Version/s: 1.6.0 1.5.0 1.5.1 > PySpar

[jira] [Commented] (SPARK-7148) Configure Parquet block size (row group size) for ML model import/export

2015-11-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994326#comment-14994326 ] Joseph K. Bradley commented on SPARK-7148: -- Well, for my issue (LDA example faili

[jira] [Updated] (SPARK-11069) Add RegexTokenizer option to convert to lowercase

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-11069: -- Assignee: yuhao yang > Add RegexTokenizer option to convert to lowercase >

[jira] [Updated] (SPARK-7492) Convert LocalDataFrame to LocalMatrix

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7492: - Target Version/s: 1.7.0 (was: 1.6.0) > Convert LocalDataFrame to LocalMatrix > --

[jira] [Updated] (SPARK-11259) Params.validateParams() should be called automatically

2015-11-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-11259: -- Target Version/s: (was: 1.6.0) > Params.validateParams() should be called automatical

[jira] [Updated] (SPARK-8514) LU factorization on BlockMatrix

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-8514: - Target Version/s: 1.7.0 (was: 1.6.0) > LU factorization on BlockMatrix >

[jira] [Updated] (SPARK-8514) LU factorization on BlockMatrix

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-8514: - Priority: Critical (was: Major) > LU factorization on BlockMatrix > -

[jira] [Updated] (SPARK-7809) MultivariateOnlineSummarizer should allow users to configure what to compute

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7809: - Target Version/s: 1.7.0 (was: 1.6.0) > MultivariateOnlineSummarizer should allow users to configu

[jira] [Updated] (SPARK-8971) Support balanced class labels when splitting train/cross validation sets

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-8971: - Target Version/s: 1.7.0 (was: 1.6.0) > Support balanced class labels when splitting train/cross v

[jira] [Created] (SPARK-11560) Optimize KMeans implementation

2015-11-06 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-11560: - Summary: Optimize KMeans implementation Key: SPARK-11560 URL: https://issues.apache.org/jira/browse/SPARK-11560 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-5832) Add Affinity Propagation clustering algorithm

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5832: - Target Version/s: 1.7.0 (was: 1.6.0) > Add Affinity Propagation clustering algorithm > --

[jira] [Updated] (SPARK-9301) collect_set and collect_list aggregate functions

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9301: Priority: Critical (was: Major) > collect_set and collect_list aggregate functions > --

[jira] [Created] (SPARK-11559) Make `runs` no effect in k-means

2015-11-06 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-11559: - Summary: Make `runs` no effect in k-means Key: SPARK-11559 URL: https://issues.apache.org/jira/browse/SPARK-11559 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-11259) Params.validateParams() should be called automatically

2015-11-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-11259: -- Shepherd: Joseph K. Bradley Assignee: Yanbo Liang Target Version

[jira] [Updated] (SPARK-11559) Make `runs` no effect in k-means

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-11559: -- Description: We deprecated `runs` in Spark 1.6 (SPARK-11358). In 1.7.0, we can either remove `r

[jira] [Updated] (SPARK-10645) Bivariate Statistics: Spearman's Correlation in DataFrames

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10645: -- Summary: Bivariate Statistics: Spearman's Correlation in DataFrames (was: Bivariate Statistics

[jira] [Resolved] (SPARK-10329) Cost RDD in k-means|| initialization is not storage-efficient

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-10329. --- Resolution: Later Marked this as "Later". After we fully deprecate `runs`, we can do some op

[jira] [Commented] (SPARK-7130) spark.ml RandomForest* should always do bootstrapping

2015-11-06 Thread Bill Chambers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994307#comment-14994307 ] Bill Chambers commented on SPARK-7130: -- Looking at this issue, the change needs to oc

[jira] [Comment Edited] (SPARK-7130) spark.ml RandomForest* should always do bootstrapping

2015-11-06 Thread Bill Chambers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994307#comment-14994307 ] Bill Chambers edited comment on SPARK-7130 at 11/6/15 7:39 PM: -

[jira] [Updated] (SPARK-10371) Optimize sequential projections

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10371: -- Priority: Critical (was: Major) > Optimize sequential projections > --

[jira] [Updated] (SPARK-10371) Optimize sequential projections

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10371: -- Assignee: Nong Li > Optimize sequential projections > --- > >

[jira] [Updated] (SPARK-9656) Add missing methods to linalg.distributed

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9656: - Target Version/s: (was: 1.6.0) > Add missing methods to linalg.distributed > ---

[jira] [Issue Comment Deleted] (SPARK-9835) Iteratively reweighted least squares solver for GLMs

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9835: - Comment: was deleted (was: User 'davies' has created a pull request for this issue: https://github

[jira] [Updated] (SPARK-9961) ML prediction abstractions should have defaultEvaluator fields

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9961: - Target Version/s: 1.7.0 (was: 1.6.0) > ML prediction abstractions should have defaultEvaluator fi

[jira] [Updated] (SPARK-9835) Iteratively reweighted least squares solver for GLMs

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9835: - Priority: Critical (was: Major) > Iteratively reweighted least squares solver for GLMs >

[jira] [Updated] (SPARK-9835) Iteratively reweighted least squares solver for GLMs

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9835: - Target Version/s: 1.7.0 (was: 1.6.0) > Iteratively reweighted least squares solver for GLMs > ---

[jira] [Updated] (SPARK-9919) Matrices should respect Java's equals and hashCode contract

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9919: - Target Version/s: 1.7.0 (was: 1.6.0) > Matrices should respect Java's equals and hashCode contrac

[jira] [Updated] (SPARK-10478) Improve spark.ml.ann implementations for MLP

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10478: -- Priority: Major (was: Critical) > Improve spark.ml.ann implementations for MLP > -

[jira] [Updated] (SPARK-10478) Improve spark.ml.ann implementations for MLP

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10478: -- Issue Type: Improvement (was: Bug) > Improve spark.ml.ann implementations for MLP > --

[jira] [Updated] (SPARK-10385) Bivariate statistics in DataFrames

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10385: -- Description: Similar to SPARK-10384, it would be nice to have bivariate statistics support in

[jira] [Updated] (SPARK-10385) Bivariate statistics in DataFrames

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10385: -- Summary: Bivariate statistics in DataFrames (was: Bivariate statistics as UDAFs) > Bivariate

[jira] [Updated] (SPARK-10385) Bivariate statistics in DataFrames

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10385: -- Description: Similar to SPARK-10384, it would be nice to have bivariate statistics support in
