[jira] [Commented] (SPARK-10086) Flaky StreamingKMeans test in PySpark

2015-11-06 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994615#comment-14994615 ] Bryan Cutler commented on SPARK-10086: -- I've been able to reproduce this locally, but haven't found

[jira] [Resolved] (SPARK-11561) Rename text data source's column name to value

2015-11-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-11561. - Resolution: Fixed Fix Version/s: 1.6.0 > Rename text data source's column name to value >

[jira] [Updated] (SPARK-7424) spark.ml classification, regression abstractions should add metadata to output column

2015-11-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7424: - Assignee: (was: Joseph K. Bradley) > spark.ml classification, regression abstractions

[jira] [Commented] (SPARK-11562) Provide user an option to init SQLContext or HiveContext in spark shell

2015-11-06 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994661#comment-14994661 ] Zhan Zhang commented on SPARK-11562: Thanks [~jerrylam] report the issue and provide the suggestion.

[jira] [Updated] (SPARK-11522) input_file_name() returns "" for external tables

2015-11-06 Thread Xin Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xin Wu updated SPARK-11522: --- Description: Given an external table definition where the data consists of many CSV files,

[jira] [Commented] (SPARK-11522) input_file_name() returns "" for external tables

2015-11-06 Thread Xin Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994763#comment-14994763 ] Xin Wu commented on SPARK-11522: running full test against the fix now. will submit the PR soon. >

[jira] [Updated] (SPARK-11453) append data to partitioned table will messes up the result

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11453: - Assignee: Wenchen Fan > append data to partitioned table will messes up the result >

[jira] [Updated] (SPARK-11453) append data to partitioned table will messes up the result

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11453: - Target Version/s: 1.6.0 > append data to partitioned table will messes up the result >

[jira] [Commented] (SPARK-11509) ipython notebooks do not work on clusters created using spark-1.5.1-bin-hadoop2.6/ec2/spark-ec2 script

2015-11-06 Thread Andrew Davidson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994616#comment-14994616 ] Andrew Davidson commented on SPARK-11509: - okay after a couple of days hacking it looks like my

[jira] [Commented] (SPARK-11564) Dataset Java API

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994719#comment-14994719 ] Apache Spark commented on SPARK-11564: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-11564) Dataset Java API

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11564: Assignee: Apache Spark (was: Reynold Xin) > Dataset Java API > > >

[jira] [Assigned] (SPARK-11564) Dataset Java API

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11564: Assignee: Reynold Xin (was: Apache Spark) > Dataset Java API > > >

[jira] [Assigned] (SPARK-11565) replace deprecated DigestUtils.shaHex call

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11565: Assignee: Apache Spark > replace deprecated DigestUtils.shaHex call >

[jira] [Commented] (SPARK-11565) replace deprecated DigestUtils.shaHex call

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994770#comment-14994770 ] Apache Spark commented on SPARK-11565: -- User 'gliptak' has created a pull request for this issue:

[jira] [Assigned] (SPARK-11565) replace deprecated DigestUtils.shaHex call

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11565: Assignee: (was: Apache Spark) > replace deprecated DigestUtils.shaHex call >

[jira] [Resolved] (SPARK-11389) Add support for off-heap memory to MemoryManager

2015-11-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-11389. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9344

[jira] [Commented] (SPARK-11482) Maven repo in IsolatedClientLoader should be configurable.

2015-11-06 Thread Xiu Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994721#comment-14994721 ] Xiu Guo commented on SPARK-11482: - I am working on a PR for this, will submit shortly. > Maven repo in

[jira] [Commented] (SPARK-11439) Optimization of creating sparse feature without dense one

2015-11-06 Thread Nakul Jindal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994861#comment-14994861 ] Nakul Jindal commented on SPARK-11439: -- This is the piece of R code that is used as reference for

[jira] [Created] (SPARK-11566) Refactoring GaussianMixtureModel.gaussians in Python

2015-11-06 Thread Yu Ishikawa (JIRA)
Yu Ishikawa created SPARK-11566: --- Summary: Refactoring GaussianMixtureModel.gaussians in Python Key: SPARK-11566 URL: https://issues.apache.org/jira/browse/SPARK-11566 Project: Spark Issue

[jira] [Resolved] (SPARK-11217) Model import/export for non-meta estimators and transformers

2015-11-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-11217. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9454

[jira] [Commented] (SPARK-11140) Replace file server in driver with RPC-based alternative

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994673#comment-14994673 ] Apache Spark commented on SPARK-11140: -- User 'vanzin' has created a pull request for this issue:

[jira] [Resolved] (SPARK-11555) spark on yarn spark-class --num-workers doesn't work

2015-11-06 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-11555. Resolution: Fixed Fix Version/s: 1.6.0 1.5.3 > spark on yarn

[jira] [Resolved] (SPARK-9241) Supporting multiple DISTINCT columns

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-9241. - Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9406

[jira] [Assigned] (SPARK-11515) QuantileDiscretizer should take random seed

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11515: Assignee: (was: Apache Spark) > QuantileDiscretizer should take random seed >

[jira] [Assigned] (SPARK-11515) QuantileDiscretizer should take random seed

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11515: Assignee: Apache Spark > QuantileDiscretizer should take random seed >

[jira] [Commented] (SPARK-11515) QuantileDiscretizer should take random seed

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994936#comment-14994936 ] Apache Spark commented on SPARK-11515: -- User 'yu-iskw' has created a pull request for this issue:

[jira] [Updated] (SPARK-9241) Supporting multiple DISTINCT columns

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9241: Assignee: Herman van Hovell > Supporting multiple DISTINCT columns >

[jira] [Created] (SPARK-11565) replace deprecated DigestUtils.shaHex call

2015-11-06 Thread Gabor Liptak (JIRA)
Gabor Liptak created SPARK-11565: Summary: replace deprecated DigestUtils.shaHex call Key: SPARK-11565 URL: https://issues.apache.org/jira/browse/SPARK-11565 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-11140) Replace file server in driver with RPC-based alternative

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11140: Assignee: Apache Spark > Replace file server in driver with RPC-based alternative >

[jira] [Assigned] (SPARK-11140) Replace file server in driver with RPC-based alternative

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11140: Assignee: (was: Apache Spark) > Replace file server in driver with RPC-based

[jira] [Created] (SPARK-11563) Use RpcEnv to transfer generated classes in spark-shell

2015-11-06 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-11563: -- Summary: Use RpcEnv to transfer generated classes in spark-shell Key: SPARK-11563 URL: https://issues.apache.org/jira/browse/SPARK-11563 Project: Spark

[jira] [Updated] (SPARK-11500) Not deterministic order of columns when using merging schemas.

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11500: - Target Version/s: 1.6.0 > Not deterministic order of columns when using merging schemas.

[jira] [Updated] (SPARK-11546) Thrift server makes too many logs about result schema

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11546: - Assignee: Navis > Thrift server makes too many logs about result schema >

[jira] [Resolved] (SPARK-11546) Thrift server makes too many logs about result schema

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11546. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9514

[jira] [Created] (SPARK-11562) Provide user an option to init SQLContext or HiveContext in spark shell

2015-11-06 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-11562: -- Summary: Provide user an option to init SQLContext or HiveContext in spark shell Key: SPARK-11562 URL: https://issues.apache.org/jira/browse/SPARK-11562 Project: Spark

[jira] [Created] (SPARK-11564) Dataset Java API

2015-11-06 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-11564: --- Summary: Dataset Java API Key: SPARK-11564 URL: https://issues.apache.org/jira/browse/SPARK-11564 Project: Spark Issue Type: Sub-task Components:

[jira] [Commented] (SPARK-11566) Refactoring GaussianMixtureModel.gaussians in Python

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994862#comment-14994862 ] Apache Spark commented on SPARK-11566: -- User 'yu-iskw' has created a pull request for this issue:

[jira] [Assigned] (SPARK-11566) Refactoring GaussianMixtureModel.gaussians in Python

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11566: Assignee: (was: Apache Spark) > Refactoring GaussianMixtureModel.gaussians in Python

[jira] [Assigned] (SPARK-11566) Refactoring GaussianMixtureModel.gaussians in Python

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11566: Assignee: Apache Spark > Refactoring GaussianMixtureModel.gaussians in Python >

[jira] [Updated] (SPARK-11566) Refactoring GaussianMixtureModel.gaussians in Python

2015-11-06 Thread Yu Ishikawa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Ishikawa updated SPARK-11566: Component/s: PySpark > Refactoring GaussianMixtureModel.gaussians in Python >

[jira] [Resolved] (SPARK-11112) DAG visualization: display RDD callsite

2015-11-06 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-11112. --- Resolution: Fixed Fix Version/s: 1.6.0 > DAG visualization: display RDD callsite >

[jira] [Resolved] (SPARK-8467) Add LDAModel.describeTopics() in Python

2015-11-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-8467. --- Resolution: Fixed Fix Version/s: 1.6.0 > Add LDAModel.describeTopics() in Python >

[jira] [Created] (SPARK-11567) Add Python API for corr in agg

2015-11-06 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-11567: Summary: Add Python API for corr in agg Key: SPARK-11567 URL: https://issues.apache.org/jira/browse/SPARK-11567 Project: Spark Issue Type: Task Affects

[jira] [Assigned] (SPARK-11567) Add Python API for corr in agg

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11567: Assignee: Apache Spark > Add Python API for corr in agg > --

[jira] [Commented] (SPARK-11567) Add Python API for corr in agg

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995106#comment-14995106 ] Apache Spark commented on SPARK-11567: -- User 'felixcheung' has created a pull request for this

[jira] [Assigned] (SPARK-11567) Add Python API for corr in agg

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11567: Assignee: (was: Apache Spark) > Add Python API for corr in agg >

[jira] [Updated] (SPARK-11567) Add Python API for corr aggregate function

2015-11-06 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-11567: - Summary: Add Python API for corr aggregate function (was: Add Python API for corr in agg) >

[jira] [Commented] (SPARK-10141) Number of tasks on executors still become negative after failures

2015-11-06 Thread Jay Luan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993323#comment-14993323 ] Jay Luan commented on SPARK-10141: -- I am also getting similar errors and Spark UI also shows -1. Spark

[jira] [Issue Comment Deleted] (SPARK-11551) Replace example code in ml-features.md using include_example

2015-11-06 Thread Vikas Nelamangala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Nelamangala updated SPARK-11551: -- Comment: was deleted (was: I am fixing this) > Replace example code in ml-features.md

[jira] [Commented] (SPARK-11551) Replace example code in ml-features.md using include_example

2015-11-06 Thread somil deshmukh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993354#comment-14993354 ] somil deshmukh commented on SPARK-11551: Pls assign to me ,I would like to work on this issue >

[jira] [Commented] (SPARK-11551) Replace example code in ml-features.md using include_example

2015-11-06 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993453#comment-14993453 ] Xusen Yin commented on SPARK-11551: --- Please go ahead and start doing it. This will assign to you after

[jira] [Commented] (SPARK-11551) Replace example code in ml-features.md using include_example

2015-11-06 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993452#comment-14993452 ] Xusen Yin commented on SPARK-11551: --- Please go ahead and start doing it. This will assign to you after

[jira] [Commented] (SPARK-11516) Spark application cannot be found from JSON API even though it exists

2015-11-06 Thread Charles Yeh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993457#comment-14993457 ] Charles Yeh commented on SPARK-11516: - This is because it errors when rebuilding the Spark UI if

[jira] [Resolved] (SPARK-11472) SparkContext creation error after sc.stop() when Spark is compiled for Hive

2015-11-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-11472. --- Resolution: Not A Problem You can reopen, or just make a new one. > SparkContext creation error

[jira] [Commented] (SPARK-11548) Replace example code in mllib-collaborative-filtering.md using include_example

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993393#comment-14993393 ] Apache Spark commented on SPARK-11548: -- User 'rishabhbhardwaj' has created a pull request for this

[jira] [Assigned] (SPARK-11548) Replace example code in mllib-collaborative-filtering.md using include_example

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11548: Assignee: Apache Spark > Replace example code in mllib-collaborative-filtering.md using

[jira] [Commented] (SPARK-10141) Number of tasks on executors still become negative after failures

2015-11-06 Thread Ohad Zadok (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993406#comment-14993406 ] Ohad Zadok commented on SPARK-10141: no, from the log I believe it's some kind of communication

[jira] [Issue Comment Deleted] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] patcharee updated SPARK-11087: -- Comment: was deleted (was: I found a scenario where the problem exists) >

[jira] [Issue Comment Deleted] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] patcharee updated SPARK-11087: -- Comment: was deleted (was: Hi [~zzhan], the problem actually happens when I generates orc file by

[jira] [Assigned] (SPARK-11507) Error thrown when using BlockMatrix.add

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11507: Assignee: Apache Spark > Error thrown when using BlockMatrix.add >

[jira] [Commented] (SPARK-11507) Error thrown when using BlockMatrix.add

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993440#comment-14993440 ] Apache Spark commented on SPARK-11507: -- User 'hhbyyh' has created a pull request for this issue:

[jira] [Assigned] (SPARK-11507) Error thrown when using BlockMatrix.add

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11507: Assignee: (was: Apache Spark) > Error thrown when using BlockMatrix.add >

[jira] [Commented] (SPARK-11549) Replace example code in mllib-evaluation-metrics.md using include_example

2015-11-06 Thread Vikas Nelamangala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993367#comment-14993367 ] Vikas Nelamangala commented on SPARK-11549: --- I am working on this > Replace example code in

[jira] [Reopened] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] patcharee reopened SPARK-11087: --- I found a scenario where the problem exists > spark.sql.orc.filterPushdown does not work, No ORC

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-11-06 Thread patcharee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993398#comment-14993398 ] patcharee commented on SPARK-11087: --- Hi [~zzhan], the problem actually happens when I generates orc

[jira] [Assigned] (SPARK-11548) Replace example code in mllib-collaborative-filtering.md using include_example

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11548: Assignee: (was: Apache Spark) > Replace example code in

[jira] [Updated] (SPARK-11530) Return eigenvalues with PCA model

2015-11-06 Thread Christos Iraklis Tsatsoulis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christos Iraklis Tsatsoulis updated SPARK-11530: Description: For data scientists & statisticians, PCA is of little

[jira] [Assigned] (SPARK-11535) StringIndexer should handle empty String specially

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11535: Assignee: Apache Spark > StringIndexer should handle empty String specially >

[jira] [Commented] (SPARK-11535) StringIndexer should handle empty String specially

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993546#comment-14993546 ] Apache Spark commented on SPARK-11535: -- User 'pravingadakh' has created a pull request for this

[jira] [Assigned] (SPARK-11535) StringIndexer should handle empty String specially

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11535: Assignee: (was: Apache Spark) > StringIndexer should handle empty String specially >

[jira] [Commented] (SPARK-11530) Return eigenvalues with PCA model

2015-11-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993619#comment-14993619 ] Sean Owen commented on SPARK-11530: --- I agree, and it should be fairly easy to get this info out of the

[jira] [Created] (SPARK-11553) row.getInt(i) if row[i]=null returns 0

2015-11-06 Thread Tofigh (JIRA)
Tofigh created SPARK-11553: -- Summary: row.getInt(i) if row[i]=null returns 0 Key: SPARK-11553 URL: https://issues.apache.org/jira/browse/SPARK-11553 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-11554) add map/flatMap to GroupedDataset

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993532#comment-14993532 ] Apache Spark commented on SPARK-11554: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Commented] (SPARK-7424) spark.ml classification, regression abstractions should add metadata to output column

2015-11-06 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994194#comment-14994194 ] holdenk commented on SPARK-7424: Noticed it got bumped again, would this be a good thing for someone else

[jira] [Updated] (SPARK-11219) Make Parameter Description Format Consistent in PySpark.MLlib

2015-11-06 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-11219: - Description: There are several different formats for describing params in PySpark.MLlib, making

[jira] [Resolved] (SPARK-9162) Implement code generation for ScalaUDF

2015-11-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-9162. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9270

[jira] [Commented] (SPARK-7675) PySpark spark.ml Params type conversions

2015-11-06 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994212#comment-14994212 ] holdenk commented on SPARK-7675: I'll give this a shot since I've been doing some other work in the

[jira] [Comment Edited] (SPARK-9039) Jobs page shows nonsensical task-progress-bar numbers when speculation occurs

2015-11-06 Thread Alexander Bozarth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994217#comment-14994217 ] Alexander Bozarth edited comment on SPARK-9039 at 11/6/15 6:58 PM: --- I've

[jira] [Commented] (SPARK-11531) PySpark SparseVector should have "Found duplicate indices" error message

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994230#comment-14994230 ] Apache Spark commented on SPARK-11531: -- User 'rekhajoshm' has created a pull request for this issue:

[jira] [Assigned] (SPARK-11531) PySpark SparseVector should have "Found duplicate indices" error message

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11531: Assignee: Apache Spark > PySpark SparseVector should have "Found duplicate indices" error

[jira] [Assigned] (SPARK-11531) PySpark SparseVector should have "Found duplicate indices" error message

2015-11-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11531: Assignee: (was: Apache Spark) > PySpark SparseVector should have "Found duplicate

[jira] [Commented] (SPARK-10309) Some tasks failed with Unable to acquire memory

2015-11-06 Thread Jit Ken Tan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994256#comment-14994256 ] Jit Ken Tan commented on SPARK-10309: - Abhishek, try disabling tungsten? ie.

[jira] [Updated] (SPARK-8520) Improve GLM's scalability on number of features

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-8520: - Target Version/s: (was: 1.6.0) > Improve GLM's scalability on number of features >

[jira] [Resolved] (SPARK-9930) Feature transformers in 1.6

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-9930. -- Resolution: Done Fix Version/s: 1.6.0 > Feature transformers in 1.6 >

[jira] [Updated] (SPARK-11521) LinearRegressionSummary needs to clarify which metrics are weighted

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-11521: -- Target Version/s: 1.6.0 (was: 1.7.0) > LinearRegressionSummary needs to clarify which metrics

[jira] [Updated] (SPARK-11521) LinearRegressionSummary needs to clarify which metrics are weighted in the documentation

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-11521: -- Summary: LinearRegressionSummary needs to clarify which metrics are weighted in the

[jira] [Updated] (SPARK-9837) Provide R-like summary statistics for GLMs via iteratively reweighted least squares

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9837: - Target Version/s: 1.7.0 (was: 1.6.0) > Provide R-like summary statistics for GLMs via

[jira] [Updated] (SPARK-9840) Support popular link functions in SparkR:::glm

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9840: - Issue Type: New Feature (was: Sub-task) Parent: (was: SPARK-9647) > Support popular

[jira] [Updated] (SPARK-9840) Support popular link functions in SparkR:::glm

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9840: - Target Version/s: 1.7.0 (was: 1.6.0) > Support popular link functions in SparkR:::glm >

[jira] [Commented] (SPARK-11511) Creating an InputDStream but not using it throws NPE

2015-11-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994267#comment-14994267 ] Shixiong Zhu commented on SPARK-11511: -- Got it. Thanks! > Creating an InputDStream but not using it

[jira] [Updated] (SPARK-10936) UDAF "mode" for categorical variables

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10936: -- Issue Type: New Feature (was: Sub-task) Parent: (was: SPARK-10384) > UDAF "mode"

[jira] [Updated] (SPARK-10936) UDAF "mode" for categorical variables

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10936: -- Target Version/s: 1.7.0 > UDAF "mode" for categorical variables >

[jira] [Commented] (SPARK-11373) Add metrics to the History Server and providers

2015-11-06 Thread Charles Yeh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994285#comment-14994285 ] Charles Yeh commented on SPARK-11373: - I could work on this but I need help getting started. I think

[jira] [Updated] (SPARK-10385) Bivariate statistics as UDAFs

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10385: -- Target Version/s: (was: 1.6.0) > Bivariate statistics as UDAFs >

[jira] [Commented] (SPARK-5114) Should Evaluator be a PipelineStage

2015-11-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994296#comment-14994296 ] Joseph K. Bradley commented on SPARK-5114: -- [~srowen] I agree we're too ambitious with setting

[jira] [Updated] (SPARK-10385) Bivariate statistics in DataFrames

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10385: -- Summary: Bivariate statistics in DataFrames (was: Bivariate statistics as UDAFs) > Bivariate

[jira] [Updated] (SPARK-10385) Bivariate statistics in DataFrames

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10385: -- Description: Similar to SPARK-10384, it would be nice to have bivariate statistics support in

[jira] [Issue Comment Deleted] (SPARK-9835) Iteratively reweighted least squares solver for GLMs

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9835: - Comment: was deleted (was: User 'davies' has created a pull request for this issue:

[jira] [Updated] (SPARK-9961) ML prediction abstractions should have defaultEvaluator fields

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9961: - Target Version/s: 1.7.0 (was: 1.6.0) > ML prediction abstractions should have defaultEvaluator

[jira] [Updated] (SPARK-9919) Matrices should respect Java's equals and hashCode contract

2015-11-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9919: - Target Version/s: 1.7.0 (was: 1.6.0) > Matrices should respect Java's equals and hashCode
