[jira] [Created] (SPARK-18856) Newly created catalog table assumed to have 0 rows and 0 bytes

2016-12-13 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18856: --- Summary: Newly created catalog table assumed to have 0 rows and 0 bytes Key: SPARK-18856 URL: https://issues.apache.org/jira/browse/SPARK-18856 Project: Spark

[jira] [Updated] (SPARK-18827) Cann't read broadcast if broadcast blocks are stored on-disk,

2016-12-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-18827: Summary: Cann't read broadcast if broadcast blocks are stored on-disk, (was: Cann't cache

[jira] [Updated] (SPARK-18827) Cann't read broadcast if broadcast blocks are stored on-disk

2016-12-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-18827: Summary: Cann't read broadcast if broadcast blocks are stored on-disk (was: Cann't read broadcast

[jira] [Updated] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18854: Target Version/s: 2.0.3, 2.1.1, 2.2.0 (was: 2.1.1, 2.2.0) > getNodeNumbered and

[jira] [Assigned] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18854: Assignee: (was: Apache Spark) > getNodeNumbered and generateTreeString are not

[jira] [Commented] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747497#comment-15747497 ] Apache Spark commented on SPARK-18854: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18854: Assignee: Apache Spark > getNodeNumbered and generateTreeString are not consistent >

[jira] [Updated] (SPARK-18356) KMeans should cache RDD before training

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18356: -- Summary: KMeans should cache RDD before training (was: Issue + Resolution: Kmeans

[jira] [Updated] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18854: Description: This is a bug introduced by subquery handling. generateTreeString numbers trees

[jira] [Updated] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18854: Description: This is a bug introduced by subquery handling. generateTreeString numbers trees

[jira] [Commented] (SPARK-18823) Assignation by column name variable not available or bug?

2016-12-13 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747433#comment-15747433 ] Felix Cheung commented on SPARK-18823: -- We will address both of your suggestions. As for x$y <-

[jira] [Commented] (SPARK-18823) Assignation by column name variable not available or bug?

2016-12-13 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747429#comment-15747429 ] Felix Cheung commented on SPARK-18823: -- For #2, I do agree it could get messy, but I was thinking

[jira] [Assigned] (SPARK-18855) Add RDD flatten function

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18855: Assignee: (was: Apache Spark) > Add RDD flatten function > >

[jira] [Assigned] (SPARK-18855) Add RDD flatten function

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18855: Assignee: Apache Spark > Add RDD flatten function > > >

[jira] [Commented] (SPARK-18855) Add RDD flatten function

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747415#comment-15747415 ] Apache Spark commented on SPARK-18855: -- User 'linbojin' has created a pull request for this issue:

[jira] [Commented] (SPARK-18849) Vignettes final checks for Spark 2.1

2016-12-13 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747412#comment-15747412 ] Felix Cheung commented on SPARK-18849: -- probably would be good to check for warning or error in

[jira] [Commented] (SPARK-18825) Eliminate duplicate links in SparkR API doc index

2016-12-13 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747410#comment-15747410 ] Felix Cheung commented on SPARK-18825: -- I will see what I can do... > Eliminate duplicate links in

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2016-12-13 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747407#comment-15747407 ] Jeff Zhang commented on SPARK-13587: If it is a pretty large cluster, then I would suggest setting up a

[jira] [Updated] (SPARK-18855) Add RDD flatten function

2016-12-13 Thread Linbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Linbo updated SPARK-18855: -- Target Version/s: (was: 2.1.0) > Add RDD flatten function > > >

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2016-12-13 Thread Prasanna Santhanam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747379#comment-15747379 ] Prasanna Santhanam commented on SPARK-13587: [~zjffdu] In case of Anaconda Python the

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2016-12-13 Thread Prasanna Santhanam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747381#comment-15747381 ] Prasanna Santhanam commented on SPARK-13587: [~zjffdu] In case of Anaconda Python the

[jira] [Issue Comment Deleted] (SPARK-13587) Support virtualenv in PySpark

2016-12-13 Thread Prasanna Santhanam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanna Santhanam updated SPARK-13587: --- Comment: was deleted (was: [~zjffdu] In case of Anaconda Python the environment is

[jira] [Created] (SPARK-18855) Add RDD flatten function

2016-12-13 Thread Linbo (JIRA)
Linbo created SPARK-18855: - Summary: Add RDD flatten function Key: SPARK-18855 URL: https://issues.apache.org/jira/browse/SPARK-18855 Project: Spark Issue Type: New Feature Components:

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2016-12-13 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747338#comment-15747338 ] Jeff Zhang commented on SPARK-13587: [~prasanna.santha...@icloud.com] I don't understand how this can

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2016-12-13 Thread Prasanna Santhanam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747291#comment-15747291 ] Prasanna Santhanam commented on SPARK-13587: [~nchammas] sorry, this got buried in several

[jira] [Commented] (SPARK-18278) Support native submission of spark jobs to a kubernetes cluster

2016-12-13 Thread Shuai Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747151#comment-15747151 ] Shuai Lin commented on SPARK-18278: --- bq. If I had to choose between maintaining a fork versus cleaning

[jira] [Comment Edited] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2016-12-13 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747120#comment-15747120 ] Liang-Chi Hsieh edited comment on SPARK-18281 at 12/14/16 3:48 AM: ---

[jira] [Commented] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2016-12-13 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747120#comment-15747120 ] Liang-Chi Hsieh commented on SPARK-18281: - [~mwdus...@us.ibm.com] Thanks for reporting this

[jira] [Resolved] (SPARK-18566) remove OverwriteOptions

2016-12-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-18566. - Resolution: Fixed Fix Version/s: 2.2.0 > remove OverwriteOptions >

[jira] [Commented] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747080#comment-15747080 ] Reynold Xin commented on SPARK-18854: - To test this, introduce a subquery and call

[jira] [Updated] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18854: Description: This is a bug introduced by subquery handling. generateTreeString numbers trees

[jira] [Created] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18854: --- Summary: getNodeNumbered and generateTreeString are not consistent Key: SPARK-18854 URL: https://issues.apache.org/jira/browse/SPARK-18854 Project: Spark

[jira] [Commented] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747078#comment-15747078 ] Reynold Xin commented on SPARK-18854: - cc [~smilegator] > getNodeNumbered and generateTreeString are

[jira] [Assigned] (SPARK-18588) KafkaSourceStressForDontFailOnDataLossSuite is flaky

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18588: Assignee: Shixiong Zhu (was: Apache Spark) > KafkaSourceStressForDontFailOnDataLossSuite

[jira] [Commented] (SPARK-18588) KafkaSourceStressForDontFailOnDataLossSuite is flaky

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746957#comment-15746957 ] Apache Spark commented on SPARK-18588: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18588) KafkaSourceStressForDontFailOnDataLossSuite is flaky

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18588: Assignee: Apache Spark (was: Shixiong Zhu) > KafkaSourceStressForDontFailOnDataLossSuite

[jira] [Resolved] (SPARK-18746) Add implicit encoders for BigDecimal, timestamp and date

2016-12-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-18746. - Resolution: Fixed Assignee: Weiqing Yang Fix Version/s: 2.2.0 > Add implicit

[jira] [Resolved] (SPARK-18840) HDFSCredentialProvider throws exception in non-HDFS security environment

2016-12-13 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-18840. Resolution: Fixed > HDFSCredentialProvider throws exception in non-HDFS security

[jira] [Commented] (SPARK-18840) HDFSCredentialProvider throws exception in non-HDFS security environment

2016-12-13 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746898#comment-15746898 ] Marcelo Vanzin commented on SPARK-18840: Ok. Since it's not a regression, let's do it if it

[jira] [Commented] (SPARK-18853) Project (UnaryNode) is way too aggressive in estimating statistics

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746888#comment-15746888 ] Apache Spark commented on SPARK-18853: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18853) Project (UnaryNode) is way too aggressive in estimating statistics

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18853: Assignee: (was: Apache Spark) > Project (UnaryNode) is way too aggressive in

[jira] [Assigned] (SPARK-18853) Project (UnaryNode) is way too aggressive in estimating statistics

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18853: Assignee: Apache Spark > Project (UnaryNode) is way too aggressive in estimating

[jira] [Commented] (SPARK-18840) HDFSCredentialProvider throws exception in non-HDFS security environment

2016-12-13 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746877#comment-15746877 ] Saisai Shao commented on SPARK-18840: - [~vanzin], is it necessary to fix it in old version (2.0/1.6),

[jira] [Commented] (SPARK-18814) CheckAnalysis rejects TPCDS query 32

2016-12-13 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746846#comment-15746846 ] Nattavut Sutyanyong commented on SPARK-18814: - As this JIRA will be brought to close shortly,

[jira] [Commented] (SPARK-7253) Add example of belief propagation with GraphX

2016-12-13 Thread Alexander Ulanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746841#comment-15746841 ] Alexander Ulanov commented on SPARK-7253: - Here is the implementation of belief propagation

[jira] [Resolved] (SPARK-18793) SparkR vignette update: random forest

2016-12-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-18793. --- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 Issue resolved by

[jira] [Resolved] (SPARK-18794) SparkR vignette update: gbt

2016-12-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-18794. --- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 Issue resolved by

[jira] [Updated] (SPARK-17455) IsotonicRegression takes non-polynomial time for some inputs

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-17455: -- Target Version/s: 2.0.3, 2.1.1, 2.2.0 > IsotonicRegression takes non-polynomial time

[jira] [Updated] (SPARK-18853) Project (UnaryNode) is way too aggressive in estimating statistics

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18853: Summary: Project (UnaryNode) is way too aggressive in estimating statistics (was: Project is way

[jira] [Created] (SPARK-18853) Project is way too aggressive in estimating statistics

2016-12-13 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18853: --- Summary: Project is way too aggressive in estimating statistics Key: SPARK-18853 URL: https://issues.apache.org/jira/browse/SPARK-18853 Project: Spark Issue

[jira] [Commented] (SPARK-18783) ML StringIndexer does not work with nested fields

2016-12-13 Thread manuel garrido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746637#comment-15746637 ] manuel garrido commented on SPARK-18783: I agree, it's not a deal breaker. Good to have it

[jira] [Commented] (SPARK-18783) ML StringIndexer does not work with nested fields

2016-12-13 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746570#comment-15746570 ] Bryan Cutler commented on SPARK-18783: -- I believe this is the case for most ML

[jira] [Commented] (SPARK-18837) Very long stage descriptions do not wrap in the UI

2016-12-13 Thread Alex Bozarth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746511#comment-15746511 ] Alex Bozarth commented on SPARK-18837: -- Did some digging and this *seems* to have been caused by

[jira] [Commented] (SPARK-18852) StreamingQuery.lastProgress should be null when recentProgress is empty

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746496#comment-15746496 ] Apache Spark commented on SPARK-18852: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18852) StreamingQuery.lastProgress should be null when recentProgress is empty

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18852: Assignee: (was: Apache Spark) > StreamingQuery.lastProgress should be null when

[jira] [Assigned] (SPARK-18852) StreamingQuery.lastProgress should be null when recentProgress is empty

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18852: Assignee: Apache Spark > StreamingQuery.lastProgress should be null when recentProgress

[jira] [Created] (SPARK-18852) StreamingQuery.lastProgress should be null when recentProgress is empty

2016-12-13 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-18852: Summary: StreamingQuery.lastProgress should be null when recentProgress is empty Key: SPARK-18852 URL: https://issues.apache.org/jira/browse/SPARK-18852 Project:

[jira] [Updated] (SPARK-18852) StreamingQuery.lastProgress should be null when recentProgress is empty

2016-12-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18852: - Affects Version/s: 2.1.0 > StreamingQuery.lastProgress should be null when recentProgress is

[jira] [Resolved] (SPARK-18851) DataSet Limit into Aggregate Results in NPE in Codegen

2016-12-13 Thread Russell Spitzer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Spitzer resolved SPARK-18851. - Resolution: Duplicate > DataSet Limit into Aggregate Results in NPE in Codegen >

[jira] [Updated] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4591: - Target Version/s: (was: 2.2.0) > Algorithm/model parity for spark.ml (Scala) >

[jira] [Commented] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746473#comment-15746473 ] Joseph K. Bradley commented on SPARK-4591: -- I also removed the target version since this includes

[jira] [Updated] (SPARK-18813) MLlib 2.2 Roadmap

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18813: -- Description: *PROPOSAL: This includes a proposal for the 2.2 roadmap process for

[jira] [Commented] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746471#comment-15746471 ] Joseph K. Bradley commented on SPARK-4591: -- I just updated this a bit. I did not finish linking

[jira] [Updated] (SPARK-18851) DataSet Limit into Aggregate Results in NPE in Codegen

2016-12-13 Thread Russell Spitzer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Spitzer updated SPARK-18851: Labels: regresion (was: ) > DataSet Limit into Aggregate Results in NPE in Codegen >

[jira] [Updated] (SPARK-14709) spark.ml API for linear SVM

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14709: -- Shepherd: Joseph K. Bradley Target Version/s: 2.2.0 > spark.ml API for

[jira] [Commented] (SPARK-14709) spark.ml API for linear SVM

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746462#comment-15746462 ] Joseph K. Bradley commented on SPARK-14709: --- Marking myself as shepherd per the 2.2 roadmap

[jira] [Updated] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4591: - Description: This is an umbrella JIRA for porting spark.mllib implementations to use the

[jira] [Resolved] (SPARK-18834) Expose event time time stats through StreamingQueryProgress

2016-12-13 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-18834. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 16258

[jira] [Commented] (SPARK-18850) Make StreamExecution serializable

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746436#comment-15746436 ] Apache Spark commented on SPARK-18850: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18850) Make StreamExecution serializable

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18850: Assignee: Apache Spark (was: Shixiong Zhu) > Make StreamExecution serializable >

[jira] [Assigned] (SPARK-18850) Make StreamExecution serializable

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18850: Assignee: Shixiong Zhu (was: Apache Spark) > Make StreamExecution serializable >

[jira] [Updated] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4591: - Description: This is an umbrella JIRA for porting spark.mllib implementations to use the

[jira] [Resolved] (SPARK-18843) Fix timeout in awaitResultInForkJoinSafely

2016-12-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18843. -- Resolution: Fixed Fix Version/s: 2.1.1 2.0.3 > Fix timeout in

[jira] [Updated] (SPARK-18851) DataSet Limit into Aggregate Results in NPE in Codegen

2016-12-13 Thread Russell Spitzer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Spitzer updated SPARK-18851: Summary: DataSet Limit into Aggregate Results in NPE in Codegen (was: DataSet

[jira] [Created] (SPARK-18851) DataSet limit.distinct Results in NPE in Codegen

2016-12-13 Thread Russell Spitzer (JIRA)
Russell Spitzer created SPARK-18851: --- Summary: DataSet limit.distinct Results in NPE in Codegen Key: SPARK-18851 URL: https://issues.apache.org/jira/browse/SPARK-18851 Project: Spark Issue

[jira] [Commented] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746414#comment-15746414 ] Reynold Xin commented on SPARK-18676: - That's the other option I was considering. It'd be good to

[jira] [Created] (SPARK-18850) Make StreamExecution serializable

2016-12-13 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-18850: Summary: Make StreamExecution serializable Key: SPARK-18850 URL: https://issues.apache.org/jira/browse/SPARK-18850 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4591: - Description: This is an umbrella JIRA for porting spark.mllib implementations to use the

[jira] [Updated] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4591: - Description: This is an umbrella JIRA for porting spark.mllib implementations to use the

[jira] [Created] (SPARK-18849) Vignettes final checks for Spark 2.1

2016-12-13 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-18849: - Summary: Vignettes final checks for Spark 2.1 Key: SPARK-18849 URL: https://issues.apache.org/jira/browse/SPARK-18849 Project: Spark Issue Type:

[jira] [Commented] (SPARK-18847) PageRank gives incorrect results for graphs with sinks

2016-12-13 Thread Andrew Ray (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746395#comment-15746395 ] Andrew Ray commented on SPARK-18847: I have, and have not found any that are relevant. I'm currently working on

[jira] [Updated] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4591: - Description: This is an umbrella JIRA for porting spark.mllib implementations to use the

[jira] [Updated] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4591: - Description: This is an umbrella JIRA for porting spark.mllib implementations to use the

[jira] [Updated] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4591: - Description: This is an umbrella JIRA for porting spark.mllib implementations to use the

[jira] [Commented] (SPARK-18845) PageRank has incorrect initialization value that leads to slow convergence

2016-12-13 Thread Andrew Ray (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746385#comment-15746385 ] Andrew Ray commented on SPARK-18845: [~srowen] No, that's a different thing, just whether the result

[jira] [Commented] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-13 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746382#comment-15746382 ] Michael Allman commented on SPARK-18676: Yeah, I was wondering how that would work with the

[jira] [Commented] (SPARK-18845) PageRank has incorrect initialization value that leads to slow convergence

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746375#comment-15746375 ] Apache Spark commented on SPARK-18845: -- User 'aray' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18845) PageRank has incorrect initialization value that leads to slow convergence

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18845: Assignee: (was: Apache Spark) > PageRank has incorrect initialization value that

[jira] [Assigned] (SPARK-18845) PageRank has incorrect initialization value that leads to slow convergence

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18845: Assignee: Apache Spark > PageRank has incorrect initialization value that leads to slow

[jira] [Comment Edited] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746370#comment-15746370 ] Davies Liu edited comment on SPARK-18676 at 12/13/16 9:47 PM: -- I had a

[jira] [Commented] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746370#comment-15746370 ] Davies Liu commented on SPARK-18676: I had a working prototype, but it introduced some weird behavior,

[jira] [Comment Edited] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746350#comment-15746350 ] Joseph K. Bradley edited comment on SPARK-4591 at 12/13/16 9:41 PM:

[jira] [Commented] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2016-12-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746350#comment-15746350 ] Joseph K. Bradley commented on SPARK-4591: -- Good point. It should be. I'll add it. >

[jira] [Resolved] (SPARK-18816) executor page fails to show log links if executors are added after an app is launched

2016-12-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18816. --- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 Issue resolved by pull

[jira] [Commented] (SPARK-18847) PageRank gives incorrect results for graphs with sinks

2016-12-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746333#comment-15746333 ] Sean Owen commented on SPARK-18847: --- Before you open more, can you review old JIRAs about this? >

[jira] [Resolved] (SPARK-18848) PageRank gives incorrect results for graphs with sinks

2016-12-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18848. --- Resolution: Duplicate > PageRank gives incorrect results for graphs with sinks >

[jira] [Created] (SPARK-18848) PageRank gives incorrect results for graphs with sinks

2016-12-13 Thread Andrew Ray (JIRA)
Andrew Ray created SPARK-18848: -- Summary: PageRank gives incorrect results for graphs with sinks Key: SPARK-18848 URL: https://issues.apache.org/jira/browse/SPARK-18848 Project: Spark Issue

[jira] [Commented] (SPARK-18846) Fix flakiness in SchedulerIntegrationSuite

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746326#comment-15746326 ] Apache Spark commented on SPARK-18846: -- User 'squito' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18846) Fix flakiness in SchedulerIntegrationSuite

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18846: Assignee: Imran Rashid (was: Apache Spark) > Fix flakiness in SchedulerIntegrationSuite

[jira] [Assigned] (SPARK-18846) Fix flakiness in SchedulerIntegrationSuite

2016-12-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18846: Assignee: Apache Spark (was: Imran Rashid) > Fix flakiness in SchedulerIntegrationSuite
