[jira] [Commented] (SPARK-14906) Move VectorUDT and MatrixUDT in PySpark to new ML package

2016-04-29 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263660#comment-15263660 ] Yanbo Liang commented on SPARK-14906: - I'm working on another issue today, so it's be

[jira] [Resolved] (SPARK-14886) RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14886. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12756 [https:/

[jira] [Updated] (SPARK-14886) RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14886: --- Assignee: Sean Owen > RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

[jira] [Resolved] (SPARK-12660) Rewrite except using anti-join

2016-04-29 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-12660. - Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12736 [https://githu

[jira] [Resolved] (SPARK-14967) EXCEPT does not follow SQL compliance

2016-04-29 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-14967. - Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12736 [https://githu

[jira] [Updated] (SPARK-14967) EXCEPT does not follow SQL compliance

2016-04-29 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-14967: Assignee: Xiao Li > EXCEPT does not follow SQL compliance > - >

[jira] [Updated] (SPARK-12660) Rewrite except using anti-join

2016-04-29 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-12660: Assignee: Xiao Li > Rewrite except using anti-join > -- > >

[jira] [Resolved] (SPARK-14996) Add TPCDS Benchmark Queries for SparkSQL

2016-04-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-14996. - Resolution: Fixed Assignee: Sameer Agarwal Fix Version/s: 2.0.0 > Add TPCDS Bench

[jira] [Commented] (SPARK-14963) YarnShuffleService should use YARN getRecoveryPath() for leveldb location

2016-04-29 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263706#comment-15263706 ] Saisai Shao commented on SPARK-14963: - Hi [~tgraves], would you mind me taking a crac

[jira] [Updated] (SPARK-1989) Exit executors faster if they get into a cycle of heavy GC

2016-04-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-1989: - Priority: Minor (was: Major) > Exit executors faster if they get into a cycle of heavy GC > -

[jira] [Resolved] (SPARK-1989) Exit executors faster if they get into a cycle of heavy GC

2016-04-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-1989. -- Resolution: Won't Fix > Exit executors faster if they get into a cycle of heavy GC > ---

[jira] [Commented] (SPARK-14977) Fine grained mode in Mesos is not fair

2016-04-29 Thread Luca Bruno (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263720#comment-15263720 ] Luca Bruno commented on SPARK-14977: Thanks for the reply. Yes, they are long running

[jira] [Comment Edited] (SPARK-14977) Fine grained mode in Mesos is not fair

2016-04-29 Thread Luca Bruno (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263720#comment-15263720 ] Luca Bruno edited comment on SPARK-14977 at 4/29/16 8:08 AM: -

[jira] [Resolved] (SPARK-14994) Remove execution hive from HiveSessionState

2016-04-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-14994. - Resolution: Fixed Fix Version/s: 2.0.0 > Remove execution hive from HiveSessionState > ---

[jira] [Resolved] (SPARK-14941) Remove runtime HiveConf

2016-04-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-14941. - Resolution: Fixed Fix Version/s: 2.0.0 > Remove runtime HiveConf > ---

[jira] [Commented] (SPARK-14591) Remove org.apache.spark.sql.catalyst.parser.DataTypeParser

2016-04-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263730#comment-15263730 ] Reynold Xin commented on SPARK-14591: - [~lian cheng] / [~cloud_fan] if you have some

[jira] [Commented] (SPARK-4820) Spark build encounters "File name too long" on some encrypted filesystems

2016-04-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263733#comment-15263733 ] Sean Owen commented on SPARK-4820: -- We can't commit this change to the build since it wil

[jira] [Commented] (SPARK-14962) spark.sql.orc.filterPushdown=true breaks DataFrame where functionality

2016-04-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263759#comment-15263759 ] Hyukjin Kwon commented on SPARK-14962: -- I see. This was because ORC tries to apply a

[jira] [Resolved] (SPARK-14511) Publish our forked genjavadoc for 2.12.0-M4 or stop using a forked version

2016-04-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-14511. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12707 [https://github.co

[jira] [Assigned] (SPARK-14891) ALS in ML never validates input schema

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-14891: -- Assignee: Nick Pentreath > ALS in ML never validates input schema > --

[jira] [Commented] (SPARK-14900) spark.ml classification metrics should include accuracy

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263778#comment-15263778 ] Nick Pentreath commented on SPARK-14900: Sure, go ahead > spark.ml classificatio

[jira] [Resolved] (SPARK-14969) Remove unnecessary compute function in LogisticGradient

2016-04-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-14969. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12747 [https://github.co

[jira] [Updated] (SPARK-14969) Remove unnecessary compute function in LogisticGradient

2016-04-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-14969: -- Assignee: ding > Remove unnecessary compute function in LogisticGradient >

[jira] [Created] (SPARK-14999) RDDs union checks if the ==

2016-04-29 Thread Noam Asor (JIRA)
Noam Asor created SPARK-14999: - Summary: RDDs union checks if the == Key: SPARK-14999 URL: https://issues.apache.org/jira/browse/SPARK-14999 Project: Spark Issue Type: Improvement Rep

[jira] [Closed] (SPARK-14999) RDDs union checks if the ==

2016-04-29 Thread Noam Asor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noam Asor closed SPARK-14999. - Resolution: Invalid > RDDs union checks if the == > --- > > Key:

[jira] [Updated] (SPARK-14999) RDDs union

2016-04-29 Thread Noam Asor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noam Asor updated SPARK-14999: -- Summary: RDDs union (was: RDDs union checks if the ==) > RDDs union > --- > >

[jira] [Commented] (SPARK-14962) spark.sql.orc.filterPushdown=true breaks DataFrame where functionality

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263948#comment-15263948 ] Apache Spark commented on SPARK-14962: -- User 'HyukjinKwon' has created a pull reques

[jira] [Assigned] (SPARK-14962) spark.sql.orc.filterPushdown=true breaks DataFrame where functionality

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14962: Assignee: Apache Spark > spark.sql.orc.filterPushdown=true breaks DataFrame where function

[jira] [Assigned] (SPARK-14962) spark.sql.orc.filterPushdown=true breaks DataFrame where functionality

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14962: Assignee: (was: Apache Spark) > spark.sql.orc.filterPushdown=true breaks DataFrame whe

[jira] [Updated] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees in mllib.

2016-04-29 Thread Partha Talukder (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Partha Talukder updated SPARK-14975: Labels: GradientBoostingTrees mllib (was: newbie) > Predicted Probability per training ins

[jira] [Commented] (SPARK-14963) YarnShuffleService should use YARN getRecoveryPath() for leveldb location

2016-04-29 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264010#comment-15264010 ] Thomas Graves commented on SPARK-14963: --- I'm definitely fine with it but someone el

[jira] [Created] (SPARK-15000) Spark hangs indefinitely if you cache a dataframe, then show it, then do some further processing on it

2016-04-29 Thread Jamie Hutton (JIRA)
Jamie Hutton created SPARK-15000: Summary: Spark hangs indefinitely if you cache a dataframe, then show it, then do some further processing on it Key: SPARK-15000 URL: https://issues.apache.org/jira/browse/SPARK-1

[jira] [Created] (SPARK-15001) Cherry-pick Wide Table Support for Parquet Codegen from Spark 2.0

2016-04-29 Thread Jerome Gagnon (JIRA)
Jerome Gagnon created SPARK-15001: - Summary: Cherry-pick Wide Table Support for Parquet Codegen from Spark 2.0 Key: SPARK-15001 URL: https://issues.apache.org/jira/browse/SPARK-15001 Project: Spark

[jira] [Created] (SPARK-15002) Calling unpersist can cause spark to hang indefinitely when writing out a result

2016-04-29 Thread Jamie Hutton (JIRA)
Jamie Hutton created SPARK-15002: Summary: Calling unpersist can cause spark to hang indefinitely when writing out a result Key: SPARK-15002 URL: https://issues.apache.org/jira/browse/SPARK-15002 Proj

[jira] [Commented] (SPARK-14315) GLMs model persistence in SparkR

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264056#comment-15264056 ] Apache Spark commented on SPARK-14315: -- User 'yanboliang' has created a pull request

[jira] [Commented] (SPARK-14314) K-means model persistence in SparkR

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264055#comment-15264055 ] Apache Spark commented on SPARK-14314: -- User 'yanboliang' has created a pull request

[jira] [Commented] (SPARK-12981) Dataframe distinct() followed by a filter(udf) in pyspark throws a casting error

2016-04-29 Thread Tom Arnfeld (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264084#comment-15264084 ] Tom Arnfeld commented on SPARK-12981: - Any chance we can get this in a 1.6.2 point re

[jira] [Resolved] (SPARK-14571) Log instrumentation in ALS

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14571. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12560 [https:/

[jira] [Updated] (SPARK-14571) Log instrumentation in ALS

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14571: --- Assignee: Miao Wang > Log instrumentation in ALS > -- > >

[jira] [Assigned] (SPARK-15003) Use ConcurrentHashMap in place of HashMap for NewAccumulator.originals

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15003: Assignee: (was: Apache Spark) > Use ConcurrentHashMap in place of HashMap for NewAccum

[jira] [Created] (SPARK-15003) Use ConcurrentHashMap in place of HashMap for NewAccumulator.originals

2016-04-29 Thread Ted Yu (JIRA)
Ted Yu created SPARK-15003: -- Summary: Use ConcurrentHashMap in place of HashMap for NewAccumulator.originals Key: SPARK-15003 URL: https://issues.apache.org/jira/browse/SPARK-15003 Project: Spark I

[jira] [Assigned] (SPARK-15003) Use ConcurrentHashMap in place of HashMap for NewAccumulator.originals

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15003: Assignee: Apache Spark > Use ConcurrentHashMap in place of HashMap for NewAccumulator.orig

[jira] [Commented] (SPARK-15003) Use ConcurrentHashMap in place of HashMap for NewAccumulator.originals

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264135#comment-15264135 ] Apache Spark commented on SPARK-15003: -- User 'tedyu' has created a pull request for

[jira] [Updated] (SPARK-15003) Use ConcurrentHashMap in place of HashMap for NewAccumulator.originals

2016-04-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-15003: -- Priority: Minor (was: Major) (Why would it improve performance?) > Use ConcurrentHashMap in place of

[jira] [Commented] (SPARK-14533) RowMatrix.computeCovariance inaccurate when values are very large

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264171#comment-15264171 ] Apache Spark commented on SPARK-14533: -- User 'srowen' has created a pull request for

[jira] [Updated] (SPARK-15001) Cherry-pick Wide Table Support for Parquet Codegen from Spark 2.0

2016-04-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-15001: -- Fix Version/s: (was: 1.6.2) Don't set fix version. This shouldn't be another JIRA; ask on the sourc

[jira] [Commented] (SPARK-14302) Python examples code merge and clean up

2016-04-29 Thread Saikat Kanjilal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264207#comment-15264207 ] Saikat Kanjilal commented on SPARK-14302: - Ok I finished my initial assessment of

[jira] [Commented] (SPARK-11316) coalesce doesn't handle UnionRDD with partial locality properly

2016-04-29 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264205#comment-15264205 ] Thomas Graves commented on SPARK-11316: --- Simple steps to reproduce an RDD with part

[jira] [Commented] (SPARK-14224) Cannot project all columns from a table with ~1,100 columns

2016-04-29 Thread Jerome Gagnon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264219#comment-15264219 ] Jerome Gagnon commented on SPARK-14224: --- Can this be backported to 1.6.2? > Cann

[jira] [Commented] (SPARK-8971) Support balanced class labels when splitting train/cross validation sets

2016-04-29 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264224#comment-15264224 ] Seth Hendrickson commented on SPARK-8971: - I meant label column. Sorry for the con

[jira] [Resolved] (SPARK-14987) Inline Hive thrift-server into Spark

2016-04-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-14987. - Resolution: Fixed Fix Version/s: 2.0.0 > Inline Hive thrift-server into Spark > --

[jira] [Resolved] (SPARK-14988) Implement catalog and conf API in Python SparkSession

2016-04-29 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-14988. --- Resolution: Fixed Fix Version/s: 2.0.0 > Implement catalog and conf API in Python SparkSession

[jira] [Assigned] (SPARK-15004) Remove zookeeper and service discovery related code

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15004: Assignee: Apache Spark (was: Reynold Xin) > Remove zookeeper and service discovery relate

[jira] [Assigned] (SPARK-15004) Remove zookeeper and service discovery related code

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15004: Assignee: Reynold Xin (was: Apache Spark) > Remove zookeeper and service discovery relate

[jira] [Commented] (SPARK-15004) Remove zookeeper and service discovery related code

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264304#comment-15264304 ] Apache Spark commented on SPARK-15004: -- User 'rxin' has created a pull request for t

[jira] [Updated] (SPARK-14831) Make ML APIs in SparkR consistent

2016-04-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-14831: -- Assignee: Timothy Hunter (was: Xiangrui Meng) > Make ML APIs in SparkR consistent > --

[jira] [Created] (SPARK-15004) Remove zookeeper and service discovery related code

2016-04-29 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-15004: --- Summary: Remove zookeeper and service discovery related code Key: SPARK-15004 URL: https://issues.apache.org/jira/browse/SPARK-15004 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-14314) K-means model persistence in SparkR

2016-04-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-14314. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12778 [https://g

[jira] [Resolved] (SPARK-14315) GLMs model persistence in SparkR

2016-04-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-14315. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12778 [https://g

[jira] [Commented] (SPARK-14831) Make ML APIs in SparkR consistent

2016-04-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264328#comment-15264328 ] Xiangrui Meng commented on SPARK-14831: --- Talked to [~timhunter] offline and he will

[jira] [Created] (SPARK-15005) Usage of Temp Table twice in Hive query fails with bad error

2016-04-29 Thread dciborow (JIRA)
dciborow created SPARK-15005: Summary: Usage of Temp Table twice in Hive query fails with bad error Key: SPARK-15005 URL: https://issues.apache.org/jira/browse/SPARK-15005 Project: Spark Issue T

[jira] [Created] (SPARK-15006) Generated JavaDoc should hide package private objects

2016-04-29 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-15006: - Summary: Generated JavaDoc should hide package private objects Key: SPARK-15006 URL: https://issues.apache.org/jira/browse/SPARK-15006 Project: Spark Issue

[jira] [Created] (SPARK-15007) Usage of Temp Table twice in Hive query

2016-04-29 Thread dciborow (JIRA)
dciborow created SPARK-15007: Summary: Usage of Temp Table twice in Hive query Key: SPARK-15007 URL: https://issues.apache.org/jira/browse/SPARK-15007 Project: Spark Issue Type: Wish

[jira] [Updated] (SPARK-15007) Usage of Temp Table twice in Hive query

2016-04-29 Thread dciborow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dciborow updated SPARK-15007: - Component/s: SQL > Usage of Temp Table twice in Hive query > >

[jira] [Resolved] (SPARK-15007) Usage of Temp Table twice in Hive query

2016-04-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-15007. --- Resolution: Duplicate > Usage of Temp Table twice in Hive query > --

[jira] [Commented] (SPARK-14816) Update MLlib, GraphX, SparkR websites for 2.0

2016-04-29 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264370#comment-15264370 ] Timothy Hunter commented on SPARK-14816: Also, add a comment about the {{doparall

[jira] [Comment Edited] (SPARK-14816) Update MLlib, GraphX, SparkR websites for 2.0

2016-04-29 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264370#comment-15264370 ] Timothy Hunter edited comment on SPARK-14816 at 4/29/16 5:21 PM: --

[jira] [Resolved] (SPARK-14984) Update LinearRegression, LogisticRegression summary APIs

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-14984. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12763 [h

[jira] [Resolved] (SPARK-11940) Python API for ml.clustering.LDA

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-11940. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12723 [h

[jira] [Commented] (SPARK-14346) SHOW CREATE TABLE command (Native)

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264426#comment-15264426 ] Apache Spark commented on SPARK-14346: -- User 'liancheng' has created a pull request

[jira] [Commented] (SPARK-14224) Cannot project all columns from a table with ~1,100 columns

2016-04-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264434#comment-15264434 ] Davies Liu commented on SPARK-14224: [~jgagnon] This bug does not exist in 1.6 branc

[jira] [Commented] (SPARK-12981) Dataframe distinct() followed by a filter(udf) in pyspark throws a casting error

2016-04-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264437#comment-15264437 ] Davies Liu commented on SPARK-12981: This depends on several changes in 2.0, it's not

[jira] [Updated] (SPARK-14706) Python ML persistence integration test

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14706: -- Target Version/s: 2.1.0 (was: 2.0.0) > Python ML persistence integration test > --

[jira] [Updated] (SPARK-14931) Mismatched default values between pipelines in Spark and PySpark

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14931: -- Target Version/s: 2.0.0 > Mismatched default values between pipelines in Spark and PySp

[jira] [Created] (SPARK-15008) Python ML persistence integration test: OneVsRest

2016-04-29 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-15008: - Summary: Python ML persistence integration test: OneVsRest Key: SPARK-15008 URL: https://issues.apache.org/jira/browse/SPARK-15008 Project: Spark I

[jira] [Updated] (SPARK-14973) The CrossValidator and TrainValidationSplit miss the seed when saving and loading

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14973: -- Target Version/s: 2.0.0 > The CrossValidator and TrainValidationSplit miss the seed whe

[jira] [Commented] (SPARK-13786) Pyspark ml.tuning support export/import

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264482#comment-15264482 ] Joseph K. Bradley commented on SPARK-13786: --- Per discussion on [https://github.

[jira] [Commented] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees in mllib.

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264505#comment-15264505 ] Joseph K. Bradley commented on SPARK-14975: --- Thanks for reporting this. Please

[jira] [Updated] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees in mllib.

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14975: -- Component/s: ML > Predicted Probability per training instance for Gradient Boosted Tree

[jira] [Updated] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees in mllib.

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14975: -- Target Version/s: (was: 1.6.1) > Predicted Probability per training instance for Grad

[jira] [Comment Edited] (SPARK-13786) Pyspark ml.tuning support export/import

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264482#comment-15264482 ] Joseph K. Bradley edited comment on SPARK-13786 at 4/29/16 6:35 PM: ---

[jira] [Updated] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees in mllib.

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14975: -- Priority: Minor (was: Major) > Predicted Probability per training instance for Gradien

[jira] [Updated] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees in mllib.

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14975: -- Issue Type: New Feature (was: Improvement) > Predicted Probability per training instan

[jira] [Updated] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees in mllib.

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14975: -- Labels: mllib (was: GradientBoostingTrees mllib) > Predicted Probability per training

[jira] [Updated] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees in mllib.

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14975: -- Affects Version/s: (was: 1.6.1) > Predicted Probability per training instance for G

[jira] [Created] (SPARK-15009) PySpark CountVectorizerModel should be able to construct from vocabulary list

2016-04-29 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-15009: Summary: PySpark CountVectorizerModel should be able to construct from vocabulary list Key: SPARK-15009 URL: https://issues.apache.org/jira/browse/SPARK-15009 Project

[jira] [Commented] (SPARK-15009) PySpark CountVectorizerModel should be able to construct from vocabulary list

2016-04-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264515#comment-15264515 ] Bryan Cutler commented on SPARK-15009: -- I'm working on this > PySpark CountVectoriz

[jira] [Assigned] (SPARK-13786) Pyspark ml.tuning support export/import

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13786: Assignee: (was: Apache Spark) > Pyspark ml.tuning support export/import >

[jira] [Updated] (SPARK-13786) Pyspark ml.tuning support export/import

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13786: -- Fix Version/s: (was: 2.0.0) > Pyspark ml.tuning support export/import > ---

[jira] [Updated] (SPARK-13786) Pyspark ml.tuning support export/import

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13786: -- Target Version/s: 2.1.0 (was: 2.0.0) > Pyspark ml.tuning support export/import > -

[jira] [Reopened] (SPARK-13786) Pyspark ml.tuning support export/import

2016-04-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reopened SPARK-13786: --- Assignee: (was: Xusen Yin) > Pyspark ml.tuning support export/import >

[jira] [Assigned] (SPARK-13786) Pyspark ml.tuning support export/import

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13786: Assignee: Apache Spark > Pyspark ml.tuning support export/import > ---

[jira] [Commented] (SPARK-13786) Pyspark ml.tuning support export/import

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264528#comment-15264528 ] Apache Spark commented on SPARK-13786: -- User 'jkbradley' has created a pull request

[jira] [Created] (SPARK-15010) Lots of error messages about accumulator in Spark shell when a task takes some time to run

2016-04-29 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-15010: - Summary: Lots of error messages about accumulator in Spark shell when a task takes some time to run Key: SPARK-15010 URL: https://issues.apache.org/jira/browse/SPARK-15010

[jira] [Created] (SPARK-15011) org.apache.spark.sql.hive.StatisticsSuite.analyze MetastoreRelations fails when hadoop 2.3 or hadoop 2.4 is used

2016-04-29 Thread Yin Huai (JIRA)
Yin Huai created SPARK-15011: Summary: org.apache.spark.sql.hive.StatisticsSuite.analyze MetastoreRelations fails when hadoop 2.3 or hadoop 2.4 is used Key: SPARK-15011 URL: https://issues.apache.org/jira/browse/SPARK

[jira] [Assigned] (SPARK-15011) org.apache.spark.sql.hive.StatisticsSuite.analyze MetastoreRelations fails when hadoop 2.3 or hadoop 2.4 is used

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15011: Assignee: Apache Spark > org.apache.spark.sql.hive.StatisticsSuite.analyze MetastoreRelati

[jira] [Assigned] (SPARK-15011) org.apache.spark.sql.hive.StatisticsSuite.analyze MetastoreRelations fails when hadoop 2.3 or hadoop 2.4 is used

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15011: Assignee: (was: Apache Spark) > org.apache.spark.sql.hive.StatisticsSuite.analyze Meta

[jira] [Commented] (SPARK-15011) org.apache.spark.sql.hive.StatisticsSuite.analyze MetastoreRelations fails when hadoop 2.3 or hadoop 2.4 is used

2016-04-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264560#comment-15264560 ] Apache Spark commented on SPARK-15011: -- User 'yhuai' has created a pull request for

[jira] [Updated] (SPARK-15011) org.apache.spark.sql.hive.StatisticsSuite.analyze MetastoreRelations fails when hadoop 2.3 or hadoop 2.4 is used

2016-04-29 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-15011: - Labels: flaky-test (was: ) > org.apache.spark.sql.hive.StatisticsSuite.analyze MetastoreRelations fails

[jira] [Commented] (SPARK-15010) Lots of error messages about accumulator in Spark shell when a task takes some time to run

2016-04-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264576#comment-15264576 ] Josh Rosen commented on SPARK-15010: It looks like we're hitting this in the heartbea
