[jira] [Assigned] (SPARK-19650) Metastore-only operations shouldn't trigger a spark job

2017-02-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19650: --- Assignee: Herman van Hovell (was: Sameer Agarwal) > Metastore-only operations shouldn't

[jira] [Resolved] (SPARK-19650) Metastore-only operations shouldn't trigger a spark job

2017-02-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19650. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17027

[jira] [Resolved] (SPARK-19735) Remove HOLD_DDLTIME from Catalog APIs

2017-02-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19735. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17063

[jira] [Commented] (SPARK-17075) Cardinality Estimation of Predicate Expressions

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884084#comment-15884084 ] Apache Spark commented on SPARK-17075: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Updated] (SPARK-19737) New analysis rule for reporting unregistered functions without relying on relation resolution

2017-02-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-19737: --- Description: Let's consider the following simple SQL query that references an invalid function

[jira] [Updated] (SPARK-19737) New analysis rule for reporting unregistered functions without relying on relation resolution

2017-02-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-19737: --- Description: Let's consider the following simple SQL query that references an invalid function

[jira] [Assigned] (SPARK-19736) refreshByPath should clear all cached plans with the specified path

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19736: Assignee: (was: Apache Spark) > refreshByPath should clear all cached plans with the

[jira] [Commented] (SPARK-19736) refreshByPath should clear all cached plans with the specified path

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884065#comment-15884065 ] Apache Spark commented on SPARK-19736: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19736) refreshByPath should clear all cached plans with the specified path

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19736: Assignee: Apache Spark > refreshByPath should clear all cached plans with the specified

[jira] [Created] (SPARK-19737) New analysis rule for reporting unregistered functions without relying on relation resolution

2017-02-24 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-19737: -- Summary: New analysis rule for reporting unregistered functions without relying on relation resolution Key: SPARK-19737 URL: https://issues.apache.org/jira/browse/SPARK-19737

[jira] [Commented] (SPARK-15678) Not use cache on appends and overwrites

2017-02-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884061#comment-15884061 ] Liang-Chi Hsieh commented on SPARK-15678: - [~kiszk][~gen] I created SPARK-19736 for the reported

[jira] [Created] (SPARK-19736) refreshByPath should clear all cached plans with the specified path

2017-02-24 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-19736: --- Summary: refreshByPath should clear all cached plans with the specified path Key: SPARK-19736 URL: https://issues.apache.org/jira/browse/SPARK-19736 Project:

[jira] [Commented] (SPARK-19352) Sorting issues on relatively big datasets

2017-02-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884032#comment-15884032 ] Wenchen Fan commented on SPARK-19352: - I'm going to mark it as `not a problem`. Spark doesn't

[jira] [Resolved] (SPARK-19352) Sorting issues on relatively big datasets

2017-02-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19352. - Resolution: Not A Problem > Sorting issues on relatively big datasets >

[jira] [Commented] (SPARK-19352) Sorting issues on relatively big datasets

2017-02-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884016#comment-15884016 ] Liang-Chi Hsieh commented on SPARK-19352: - I think this is in fact solved by SPARK-19563.

[jira] [Commented] (SPARK-19735) Remove HOLD_DDLTIME from Catalog APIs

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884012#comment-15884012 ] Apache Spark commented on SPARK-19735: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19735) Remove HOLD_DDLTIME from Catalog APIs

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19735: Assignee: Xiao Li (was: Apache Spark) > Remove HOLD_DDLTIME from Catalog APIs >

[jira] [Assigned] (SPARK-19735) Remove HOLD_DDLTIME from Catalog APIs

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19735: Assignee: Apache Spark (was: Xiao Li) > Remove HOLD_DDLTIME from Catalog APIs >

[jira] [Created] (SPARK-19735) Remove HOLD_DDLTIME from Catalog APIs

2017-02-24 Thread Xiao Li (JIRA)
Xiao Li created SPARK-19735: --- Summary: Remove HOLD_DDLTIME from Catalog APIs Key: SPARK-19735 URL: https://issues.apache.org/jira/browse/SPARK-19735 Project: Spark Issue Type: Improvement

[jira] [Comment Edited] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2017-02-24 Thread Lin Ma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884007#comment-15884007 ] Lin Ma edited comment on SPARK-18281 at 2/25/17 4:13 AM: - Is this bug really

[jira] [Commented] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2017-02-24 Thread Lin Ma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884007#comment-15884007 ] Lin Ma commented on SPARK-18281: Is this bug really resolved? I am using the latest 2.1.0 release and

[jira] [Resolved] (SPARK-14079) Limit the number of queries on SQL UI

2017-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-14079. --- Resolution: Not A Problem > Limit the number of queries on SQL UI >

[jira] [Commented] (SPARK-14079) Limit the number of queries on SQL UI

2017-02-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883968#comment-15883968 ] Hyukjin Kwon commented on SPARK-14079: -- I am adding a link. Please correct this if wrong. > Limit

[jira] [Comment Edited] (SPARK-14079) Limit the number of queries on SQL UI

2017-02-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883966#comment-15883966 ] Hyukjin Kwon edited comment on SPARK-14079 at 2/25/17 2:47 AM: --- [~zsxwing],

[jira] [Commented] (SPARK-14079) Limit the number of queries on SQL UI

2017-02-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883966#comment-15883966 ] Hyukjin Kwon commented on SPARK-14079: -- [~shixi...@databricks.com], I am just curious if this JIRA

[jira] [Commented] (SPARK-19734) OneHotEncoder __init__ uses dropLast but doc strings all say includeFirst

2017-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883953#comment-15883953 ] Sean Owen commented on SPARK-19734: --- Agreed, feel free to open a PR to fix it. > OneHotEncoder

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883951#comment-15883951 ] Apache Spark commented on SPARK-17495: -- User 'tejasapatil' has created a pull request for this

[jira] [Assigned] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13446: Assignee: Apache Spark > Spark need to support reading data from Hive 2.0.0 metastore >

[jira] [Assigned] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13446: Assignee: (was: Apache Spark) > Spark need to support reading data from Hive 2.0.0

[jira] [Commented] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883943#comment-15883943 ] Apache Spark commented on SPARK-13446: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Resolved] (SPARK-19725) different parquet dependency in spark2.x and Hive2.x cause failure of HoS when using parquet file format

2017-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19725. --- Resolution: Not A Problem Hive 2 isn't supported, is it? Spark is already on Parquet 1.8. >

[jira] [Created] (SPARK-19734) OneHotEncoder __init__ uses dropLast but doc strings all say includeFirst

2017-02-24 Thread Corey (JIRA)
Corey created SPARK-19734: - Summary: OneHotEncoder __init__ uses dropLast but doc strings all say includeFirst Key: SPARK-19734 URL: https://issues.apache.org/jira/browse/SPARK-19734 Project: Spark

[jira] [Commented] (SPARK-19733) ALS performs unnecessary casting on item and user ids

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883800#comment-15883800 ] Apache Spark commented on SPARK-19733: -- User 'datumbox' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19733) ALS performs unnecessary casting on item and user ids

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19733: Assignee: (was: Apache Spark) > ALS performs unnecessary casting on item and user ids

[jira] [Assigned] (SPARK-19733) ALS performs unnecessary casting on item and user ids

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19733: Assignee: Apache Spark > ALS performs unnecessary casting on item and user ids >

[jira] [Created] (SPARK-19733) ALS performs unnecessary casting on item and user ids

2017-02-24 Thread Vasilis Vryniotis (JIRA)
Vasilis Vryniotis created SPARK-19733: - Summary: ALS performs unnecessary casting on item and user ids Key: SPARK-19733 URL: https://issues.apache.org/jira/browse/SPARK-19733 Project: Spark

[jira] [Assigned] (SPARK-15355) Pro-active block replenishment in case of node/executor failures

2017-02-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-15355: --- Assignee: Shubham Chopra > Pro-active block replenishment in case of node/executor failures

[jira] [Resolved] (SPARK-15355) Pro-active block replenishment in case of node/executor failures

2017-02-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-15355. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 14412

[jira] [Assigned] (SPARK-13330) PYTHONHASHSEED is not propgated to python worker

2017-02-24 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-13330: --- Assignee: Jeff Zhang > PYTHONHASHSEED is not propgated to python worker >

[jira] [Resolved] (SPARK-13330) PYTHONHASHSEED is not propgated to python worker

2017-02-24 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-13330. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 11211

[jira] [Commented] (SPARK-13947) PySpark DataFrames: The error message from using an invalid table reference is not clear

2017-02-24 Thread Ruben Berenguel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883688#comment-15883688 ] Ruben Berenguel commented on SPARK-13947: - I'll give a shot to this one as a first dive into the

[jira] [Updated] (SPARK-14561) History Server does not see new logs in S3

2017-02-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-14561: --- Component/s: Spark Core > History Server does not see new logs in S3 >

[jira] [Commented] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883642#comment-15883642 ] Michael Armbrust commented on SPARK-19715: -- This isn't a hypothetical. A user of structured

[jira] [Commented] (SPARK-14501) spark.ml parity for fpm - frequent items

2017-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883542#comment-15883542 ] Joseph K. Bradley commented on SPARK-14501: --- I set the target for Scala to 2.2. Not sure if

[jira] [Updated] (SPARK-14501) spark.ml parity for fpm - frequent items

2017-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14501: -- Target Version/s: (was: 2.2.0) > spark.ml parity for fpm - frequent items >

[jira] [Assigned] (SPARK-14503) spark.ml Scala API for FPGrowth

2017-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-14503: - Assignee: yuhao yang > spark.ml Scala API for FPGrowth >

[jira] [Updated] (SPARK-14503) spark.ml Scala API for FPGrowth

2017-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14503: -- Target Version/s: 2.2.0 > spark.ml Scala API for FPGrowth >

[jira] [Updated] (SPARK-14503) spark.ml Scala API for FPGrowth

2017-02-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14503: -- Shepherd: Joseph K. Bradley (was: Nick Pentreath) > spark.ml Scala API for FPGrowth >

[jira] [Comment Edited] (SPARK-19732) DataFrame.fillna() does not work for bools in PySpark

2017-02-24 Thread Len Frodgers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883522#comment-15883522 ] Len Frodgers edited comment on SPARK-19732 at 2/24/17 9:12 PM: --- Actually

[jira] [Commented] (SPARK-19732) DataFrame.fillna() does not work for bools in PySpark

2017-02-24 Thread Len Frodgers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883522#comment-15883522 ] Len Frodgers commented on SPARK-19732: -- Actually there's another anomaly: Spark (and pyspark)

[jira] [Resolved] (SPARK-19597) ExecutorSuite should have test for tasks that are not deserialiazable

2017-02-24 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-19597. Resolution: Fixed Fix Version/s: 2.2.0 > ExecutorSuite should have test for tasks

[jira] [Updated] (SPARK-19732) DataFrame.fillna() does not work for bools in PySpark

2017-02-24 Thread Len Frodgers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Len Frodgers updated SPARK-19732: - Description: In PySpark, the fillna function of DataFrame inadvertently casts bools to ints, so

[jira] [Created] (SPARK-19732) DataFrame.fillna() does not work for bools in PySpark

2017-02-24 Thread Len Frodgers (JIRA)
Len Frodgers created SPARK-19732: Summary: DataFrame.fillna() does not work for bools in PySpark Key: SPARK-19732 URL: https://issues.apache.org/jira/browse/SPARK-19732 Project: Spark Issue

[jira] [Created] (SPARK-19731) IN Operator should support arrays

2017-02-24 Thread Shawn Lavelle (JIRA)
Shawn Lavelle created SPARK-19731: - Summary: IN Operator should support arrays Key: SPARK-19731 URL: https://issues.apache.org/jira/browse/SPARK-19731 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2017-02-24 Thread Roberto Mirizzi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883431#comment-15883431 ] Roberto Mirizzi commented on SPARK-14409: - [~mlnick] my implementation was conceptually close to

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2017-02-24 Thread Yong Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883417#comment-15883417 ] Yong Tang commented on SPARK-14409: --- Thanks [~mlnick] for the reminder. I will take a look and update

[jira] [Closed] (SPARK-4681) Turn on executor level blacklisting by default

2017-02-24 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout closed SPARK-4681. - Resolution: Duplicate This was for the old blacklisting mechanism. The linked JIRAs introduce a

[jira] [Updated] (SPARK-19730) Predicate Subqueries do not push results of subqueries to data source

2017-02-24 Thread Shawn Lavelle (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Lavelle updated SPARK-19730: -- Description: When a SparkSQL query contains a subquery in the where clause, such as a

[jira] [Closed] (SPARK-19560) Improve tests for when DAGScheduler learns of "successful" ShuffleMapTask from a failed executor

2017-02-24 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout closed SPARK-19560. -- Resolution: Fixed Target Version/s: 2.2.0 > Improve tests for when DAGScheduler

[jira] [Updated] (SPARK-19730) Predicate Subqueries do not push results of subqueries to data source

2017-02-24 Thread Shawn Lavelle (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Lavelle updated SPARK-19730: -- Description: When a SparkSQL query contains a subquery in the where clause, such as a

[jira] [Updated] (SPARK-19730) Predicate Subqueries do not push results of subqueries to data source

2017-02-24 Thread Shawn Lavelle (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Lavelle updated SPARK-19730: -- Description: When a SparkSQL query contains a subquery in the where clause, such as a

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883394#comment-15883394 ] Reynold Xin commented on SPARK-17495: - Let me put some thoughts here. Please let me know if I

[jira] [Updated] (SPARK-19730) Predicate Subqueries do not push results of subqueries to data source

2017-02-24 Thread Shawn Lavelle (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Lavelle updated SPARK-19730: -- Description: When a SparkSQL query contains a subquery in the where clause, such as a

[jira] [Updated] (SPARK-19730) Predicate Subqueries do not push results of subqueries to data source

2017-02-24 Thread Shawn Lavelle (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Lavelle updated SPARK-19730: -- Description: When a SparkSQL query contains a subquery in the where clause, such as a

[jira] [Updated] (SPARK-19730) Predicate Subqueries do not push results of subqueries to data source

2017-02-24 Thread Shawn Lavelle (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Lavelle updated SPARK-19730: -- Description: When a SparkSQL query contains a subquery in the where clause, such as a

[jira] [Updated] (SPARK-19572) Allow to disable hive in sparkR shell

2017-02-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-19572: - Target Version/s: (was: 2.1.1) > Allow to disable hive in sparkR shell >

[jira] [Commented] (SPARK-19711) Bug in gapply function

2017-02-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883337#comment-15883337 ] Felix Cheung commented on SPARK-19711: -- Thanks, I'll look into this shortly. > Bug in gapply

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-24 Thread Charles Allen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883331#comment-15883331 ] Charles Allen commented on SPARK-19698: --- [~mridulm80] is there documentation somewhere that

[jira] [Resolved] (SPARK-2336) Approximate k-NN Models for MLLib

2017-02-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2336. -- Resolution: Won't Fix > Approximate k-NN Models for MLLib

[jira] [Assigned] (SPARK-17078) show estimated stats when doing explain

2017-02-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-17078: --- Assignee: Zhenhua Wang > show estimated stats when doing explain >

[jira] [Resolved] (SPARK-17078) show estimated stats when doing explain

2017-02-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17078. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16594

[jira] [Updated] (SPARK-19730) Predicate Subqueries do not push results of subqueries to data source

2017-02-24 Thread Shawn Lavelle (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Lavelle updated SPARK-19730: -- Description: When a SparkSQL query contains a subquery in the where clause, such as a

[jira] [Updated] (SPARK-19730) Predicate Subqueries do not push results of subqueries to data source

2017-02-24 Thread Shawn Lavelle (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Lavelle updated SPARK-19730: -- Description: When a SparkSQL query contains a subquery in the where clause, such as a

[jira] [Created] (SPARK-19730) Predicate Subqueries do not push results of subqueries to data source

2017-02-24 Thread Shawn Lavelle (JIRA)
Shawn Lavelle created SPARK-19730: - Summary: Predicate Subqueries do not push results of subqueries to data source Key: SPARK-19730 URL: https://issues.apache.org/jira/browse/SPARK-19730 Project:

[jira] [Comment Edited] (SPARK-17495) Hive hash implementation

2017-02-24 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883203#comment-15883203 ] Tejas Patil edited comment on SPARK-17495 at 2/24/17 5:57 PM: -- [~rxin] : No

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-24 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883203#comment-15883203 ] Tejas Patil commented on SPARK-17495: - [~rxin] : No probs. Any opinion about my comment from

[jira] [Created] (SPARK-19729) Strange behaviour with reading csv with schema into dataframe

2017-02-24 Thread Mazen Melouk (JIRA)
Mazen Melouk created SPARK-19729: Summary: Strange behaviour with reading csv with schema into dataframe Key: SPARK-19729 URL: https://issues.apache.org/jira/browse/SPARK-19729 Project: Spark

[jira] [Assigned] (SPARK-17495) Hive hash implementation

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17495: Assignee: Apache Spark (was: Tejas Patil) > Hive hash implementation >

[jira] [Assigned] (SPARK-17495) Hive hash implementation

2017-02-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17495: Assignee: Tejas Patil (was: Apache Spark) > Hive hash implementation >

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883189#comment-15883189 ] Reynold Xin commented on SPARK-17495: - Ah yes. I kept doing it ... :) > Hive hash implementation >

[jira] [Reopened] (SPARK-17495) Hive hash implementation

2017-02-24 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil reopened SPARK-17495: - Re-opening. This is not done yet as there are a few datatypes that need to be handled and making

[jira] [Resolved] (SPARK-17495) Hive hash implementation

2017-02-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17495. - Resolution: Fixed Fix Version/s: 2.2.0 > Hive hash implementation >

[jira] [Commented] (SPARK-19351) Support for obtaining file splits from underlying InputFormat

2017-02-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883177#comment-15883177 ] Reynold Xin commented on SPARK-19351: - Approach 1 should be supported today. I actually think our

[jira] [Resolved] (SPARK-19038) Can't find keytab file when using Hive catalog

2017-02-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19038. Resolution: Fixed Assignee: Saisai Shao Fix Version/s: 2.2.0

[jira] [Resolved] (SPARK-19707) Improve the invalid path check for sc.addJar

2017-02-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19707. Resolution: Fixed Assignee: Saisai Shao Fix Version/s: 2.2.0

[jira] [Comment Edited] (SPARK-19714) Bucketizer Bug Regarding Handling Unbucketed Inputs

2017-02-24 Thread Bill Chambers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883125#comment-15883125 ] Bill Chambers edited comment on SPARK-19714 at 2/24/17 5:15 PM: The thing

[jira] [Commented] (SPARK-19714) Bucketizer Bug Regarding Handling Unbucketed Inputs

2017-02-24 Thread Bill Chambers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883125#comment-15883125 ] Bill Chambers commented on SPARK-19714: --- The thing is QuantileDiscretizer and Bucketizer do

[jira] [Commented] (SPARK-15678) Not use cache on appends and overwrites

2017-02-24 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883035#comment-15883035 ] Kazuaki Ishizaki commented on SPARK-15678: -- Sorry for being late to reply. According to the

[jira] [Commented] (SPARK-19161) Improving UDF Docstrings

2017-02-24 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883013#comment-15883013 ] holdenk commented on SPARK-19161: - Thanks for working on this [~zero323], having better docs for UDFs

[jira] [Resolved] (SPARK-19161) Improving UDF Docstrings

2017-02-24 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-19161. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16534

[jira] [Assigned] (SPARK-19161) Improving UDF Docstrings

2017-02-24 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-19161: --- Assignee: Maciej Szymkiewicz > Improving UDF Docstrings > > >

[jira] [Commented] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882995#comment-15882995 ] Steve Loughran commented on SPARK-19715: This is a silly question, but has the situation " a

[jira] [Updated] (SPARK-19728) PythonUDF with multiple parents shouldn't be pushed down when used as a predicate

2017-02-24 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz updated SPARK-19728: --- Summary: PythonUDF with multiple parents shouldn't be pushed down when used as a

[jira] [Created] (SPARK-19728) PythonUDF with multiple parents shouldn't be pushed down when used as a predicat

2017-02-24 Thread Maciej Szymkiewicz (JIRA)
Maciej Szymkiewicz created SPARK-19728: -- Summary: PythonUDF with multiple parents shouldn't be pushed down when used as a predicat Key: SPARK-19728 URL: https://issues.apache.org/jira/browse/SPARK-19728

[jira] [Commented] (SPARK-17147) Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2017-02-24 Thread Dean Wampler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882778#comment-15882778 ] Dean Wampler commented on SPARK-17147: -- We're interested in this enhancement. Anyone know if and one

[jira] [Updated] (SPARK-19724) create a managed table with an existed default location should throw an exception

2017-02-24 Thread Song Jun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Song Jun updated SPARK-19724: - Summary: create a managed table with an existed default location should throw an exception (was: create

[jira] [Updated] (SPARK-19724) create managed table for hive tables with an existed default location should throw an exception

2017-02-24 Thread Song Jun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Song Jun updated SPARK-19724: - Description: This JIRA is a follow up work after

[jira] [Comment Edited] (SPARK-19711) Bug in gapply function

2017-02-24 Thread Luis Felipe Sant Ana (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882475#comment-15882475 ] Luis Felipe Sant Ana edited comment on SPARK-19711 at 2/24/17 11:24 AM:

[jira] [Comment Edited] (SPARK-19711) Bug in gapply function

2017-02-24 Thread Luis Felipe Sant Ana (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882475#comment-15882475 ] Luis Felipe Sant Ana edited comment on SPARK-19711 at 2/24/17 11:23 AM:

[jira] [Commented] (SPARK-19711) Bug in gapply function

2017-02-24 Thread Luis Felipe Sant Ana (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882475#comment-15882475 ] Luis Felipe Sant Ana commented on SPARK-19711: -- The problem seems to be in using the string
