[jira] [Commented] (SPARK-24434) Support user-specified driver and executor pod templates

2018-08-27 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594046#comment-16594046 ] Henry Robinson commented on SPARK-24434: Yeah, assignees are set after the PR is merged. I think

[jira] [Commented] (SPARK-24432) Add support for dynamic resource allocation

2018-06-18 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516305#comment-16516305 ] Henry Robinson commented on SPARK-24432: I'm really interested in this feature. What's the

[jira] [Commented] (SPARK-24374) SPIP: Support Barrier Scheduling in Apache Spark

2018-06-01 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498465#comment-16498465 ] Henry Robinson commented on SPARK-24374: The use case in the SPIP isn't 100% convincing. I'm

[jira] [Created] (SPARK-24393) SQL builtin: isinf

2018-05-25 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-24393: -- Summary: SQL builtin: isinf Key: SPARK-24393 URL: https://issues.apache.org/jira/browse/SPARK-24393 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-24254) Eagerly evaluate some subqueries over LocalRelation

2018-05-11 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-24254: -- Summary: Eagerly evaluate some subqueries over LocalRelation Key: SPARK-24254 URL: https://issues.apache.org/jira/browse/SPARK-24254 Project: Spark

[jira] [Commented] (SPARK-11150) Dynamic partition pruning

2018-05-10 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471311#comment-16471311 ] Henry Robinson commented on SPARK-11150: The title of this JIRA is 'dynamic partition pruning',

[jira] [Updated] (SPARK-23940) High-order function: transform_values(map<K, V1>, function<K, V1, V2>) → map<K, V2>

2018-04-30 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson updated SPARK-23940: --- Summary: High-order function: transform_values(map<K, V1>, function<K, V1, V2>) → map<K, V2>

[jira] [Created] (SPARK-24128) Mention spark.sql.crossJoin.enabled in implicit cartesian product error msg

2018-04-30 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-24128: -- Summary: Mention spark.sql.crossJoin.enabled in implicit cartesian product error msg Key: SPARK-24128 URL: https://issues.apache.org/jira/browse/SPARK-24128

[jira] [Created] (SPARK-24125) Add quoting rules to SQL guide

2018-04-30 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-24125: -- Summary: Add quoting rules to SQL guide Key: SPARK-24125 URL: https://issues.apache.org/jira/browse/SPARK-24125 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-24 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451263#comment-16451263 ] Henry Robinson commented on SPARK-23852: Yes it has - the Parquet community are going to do a

[jira] [Commented] (SPARK-24020) Sort-merge join inner range optimization

2018-04-18 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443392#comment-16443392 ] Henry Robinson commented on SPARK-24020: This sounds like a 'band join' (e.g.

[jira] [Created] (SPARK-23973) Remove consecutive sorts

2018-04-12 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-23973: -- Summary: Remove consecutive sorts Key: SPARK-23973 URL: https://issues.apache.org/jira/browse/SPARK-23973 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-23972) Upgrade to Parquet 1.10

2018-04-12 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-23972: -- Summary: Upgrade to Parquet 1.10 Key: SPARK-23972 URL: https://issues.apache.org/jira/browse/SPARK-23972 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-12 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436292#comment-16436292 ] Henry Robinson commented on SPARK-23852: [Here's a

[jira] [Created] (SPARK-23957) Sorts in subqueries are redundant and can be removed

2018-04-10 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-23957: -- Summary: Sorts in subqueries are redundant and can be removed Key: SPARK-23957 URL: https://issues.apache.org/jira/browse/SPARK-23957 Project: Spark

[jira] [Commented] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-02 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423491#comment-16423491 ] Henry Robinson commented on SPARK-23852: Partly, but not completely. If the column is dictionary

[jira] [Updated] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-02 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson updated SPARK-23852: --- Description: Parquet MR 1.9.0 and 1.8.2 both have a bug, PARQUET-1217, that means that

[jira] [Created] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-02 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-23852: -- Summary: Parquet MR bug can lead to incorrect SQL results Key: SPARK-23852 URL: https://issues.apache.org/jira/browse/SPARK-23852 Project: Spark Issue

[jira] [Commented] (SPARK-23576) SparkSQL - Decimal data missing decimal point

2018-03-12 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16395803#comment-16395803 ] Henry Robinson commented on SPARK-23576: Do you have a smaller repro, or does it only reproduce

[jira] [Created] (SPARK-23634) AttributeReferences may be too conservative wrt nullability after optimization

2018-03-08 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-23634: -- Summary: AttributeReferences may be too conservative wrt nullability after optimization Key: SPARK-23634 URL: https://issues.apache.org/jira/browse/SPARK-23634

[jira] [Created] (SPARK-23606) Flakey FileBasedDataSourceSuite

2018-03-05 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-23606: -- Summary: Flakey FileBasedDataSourceSuite Key: SPARK-23606 URL: https://issues.apache.org/jira/browse/SPARK-23606 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-23604) ParquetInteroperabilityTest timestamp test should use Statistics.hasNonNullValue

2018-03-05 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson updated SPARK-23604: --- Description: We ran into an issue with a downstream build of Spark running against a custom

[jira] [Created] (SPARK-23604) ParquetInteroperabilityTest timestamp test should use Statistics.hasNonNullValue

2018-03-05 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-23604: -- Summary: ParquetInteroperabilityTest timestamp test should use Statistics.hasNonNullValue Key: SPARK-23604 URL: https://issues.apache.org/jira/browse/SPARK-23604

[jira] [Commented] (SPARK-23500) Filters on named_structs could be pushed into scans

2018-02-27 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379201#comment-16379201 ] Henry Robinson commented on SPARK-23500: Ok, I figured it out! {{SimplifyCreateStructOps}} does

[jira] [Comment Edited] (SPARK-23500) Filters on named_structs could be pushed into scans

2018-02-26 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377802#comment-16377802 ] Henry Robinson edited comment on SPARK-23500 at 2/27/18 7:08 AM: - There's

[jira] [Comment Edited] (SPARK-23500) Filters on named_structs could be pushed into scans

2018-02-26 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377802#comment-16377802 ] Henry Robinson edited comment on SPARK-23500 at 2/26/18 11:51 PM: --

[jira] [Commented] (SPARK-23500) Filters on named_structs could be pushed into scans

2018-02-26 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377802#comment-16377802 ] Henry Robinson commented on SPARK-23500: There's an optimizer rule, {{SimplifyCreateStructOps}},

[jira] [Updated] (SPARK-23500) Filters on named_structs could be pushed into scans

2018-02-23 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson updated SPARK-23500: --- Description: Simple filters on dataframes joined with {{joinWith()}} are missing an

[jira] [Created] (SPARK-23500) Filters on named_structs could be pushed into scans

2018-02-23 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-23500: -- Summary: Filters on named_structs could be pushed into scans Key: SPARK-23500 URL: https://issues.apache.org/jira/browse/SPARK-23500 Project: Spark

[jira] [Commented] (SPARK-23157) withColumn fails for a column that is a result of mapped DataSet

2018-01-29 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343962#comment-16343962 ] Henry Robinson commented on SPARK-23157: [~kretes] - I can see an argument for the behaviour

[jira] [Commented] (SPARK-23157) withColumn fails for a column that is a result of mapped DataSet

2018-01-25 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340412#comment-16340412 ] Henry Robinson commented on SPARK-23157: I'm not sure if this should actually be expected to

[jira] [Commented] (SPARK-23148) spark.read.csv with multiline=true gives FileNotFoundException if path contains spaces

2018-01-19 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332999#comment-16332999 ] Henry Robinson commented on SPARK-23148: It seems like the problem is that

[jira] [Comment Edited] (SPARK-23148) spark.read.csv with multiline=true gives FileNotFoundException if path contains spaces

2018-01-19 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332999#comment-16332999 ] Henry Robinson edited comment on SPARK-23148 at 1/19/18 11:25 PM: -- It

[jira] [Created] (SPARK-23062) EXCEPT documentation should make it clear that it's EXCEPT DISTINCT

2018-01-12 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-23062: -- Summary: EXCEPT documentation should make it clear that it's EXCEPT DISTINCT Key: SPARK-23062 URL: https://issues.apache.org/jira/browse/SPARK-23062 Project:

[jira] [Created] (SPARK-22736) Consider caching decoded dictionaries in VectorizedColumnReader

2017-12-07 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-22736: -- Summary: Consider caching decoded dictionaries in VectorizedColumnReader Key: SPARK-22736 URL: https://issues.apache.org/jira/browse/SPARK-22736 Project: Spark

[jira] [Commented] (SPARK-22211) LimitPushDown optimization for FullOuterJoin generates wrong results

2017-11-03 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238281#comment-16238281 ] Henry Robinson commented on SPARK-22211: Sounds good, thanks both. > LimitPushDown optimization

[jira] [Commented] (SPARK-22211) LimitPushDown optimization for FullOuterJoin generates wrong results

2017-11-03 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237778#comment-16237778 ] Henry Robinson commented on SPARK-22211: [~smilegator] - sounds good! What will your approach be?

[jira] [Comment Edited] (SPARK-22211) LimitPushDown optimization for FullOuterJoin generates wrong results

2017-10-30 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225816#comment-16225816 ] Henry Robinson edited comment on SPARK-22211 at 10/30/17 9:59 PM: --

[jira] [Commented] (SPARK-22211) LimitPushDown optimization for FullOuterJoin generates wrong results

2017-10-30 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225816#comment-16225816 ] Henry Robinson commented on SPARK-22211: Thinking about it a bit more, I think the optimization

[jira] [Commented] (SPARK-22211) LimitPushDown optimization for FullOuterJoin generates wrong results

2017-10-27 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223073#comment-16223073 ] Henry Robinson commented on SPARK-22211: I think the optimization proposed works only for a