[jira] [Commented] (SPARK-27281) Wrong latest offsets returned by DirectKafkaInputDStream#latestOffsets

2021-01-30 Thread SeaAndHill (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275799#comment-17275799 ] SeaAndHill commented on SPARK-27281: [~yuanyuan.xia] did you fix it? > Wrong latest offsets

[jira] [Updated] (SPARK-34283) Combines all adjacent 'Union' operators into a single 'Union' when using 'Dataset.union.distinct'

2021-01-30 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-34283: Affects Version/s: (was: 2.4.7) (was: 3.0.1)

[jira] [Updated] (SPARK-34297) Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-34297: Component/s: SQL > Add metrics for data loss and offset out range for KafkaMicroBatchStream >

[jira] [Commented] (SPARK-34184) Remove redundant repartition nodes in the optimizer

2021-01-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275789#comment-17275789 ] Jungtaek Lim commented on SPARK-34184: -- Tests need to be restored once SPARK-34255 is merged:

[jira] [Commented] (SPARK-29220) Flaky test: org.apache.spark.deploy.yarn.LocalityPlacementStrategySuite.handle large number of containers and tasks (SPARK-18750) [hadoop-3.2][java11]

2021-01-30 Thread Attila Zsolt Piros (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275786#comment-17275786 ] Attila Zsolt Piros commented on SPARK-29220: I think we can close this as with

[jira] [Commented] (SPARK-34280) Avoid migrating un-needed shuffle files

2021-01-30 Thread Attila Zsolt Piros (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275783#comment-17275783 ] Attila Zsolt Piros commented on SPARK-34280: It would be interesting to know whether the

[jira] [Commented] (SPARK-34277) Unexpected outcome from Python UDF where its return type is MapType

2021-01-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275782#comment-17275782 ] Hyukjin Kwon commented on SPARK-34277: -- To keep the order, it is better to use an explicit array

[jira] [Resolved] (SPARK-34277) Unexpected outcome from Python UDF where its return type is MapType

2021-01-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34277. -- Resolution: Not A Problem > Unexpected outcome from Python UDF where its return type is

[jira] [Commented] (SPARK-34277) Unexpected outcome from Python UDF where its return type is MapType

2021-01-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275781#comment-17275781 ] Hyukjin Kwon commented on SPARK-34277: -- Because maps in Spark do not guarantee the order. >

[jira] [Commented] (SPARK-34254) Document CREATE EXTERNAL datasource TABLE

2021-01-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275779#comment-17275779 ] Hyukjin Kwon commented on SPARK-34254: -- Ah, that's fine [~huaxingao]. [~cloud_fan] do you mind

[jira] [Commented] (SPARK-34285) Implement Parquet StringEndsWith、StringContains Filter

2021-01-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275778#comment-17275778 ] Hyukjin Kwon commented on SPARK-34285: -- Agree with [~attilapiros]. Parquet also supports

[jira] [Resolved] (SPARK-34285) Implement Parquet StringEndsWith、StringContains Filter

2021-01-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34285. -- Resolution: Won't Fix > Implement Parquet StringEndsWith、StringContains Filter >

[jira] [Commented] (SPARK-34292) NOW is interpreted as the NOW SQL function

2021-01-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275777#comment-17275777 ] Hyukjin Kwon commented on SPARK-34292: -- Is this a duplicate of SPARK-34259? > NOW is interpreted

[jira] [Resolved] (SPARK-34292) NOW is interpreted as the NOW SQL function

2021-01-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34292. -- Resolution: Duplicate > NOW is interpreted as the NOW SQL function >

[jira] [Comment Edited] (SPARK-34285) Implement Parquet StringEndsWith、StringContains Filter

2021-01-30 Thread Attila Zsolt Piros (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17274458#comment-17274458 ] Attila Zsolt Piros edited comment on SPARK-34285 at 1/31/21, 4:50 AM:

[jira] [Commented] (SPARK-27281) Wrong latest offsets returned by DirectKafkaInputDStream#latestOffsets

2021-01-30 Thread SeaAndHill (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275775#comment-17275775 ] SeaAndHill commented on SPARK-27281: I ran into the same problem > Wrong latest offsets returned by

[jira] [Resolved] (SPARK-34299) Clean up ResolveSessionCatalog

2021-01-30 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34299. -- Fix Version/s: 3.2.0 Resolution: Fixed Fixed in

[jira] [Updated] (SPARK-34137) The tree string does not contain statistics for nested scalar sub queries

2021-01-30 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-34137: Parent: (was: SPARK-34120) Issue Type: Bug (was: Sub-task) > The tree string does

[jira] [Issue Comment Deleted] (SPARK-33979) Filter predicate reorder

2021-01-30 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-33979: Comment: was deleted (was: I'm working on.) > Filter predicate reorder >

[jira] [Updated] (SPARK-33709) Refactor FileTable

2021-01-30 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-33709: Parent: (was: SPARK-26345) Issue Type: Task (was: Sub-task) > Refactor FileTable >

[jira] [Updated] (SPARK-27733) Upgrade to Avro 1.10.1

2021-01-30 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-27733: Parent: SPARK-26345 Issue Type: Sub-task (was: Improvement) > Upgrade to Avro 1.10.1 >

[jira] [Updated] (SPARK-26345) Parquet support Column indexes

2021-01-30 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-26345: Description: Parquet 1.11 supports column indexing. Spark can support this feature for better

[jira] [Resolved] (SPARK-26345) Parquet support Column indexes

2021-01-30 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-26345. - Fix Version/s: 3.2.0 Assignee: Yuming Wang Resolution: Fixed > Parquet support

[jira] [Updated] (SPARK-26345) Parquet support Column indexes

2021-01-30 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-26345: Description: Parquet 1.11 supports column indexing. Spark can support this feature for better

[jira] [Updated] (SPARK-26346) Upgrade parquet to 1.11.1

2021-01-30 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-26346: Description: Parquet 1.11 new features: PARQUET-1201 - Column indexes PARQUET-1253 - Support

[jira] [Commented] (SPARK-34300) Fix of typos in documentation of pyspark.sql.functions and output of lint-python

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275742#comment-17275742 ] Apache Spark commented on SPARK-34300: -- User 'DavidToneian' has created a pull request for this

[jira] [Commented] (SPARK-34300) Fix of typos in documentation of pyspark.sql.functions and output of lint-python

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275741#comment-17275741 ] Apache Spark commented on SPARK-34300: -- User 'DavidToneian' has created a pull request for this

[jira] [Assigned] (SPARK-34300) Fix of typos in documentation of pyspark.sql.functions and output of lint-python

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34300: Assignee: (was: Apache Spark) > Fix of typos in documentation of

[jira] [Assigned] (SPARK-34300) Fix of typos in documentation of pyspark.sql.functions and output of lint-python

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34300: Assignee: Apache Spark > Fix of typos in documentation of pyspark.sql.functions and

[jira] [Created] (SPARK-34300) Fix of typos in documentation of pyspark.sql.functions and output of lint-python

2021-01-30 Thread David Toneian (Jira)
David Toneian created SPARK-34300: - Summary: Fix of typos in documentation of pyspark.sql.functions and output of lint-python Key: SPARK-34300 URL: https://issues.apache.org/jira/browse/SPARK-34300

[jira] [Commented] (SPARK-34299) Clean up ResolveSessionCatalog

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275740#comment-17275740 ] Apache Spark commented on SPARK-34299: -- User 'imback82' has created a pull request for this issue:

[jira] [Assigned] (SPARK-34299) Clean up ResolveSessionCatalog

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34299: Assignee: (was: Apache Spark) > Clean up ResolveSessionCatalog >

[jira] [Assigned] (SPARK-34299) Clean up ResolveSessionCatalog

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34299: Assignee: Apache Spark > Clean up ResolveSessionCatalog > --

[jira] [Commented] (SPARK-34299) Clean up ResolveSessionCatalog

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275739#comment-17275739 ] Apache Spark commented on SPARK-34299: -- User 'imback82' has created a pull request for this issue:

[jira] [Created] (SPARK-34299) Clean up ResolveSessionCatalog

2021-01-30 Thread Terry Kim (Jira)
Terry Kim created SPARK-34299: - Summary: Clean up ResolveSessionCatalog Key: SPARK-34299 URL: https://issues.apache.org/jira/browse/SPARK-34299 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-34269) simplify view resolution

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275701#comment-17275701 ] Apache Spark commented on SPARK-34269: -- User 'imback82' has created a pull request for this issue:

[jira] [Commented] (SPARK-34269) simplify view resolution

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275700#comment-17275700 ] Apache Spark commented on SPARK-34269: -- User 'imback82' has created a pull request for this issue:

[jira] [Assigned] (SPARK-34259) Reading a partitioned dataset with a partition value of NOW causes the value to be parsed as a timestamp.

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34259: Assignee: (was: Apache Spark) > Reading a partitioned dataset with a partition value

[jira] [Commented] (SPARK-34259) Reading a partitioned dataset with a partition value of NOW causes the value to be parsed as a timestamp.

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275666#comment-17275666 ] Apache Spark commented on SPARK-34259: -- User 'd80tb7' has created a pull request for this issue:

[jira] [Assigned] (SPARK-34259) Reading a partitioned dataset with a partition value of NOW causes the value to be parsed as a timestamp.

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34259: Assignee: Apache Spark > Reading a partitioned dataset with a partition value of NOW

[jira] [Created] (SPARK-34298) SaveMode.Overwrite not usable when using s3a root paths

2021-01-30 Thread cornel creanga (Jira)
cornel creanga created SPARK-34298: -- Summary: SaveMode.Overwrite not usable when using s3a root paths Key: SPARK-34298 URL: https://issues.apache.org/jira/browse/SPARK-34298 Project: Spark

[jira] [Commented] (SPARK-34297) Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275536#comment-17275536 ] Apache Spark commented on SPARK-34297: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-34297) Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34297: Assignee: L. C. Hsieh (was: Apache Spark) > Add metrics for data loss and offset out

[jira] [Assigned] (SPARK-34297) Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34297: Assignee: Apache Spark (was: L. C. Hsieh) > Add metrics for data loss and offset out

[jira] [Commented] (SPARK-34297) Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275535#comment-17275535 ] Apache Spark commented on SPARK-34297: -- User 'viirya' has created a pull request for this issue: