[jira] [Updated] (SPARK-47904) Preserve case in Avro schema when using enableStableIdentifiersForUnionType

2024-04-18 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-47904: - Description: When enableStableIdentifiersForUnionType is enabled, all of the types are

[jira] [Created] (SPARK-47904) Preserve case in Avro schema when using enableStableIdentifiersForUnionType

2024-04-18 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-47904: Summary: Preserve case in Avro schema when using enableStableIdentifiersForUnionType Key: SPARK-47904 URL: https://issues.apache.org/jira/browse/SPARK-47904 Project:

[jira] [Updated] (SPARK-47904) Preserve case in Avro schema when using enableStableIdentifiersForUnionType

2024-04-18 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-47904: - Description: When  > Preserve case in Avro schema when using

[jira] [Updated] (SPARK-47704) JSON parsing fails with "java.lang.ClassCastException: org.apache.spark.sql.catalyst.util.ArrayBasedMapData cannot be cast to org.apache.spark.sql.catalyst.util.ArrayDat

2024-04-03 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-47704: - Description: When reading the following JSON {"a":[{"key":{"b":0}}]}: {code:java} val df =

[jira] [Created] (SPARK-47704) JSON parsing fails with "java.lang.ClassCastException: org.apache.spark.sql.catalyst.util.ArrayBasedMapData cannot be cast to org.apache.spark.sql.catalyst.util.ArrayDat

2024-04-02 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-47704: Summary: JSON parsing fails with "java.lang.ClassCastException: org.apache.spark.sql.catalyst.util.ArrayBasedMapData cannot be cast to org.apache.spark.sql.catalyst.util.ArrayData" when

[jira] [Updated] (SPARK-47704) JSON parsing fails with "java.lang.ClassCastException: org.apache.spark.sql.catalyst.util.ArrayBasedMapData cannot be cast to org.apache.spark.sql.catalyst.util.ArrayDat

2024-04-02 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-47704: - Description: TODO: Please don't close the ticket, I will fill in the description. > JSON

[jira] [Commented] (SPARK-46990) Regression: Unable to load empty avro files emitted by event-hubs

2024-03-18 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828143#comment-17828143 ] Ivan Sadikov commented on SPARK-46990: -- Opened PR https://github.com/apache/spark/pull/45578. >

[jira] [Commented] (SPARK-46990) Regression: Unable to load empty avro files emitted by event-hubs

2024-03-18 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828107#comment-17828107 ] Ivan Sadikov commented on SPARK-46990: -- Thanks, Kamil. I am still debugging, will try to open a PR

[jira] [Commented] (SPARK-46990) Regression: Unable to load empty avro files emitted by event-hubs

2024-03-18 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828100#comment-17828100 ] Ivan Sadikov commented on SPARK-46990: -- Yes, sure. Thanks for reporting. I will take a look and

[jira] [Updated] (SPARK-46930) Add support for a custom prefix for fields of Avro union type when enableStableIdentifiersForUnionType is enabled

2024-01-30 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-46930: - Description: {{enableStableIdentifiersForUnionType}} allows to enable stable identifiers in

[jira] [Created] (SPARK-46930) Add support for a custom prefix for fields of Avro union type when enableStableIdentifiersForUnionType is enabled

2024-01-30 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-46930: Summary: Add support for a custom prefix for fields of Avro union type when enableStableIdentifiersForUnionType is enabled Key: SPARK-46930 URL:

[jira] [Updated] (SPARK-46633) Reading a non-empty Avro file with empty blocks returns 0 records

2024-01-08 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-46633: - Description: When an Avro file contains empty blocks, Spark returns 0 records while "fastavro"

[jira] [Created] (SPARK-46633) Reading a non-empty Avro file with empty blocks returns 0 records

2024-01-08 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-46633: Summary: Reading a non-empty Avro file with empty blocks returns 0 records Key: SPARK-46633 URL: https://issues.apache.org/jira/browse/SPARK-46633 Project: Spark

[jira] [Resolved] (SPARK-46482) Revert SPARK-43049 due to performance regression of using CLOB

2023-12-21 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov resolved SPARK-46482. -- Resolution: Duplicate > Revert SPARK-43049 due to performance regression of using CLOB >

[jira] [Updated] (SPARK-46482) Revert SPARK-43049 due to performance regression of using CLOB

2023-12-21 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-46482: - Affects Version/s: (was: 3.4.1) > Revert SPARK-43049 due to performance regression of using

[jira] [Updated] (SPARK-46482) Revert SPARK-43049 due to performance regression of using CLOB

2023-12-21 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-46482: - Description: SPARK-43049 causes performance regression when writing string fields to an Oracle

[jira] [Updated] (SPARK-46482) Revert SPARK-43049 due to performance regression

2023-12-21 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-46482: - Description: SPARK-43049 causes performance regression when writing string fields to an Oracle

[jira] [Updated] (SPARK-46482) Revert SPARK-43049 due to performance regression of using CLOB

2023-12-21 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-46482: - Summary: Revert SPARK-43049 due to performance regression of using CLOB (was: Revert

[jira] [Updated] (SPARK-45194) Parquet reads fail with "RuntimeException: Unable to create Parquet converter for data type "timestamp_ntz" due to incorrect schema inference

2023-09-17 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-45194: - Description: I found that Parquet reads could fail due to incorrect schema inference with two

[jira] [Updated] (SPARK-45194) Parquet reads fail with "RuntimeException: Unable to create Parquet converter for data type "timestamp_ntz" due to incorrect schema inference

2023-09-17 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-45194: - Description: I found that Parquet reads could fail due to incorrect schema inference with two

[jira] [Commented] (SPARK-45194) Parquet reads fail with "RuntimeException: Unable to create Parquet converter for data type "timestamp_ntz" due to incorrect schema inference

2023-09-17 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766198#comment-17766198 ] Ivan Sadikov commented on SPARK-45194: -- cc [~gengliang] [~cloud_fan] > Parquet reads fail with

[jira] [Created] (SPARK-45194) Parquet reads fail with "RuntimeException: Unable to create Parquet converter for data type "timestamp_ntz" due to incorrect schema inference

2023-09-17 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-45194: Summary: Parquet reads fail with "RuntimeException: Unable to create Parquet converter for data type "timestamp_ntz" due to incorrect schema inference Key: SPARK-45194 URL:

[jira] [Created] (SPARK-45139) Add DatabricksDialect to handle SQL type conversion

2023-09-12 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-45139: Summary: Add DatabricksDialect to handle SQL type conversion Key: SPARK-45139 URL: https://issues.apache.org/jira/browse/SPARK-45139 Project: Spark Issue

[jira] [Updated] (SPARK-45139) Add DatabricksDialect to handle SQL type conversion

2023-09-12 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-45139: - Description: Databricks SQL dialect is needed to refine type conversion when connecting to a

[jira] [Commented] (SPARK-44940) Improve performance of JSON parsing when "spark.sql.json.enablePartialResults" is enabled

2023-08-24 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758792#comment-17758792 ] Ivan Sadikov commented on SPARK-44940: -- Opened https://github.com/apache/spark/pull/42667. >

[jira] [Updated] (SPARK-44940) Improve performance of JSON parsing when "spark.sql.json.enablePartialResults" is enabled

2023-08-24 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-44940: - Summary: Improve performance of JSON parsing when "spark.sql.json.enablePartialResults" is

[jira] [Commented] (SPARK-44940) Improve performance of JSON parsing when partial results are enabled

2023-08-24 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758394#comment-17758394 ] Ivan Sadikov commented on SPARK-44940: -- I have prototyped the fix and will open a PR shortly. >

[jira] [Updated] (SPARK-44940) Improve performance of JSON parsing when partial results are enabled

2023-08-24 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-44940: - Description: Follow-up on https://issues.apache.org/jira/browse/SPARK-40646. I found that JSON

[jira] [Created] (SPARK-44940) Improve performance of JSON parsing when partial results are enabled

2023-08-24 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-44940: Summary: Improve performance of JSON parsing when partial results are enabled Key: SPARK-44940 URL: https://issues.apache.org/jira/browse/SPARK-44940 Project: Spark

[jira] [Commented] (SPARK-42534) Fix DB2 Limit clause

2023-02-22 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692438#comment-17692438 ] Ivan Sadikov commented on SPARK-42534: -- I am going to open a PR to fix this. > Fix DB2 Limit

[jira] [Created] (SPARK-42534) Fix DB2 Limit clause

2023-02-22 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-42534: Summary: Fix DB2 Limit clause Key: SPARK-42534 URL: https://issues.apache.org/jira/browse/SPARK-42534 Project: Spark Issue Type: Bug Components:

[jira] [Updated] (SPARK-42469) Update MSSQL Dialect to use parentheses for TOP and add tests for Limit clause

2023-02-16 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-42469: - Summary: Update MSSQL Dialect to use parentheses for TOP and add tests for Limit clause (was:

[jira] [Updated] (SPARK-42469) Update MSSQL Dialect to use parentheses for TOP and add tests for Limit clause #40059

2023-02-16 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-42469: - Summary: Update MSSQL Dialect to use parentheses for TOP and add tests for Limit clause #40059

[jira] [Updated] (SPARK-42469) Small fix for MSSQL dialect + tests

2023-02-16 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-42469: - Description: Follow up for SPARK-42131. (was: Follow up for  h1. SPARK-42131.) > Small fix for

[jira] [Created] (SPARK-42469) Small fix for MSSQL dialect + tests

2023-02-16 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-42469: Summary: Small fix for MSSQL dialect + tests Key: SPARK-42469 URL: https://issues.apache.org/jira/browse/SPARK-42469 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-42176) Cast boolean to timestamp fails with ClassCastException

2023-01-24 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-42176: Summary: Cast boolean to timestamp fails with ClassCastException Key: SPARK-42176 URL: https://issues.apache.org/jira/browse/SPARK-42176 Project: Spark

[jira] [Updated] (SPARK-42176) Cast boolean to timestamp fails with ClassCastException

2023-01-24 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-42176: - Description: When casting a boolean value to timestamp, the following error is thrown:

[jira] [Created] (SPARK-42128) Limit pushdown for MS SQL Server

2023-01-19 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-42128: Summary: Limit pushdown for MS SQL Server Key: SPARK-42128 URL: https://issues.apache.org/jira/browse/SPARK-42128 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-41248) Add config flag to control before of JSON partial results parsing in SPARK-40646

2022-11-23 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-41248: - Description: This is a follow-up for https://issues.apache.org/jira/browse/SPARK-40646. It

[jira] [Updated] (SPARK-41248) Add config flag to control before of JSON partial results parsing in SPARK-40646

2022-11-23 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-41248: - Attachment: json-benchmark-without-SPARK-40646.log > Add config flag to control before of JSON

[jira] [Updated] (SPARK-41248) Add config flag to control before of JSON partial results parsing in SPARK-40646

2022-11-23 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-41248: - Attachment: without-SPARK-40646-commit.log > Add config flag to control before of JSON partial

[jira] [Updated] (SPARK-41248) Add config flag to control before of JSON partial results parsing in SPARK-40646

2022-11-23 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-41248: - Attachment: json-benchmark-with-SPARK-40646.log > Add config flag to control before of JSON

[jira] [Updated] (SPARK-41248) Add config flag to control before of JSON partial results parsing in SPARK-40646

2022-11-23 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-41248: - Attachment: (was: without-SPARK-40646-commit.log) > Add config flag to control before of

[jira] [Updated] (SPARK-41248) Add config flag to control before of JSON partial results parsing in SPARK-40646

2022-11-23 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-41248: - Description: This is a follow-up for https://issues.apache.org/jira/browse/SPARK-40646. It

[jira] [Updated] (SPARK-41248) Add config flag to control before of JSON partial results parsing in SPARK-40646

2022-11-23 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-41248: - Description: This is a follow-up for https://issues.apache.org/jira/browse/SPARK-40646. It

[jira] [Created] (SPARK-41248) Add config flag to control before of JSON partial results parsing in SPARK-40646

2022-11-23 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-41248: Summary: Add config flag to control before of JSON partial results parsing in SPARK-40646 Key: SPARK-41248 URL: https://issues.apache.org/jira/browse/SPARK-41248

[jira] [Updated] (SPARK-41209) Improve PySpark type inference in _merge_type method

2022-11-20 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-41209: - Summary: Improve PySpark type inference in _merge_type method (was: Improve Pyspark type

[jira] [Created] (SPARK-41209) Improve Pyspark type inference in _merge_type method

2022-11-20 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-41209: Summary: Improve Pyspark type inference in _merge_type method Key: SPARK-41209 URL: https://issues.apache.org/jira/browse/SPARK-41209 Project: Spark Issue

[jira] [Created] (SPARK-40815) SymlinkTextInputFormat returns incorrect result due to enabled spark.hadoopRDD.ignoreEmptySplits

2022-10-16 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40815: Summary: SymlinkTextInputFormat returns incorrect result due to enabled spark.hadoopRDD.ignoreEmptySplits Key: SPARK-40815 URL: https://issues.apache.org/jira/browse/SPARK-40815

[jira] [Comment Edited] (SPARK-40541) NullPointerException with UTF8String.getBaseObject() when UDF

2022-10-14 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617914#comment-17617914 ] Ivan Sadikov edited comment on SPARK-40541 at 10/14/22 6:43 PM: I was

[jira] [Comment Edited] (SPARK-40541) NullPointerException with UTF8String.getBaseObject() when UDF

2022-10-14 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617914#comment-17617914 ] Ivan Sadikov edited comment on SPARK-40541 at 10/14/22 6:42 PM: I was

[jira] [Commented] (SPARK-40541) NullPointerException with UTF8String.getBaseObject() when UDF

2022-10-14 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617914#comment-17617914 ] Ivan Sadikov commented on SPARK-40541: -- I was asking about the actual problem. It is not clear what

[jira] [Comment Edited] (SPARK-39783) Column backticks are misplaced in the AnalysisException [UNRESOLVED_COLUMN] error message when using field with "."

2022-10-14 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617509#comment-17617509 ] Ivan Sadikov edited comment on SPARK-39783 at 10/14/22 7:14 AM: It is

[jira] [Updated] (SPARK-39783) Column backticks are misplaced in the AnalysisException [UNRESOLVED_COLUMN] error message when using field with "."

2022-10-14 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-39783: - Description: AnalysisException [UNRESOLVED_COLUMN] shows the wrong suggestion when a field

[jira] [Updated] (SPARK-39783) Column backticks are misplaced in the AnalysisException [UNRESOLVED_COLUMN] error message when using field with "."

2022-10-14 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-39783: - Description: AnalysisException [UNRESOLVED_COLUMN] The following code references a nested

[jira] [Updated] (SPARK-39783) Column backticks are misplaced in the AnalysisException [UNRESOLVED_COLUMN] error message when using field with "."

2022-10-14 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-39783: - Summary: Column backticks are misplaced in the AnalysisException [UNRESOLVED_COLUMN] error

[jira] [Updated] (SPARK-39783) Column backticks are misplaced in the erroWrong column backticks in UNRESOLVED_COLUMN error

2022-10-14 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-39783: - Summary: Column backticks are misplaced in the erroWrong column backticks in UNRESOLVED_COLUMN

[jira] [Commented] (SPARK-39783) Wrong column backticks in UNRESOLVED_COLUMN error

2022-10-14 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617509#comment-17617509 ] Ivan Sadikov commented on SPARK-39783: -- It is not clear from the ticket, you should update the

[jira] (SPARK-39257) use spark.read.jdbc() to read data from SQL databse into dataframe, it fails silently, when the session is killed from SQL server side

2022-10-13 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39257 ] Ivan Sadikov deleted comment on SPARK-39257: -- was (Author: ivan.sadikov): I have had a similar issue before, you would need to do packet capture to figure out what the underlying issue is.

[jira] [Commented] (SPARK-39257) use spark.read.jdbc() to read data from SQL databse into dataframe, it fails silently, when the session is killed from SQL server side

2022-10-13 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617463#comment-17617463 ] Ivan Sadikov commented on SPARK-39257: -- I have had a similar issue before, you would need to do

[jira] [Commented] (SPARK-39783) Wrong column backticks in UNRESOLVED_COLUMN error

2022-10-13 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617460#comment-17617460 ] Ivan Sadikov commented on SPARK-39783: -- This is by design if I am not mistaken. Such columns need

[jira] [Commented] (SPARK-40430) Spark session does not update number of files for partition

2022-10-13 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617441#comment-17617441 ] Ivan Sadikov commented on SPARK-40430: -- Can you try FSCK REPAIR TABLE command on your table if you

[jira] [Commented] (SPARK-40541) NullPointerException with UTF8String.getBaseObject() when UDF

2022-10-13 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617439#comment-17617439 ] Ivan Sadikov commented on SPARK-40541: -- What is the question here? Does marking column as nullable

[jira] [Commented] (SPARK-40637) DataFrame can correctly encode BINARY type but SparkSQL cannot

2022-10-13 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617435#comment-17617435 ] Ivan Sadikov commented on SPARK-40637: -- You are not writing to the table in the first example but

[jira] [Commented] (SPARK-40584) Incorrect Count when reading CSV file

2022-10-05 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17613304#comment-17613304 ] Ivan Sadikov commented on SPARK-40584: -- Disabling "multiLine" also fixes the issue. Seems to be an

[jira] [Updated] (SPARK-40646) Fix returning partial results in JSON data source and JSON functions

2022-10-03 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40646: - Description: I recently found an issue when parsing the following JSON file: {code:java} {"a":

[jira] [Updated] (SPARK-40646) Fix returning partial results in JSON data source and JSON functions

2022-10-03 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40646: - Description: I recently found an issue when parsing the following JSON file: {code:java} {"a":

[jira] [Created] (SPARK-40646) Fix returning partial results in JSON data source and JSON functions

2022-10-03 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40646: Summary: Fix returning partial results in JSON data source and JSON functions Key: SPARK-40646 URL: https://issues.apache.org/jira/browse/SPARK-40646 Project: Spark

[jira] [Updated] (SPARK-40527) Keep struct field names or map keys in CreateStruct

2022-09-22 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40527: - Summary: Keep struct field names or map keys in CreateStruct (was: Keep struct field names or

[jira] [Updated] (SPARK-40527) Keep struct/map field names for UnresolvedExtractValue in CreateNamedStruct

2022-09-22 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40527: - Summary: Keep struct/map field names for UnresolvedExtractValue in CreateNamedStruct (was:

[jira] [Updated] (SPARK-40527) Keep struct field names or map keys for UnresolvedExtractValue in CreateNamedStruct

2022-09-22 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40527: - Summary: Keep struct field names or map keys for UnresolvedExtractValue in CreateNamedStruct

[jira] [Updated] (SPARK-40527) Keep struct field names or map keys in CreateNamedStruct

2022-09-22 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40527: - Summary: Keep struct field names or map keys in CreateNamedStruct (was: Keep struct field

[jira] [Updated] (SPARK-40527) Generate field names for UnresolvedExtractValue in CreateNamedStruct

2022-09-22 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40527: - Description: Using index-like notation when extracting columns in a struct produces generated

[jira] [Updated] (SPARK-40527) Generate field names for UnresolvedExtractValue in CreateNamedStruct

2022-09-22 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40527: - Summary: Generate field names for UnresolvedExtractValue in CreateNamedStruct (was: Generate

[jira] [Created] (SPARK-40527) Generate names for UnresolvedExtractValue in CreateNamedStruct

2022-09-22 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40527: Summary: Generate names for UnresolvedExtractValue in CreateNamedStruct Key: SPARK-40527 URL: https://issues.apache.org/jira/browse/SPARK-40527 Project: Spark

[jira] [Created] (SPARK-40496) Configs to control "enableDateTimeParsingFallback" are incorrectly swapped

2022-09-20 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40496: Summary: Configs to control "enableDateTimeParsingFallback" are incorrectly swapped Key: SPARK-40496 URL: https://issues.apache.org/jira/browse/SPARK-40496 Project:

[jira] [Updated] (SPARK-40470) arrays_zip output unexpected alias column names when using GetMapValue and GetArrayStructFields

2022-09-16 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40470: - Summary: arrays_zip output unexpected alias column names when using GetMapValue and

[jira] [Updated] (SPARK-40470) arrays_zip output unexpected alias column names when using Map

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40470: - Description: This is a follow-up for https://issues.apache.org/jira/browse/SPARK-40292. I

[jira] [Updated] (SPARK-40470) arrays_zip output unexpected alias column names when using Map

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40470: - Description: This is a follow-up for https://issues.apache.org/jira/browse/SPARK-40292. I

[jira] [Updated] (SPARK-40470) arrays_zip output unexpected alias column names when using Map

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40470: - Description: This is a follow-up for https://issues.apache.org/jira/browse/SPARK-40292. I

[jira] [Created] (SPARK-40470) arrays_zip output unexpected alias column names when using Map

2022-09-15 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40470: Summary: arrays_zip output unexpected alias column names when using Map Key: SPARK-40470 URL: https://issues.apache.org/jira/browse/SPARK-40470 Project: Spark

[jira] [Updated] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40468: - Description: I have found that depending on the name of the corrupt record in CSV, the field

[jira] [Updated] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40468: - Description: I have found that depending on the name of the corrupt record in CSV, the field

[jira] [Updated] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40468: - Description: I have found that depending on the name of the corrupt record in CSV, the field

[jira] [Updated] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40468: - Description: I have found that depending on the name of the corrupt record in CSV, the field

[jira] [Created] (SPARK-40468) Column pruning is not handled correctly in CSV when _corrupt_record is used

2022-09-15 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40468: Summary: Column pruning is not handled correctly in CSV when _corrupt_record is used Key: SPARK-40468 URL: https://issues.apache.org/jira/browse/SPARK-40468 Project:

[jira] [Commented] (SPARK-40292) arrays_zip output unexpected alias column names

2022-09-04 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600198#comment-17600198 ] Ivan Sadikov commented on SPARK-40292: -- I will take a look. > arrays_zip output unexpected alias

[jira] [Created] (SPARK-40215) Add SQL configs to control CSV/JSON date and timestamp parsing behaviour

2022-08-24 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40215: Summary: Add SQL configs to control CSV/JSON date and timestamp parsing behaviour Key: SPARK-40215 URL: https://issues.apache.org/jira/browse/SPARK-40215 Project:

[jira] [Commented] (SPARK-40215) Add SQL configs to control CSV/JSON date and timestamp parsing behaviour

2022-08-24 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17584612#comment-17584612 ] Ivan Sadikov commented on SPARK-40215: -- Follow-up. > Add SQL configs to control CSV/JSON date and

[jira] [Commented] (SPARK-40169) Fix the issue with Parquet column index and predicate pushdown in Data source V1

2022-08-21 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17582664#comment-17582664 ] Ivan Sadikov commented on SPARK-40169: -- I would like to work on it as it was my responsibility to

[jira] [Updated] (SPARK-40169) Fix the issue with Parquet column index and predicate pushdown in Data source V1

2022-08-21 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-40169: - Description: This is a follow-up for SPARK-39833. In

[jira] [Created] (SPARK-40169) Fix the issue with Parquet column index and predicate pushdown in Data source V1

2022-08-21 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40169: Summary: Fix the issue with Parquet column index and predicate pushdown in Data source V1 Key: SPARK-40169 URL: https://issues.apache.org/jira/browse/SPARK-40169

[jira] [Created] (SPARK-40052) Handle direct byte buffers in VectorizedDeltaBinaryPackedReader

2022-08-11 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-40052: Summary: Handle direct byte buffers in VectorizedDeltaBinaryPackedReader Key: SPARK-40052 URL: https://issues.apache.org/jira/browse/SPARK-40052 Project: Spark

[jira] [Comment Edited] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575032#comment-17575032 ] Ivan Sadikov edited comment on SPARK-39833 at 8/5/22 5:07 AM: -- Your example

[jira] [Commented] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575570#comment-17575570 ] Ivan Sadikov commented on SPARK-39833: -- I opened a PR to quickly fix it:

[jira] [Comment Edited] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575032#comment-17575032 ] Ivan Sadikov edited comment on SPARK-39833 at 8/5/22 1:48 AM: -- This is

[jira] [Commented] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575510#comment-17575510 ] Ivan Sadikov commented on SPARK-39833: -- It appears to be a bug in Parquet-MR. There is a

[jira] [Comment Edited] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-03 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575032#comment-17575032 ] Ivan Sadikov edited comment on SPARK-39833 at 8/4/22 5:51 AM: -- This is

[jira] [Comment Edited] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-03 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575032#comment-17575032 ] Ivan Sadikov edited comment on SPARK-39833 at 8/4/22 5:47 AM: -- This is

[jira] [Commented] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-03 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575032#comment-17575032 ] Ivan Sadikov commented on SPARK-39833: -- This is related to case insensitive analysis in Spark. Your
