[jira] [Created] (SPARK-46777) Refactor `StreamingDataSourceRelation` into `StreamingDataSourceRelation` and `StreamingDataSourceScanRelation` for parity with batch scan

2024-01-19 Thread Jackie Zhang (Jira)
Jackie Zhang created SPARK-46777:


 Summary: Refactor `StreamingDataSourceRelation` into 
`StreamingDataSourceRelation` and `StreamingDataSourceScanRelation` for parity 
with batch scan
 Key: SPARK-46777
 URL: https://issues.apache.org/jira/browse/SPARK-46777
 Project: Spark
  Issue Type: Improvement
  Components: SQL, Structured Streaming
Affects Versions: 4.0.0
Reporter: Jackie Zhang


To prepare for the upcoming Structured Streaming operator pushdown, we'd like 
to refactor some Catalyst object relationships first.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-39013) Parser changes to enforce `()` for creating table without any columns

2022-04-27 Thread Jackie Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jackie Zhang resolved SPARK-39013.
--
Resolution: Won't Fix

> Parser changes to enforce `()` for creating table without any columns
> -
>
> Key: SPARK-39013
> URL: https://issues.apache.org/jira/browse/SPARK-39013
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Jackie Zhang
>Priority: Major
>
> We would like to enforce the `()` for `CREATE TABLE` queries to explicitly 
> indicate that a table without any columns will be created.
> E.g. `CREATE TABLE table () USING DELTA`.
> The existing behavior of CTAS and of CREATE external table at a location is 
> not affected.






[jira] [Updated] (SPARK-39013) Parser changes to enforce `()` for creating table without any columns

2022-04-25 Thread Jackie Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jackie Zhang updated SPARK-39013:
-
Summary: Parser changes to enforce `()` for creating table without any 
columns  (was: Parse changes to enforce `()` for creating table without any 
columns)

> Parser changes to enforce `()` for creating table without any columns
> -
>
> Key: SPARK-39013
> URL: https://issues.apache.org/jira/browse/SPARK-39013
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4
>Reporter: Jackie Zhang
>Priority: Major
>
> We would like to enforce the `()` for `CREATE TABLE` queries to explicitly 
> indicate that a table without any columns will be created.
> E.g. `CREATE TABLE table () USING DELTA`.
> The existing behavior of CTAS and of CREATE external table at a location is 
> not affected.






[jira] [Created] (SPARK-39013) Parse changes to enforce `()` for creating table without any columns

2022-04-25 Thread Jackie Zhang (Jira)
Jackie Zhang created SPARK-39013:


 Summary: Parse changes to enforce `()` for creating table without 
any columns
 Key: SPARK-39013
 URL: https://issues.apache.org/jira/browse/SPARK-39013
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.4
Reporter: Jackie Zhang


We would like to enforce the `()` for `CREATE TABLE` queries to explicitly 
indicate that a table without any columns will be created.

E.g. `CREATE TABLE table () USING DELTA`.

The existing behavior of CTAS and of CREATE external table at a location is 
not affected.
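
The proposed rule can be illustrated with a minimal sketch. It is not the 
Spark parser: the `CreateTable` class, its fields and `is_accepted` are 
hypothetical names introduced only for this example, and the sketch assumes 
that CTAS and CREATE TABLE at a location are accepted with or without a 
column list, as stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateTable:
    """Hypothetical, simplified view of a parsed CREATE TABLE statement."""
    has_parens: bool            # a column list was written, even an empty `()`
    is_ctas: bool = False       # CREATE TABLE ... AS SELECT
    has_location: bool = False  # CREATE [EXTERNAL] TABLE ... LOCATION ...


def is_accepted(stmt: CreateTable) -> bool:
    """Return True if the statement passes the proposed `()` rule."""
    # Assumption: CTAS and tables created at a location keep today's behavior.
    if stmt.is_ctas or stmt.has_location:
        return True
    # Otherwise the column list must be spelled out, even when it is empty.
    return stmt.has_parens


# `CREATE TABLE table () USING DELTA` -> explicit empty column list
print(is_accepted(CreateTable(has_parens=True)))
# `CREATE TABLE table USING DELTA` -> would be rejected under the proposal
print(is_accepted(CreateTable(has_parens=False)))
```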






[jira] [Updated] (SPARK-38939) Support ALTER TABLE ... DROP [IF EXISTS] COLUMN .. syntax

2022-04-18 Thread Jackie Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jackie Zhang updated SPARK-38939:
-
Summary: Support ALTER TABLE ... DROP [IF EXISTS] COLUMN .. syntax  (was: 
Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax)

> Support ALTER TABLE ... DROP [IF EXISTS] COLUMN .. syntax
> -
>
> Key: SPARK-38939
> URL: https://issues.apache.org/jira/browse/SPARK-38939
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.0, 3.2.1, 3.3.0
>Reporter: Jackie Zhang
>Priority: Major
>
> Currently the `ALTER TABLE ... DROP COLUMN(s) ...` syntax always throws an 
> error if the column doesn't exist. We would like to provide an (IF EXISTS) 
> syntax to give a better user experience for downstream handlers (such as 
> Delta) that support it, and to make it consistent with other DDL statements 
> such as `DROP TABLE (IF EXISTS)`.
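
The intended difference in behavior can be sketched as follows. This is only 
an illustration of the semantics described above, not Spark or Delta code: 
`drop_columns`, `ColumnNotFoundError` and the list-of-names schema are 
hypothetical and exist only for this example.

```python
from typing import List


class ColumnNotFoundError(Exception):
    """Hypothetical error type used only in this sketch."""


def drop_columns(schema: List[str], to_drop: List[str],
                 if_exists: bool = False) -> List[str]:
    """Return the schema without `to_drop`.

    Without `if_exists`, a missing column is an error (the current behavior).
    With `if_exists`, missing columns are silently skipped.
    """
    missing = [c for c in to_drop if c not in schema]
    if missing and not if_exists:
        raise ColumnNotFoundError(f"Missing columns: {missing}")
    return [c for c in schema if c not in to_drop]


# ALTER TABLE t DROP COLUMN IF EXISTS age  -> no error, schema unchanged
print(drop_columns(["id", "name"], ["age"], if_exists=True))
```

Calling `drop_columns(["id", "name"], ["age"])` without `if_exists` raises 
the error instead.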






[jira] [Updated] (SPARK-38939) Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax

2022-04-18 Thread Jackie Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jackie Zhang updated SPARK-38939:
-
Description: Currently the `ALTER TABLE ... DROP COLUMN(s) ...` syntax always 
throws an error if the column doesn't exist. We would like to provide an (IF 
EXISTS) syntax to give a better user experience for downstream handlers (such 
as Delta) that support it, and to make it consistent with other DDL statements 
such as `DROP TABLE (IF EXISTS)`.  (was: Currently the `ALTER TABLE ... DROP 
COLUMN(s) ...` syntax always throws an error if the column doesn't exist. We 
would like to provide an (IF EXISTS) syntax to give a better user experience, 
and to make it consistent with other DDL statements such as `DROP TABLE (IF 
EXISTS)`, etc.)

> Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax
> -
>
> Key: SPARK-38939
> URL: https://issues.apache.org/jira/browse/SPARK-38939
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.0, 3.2.1, 3.3.0
>Reporter: Jackie Zhang
>Priority: Major
>
> Currently the `ALTER TABLE ... DROP COLUMN(s) ...` syntax always throws an 
> error if the column doesn't exist. We would like to provide an (IF EXISTS) 
> syntax to give a better user experience for downstream handlers (such as 
> Delta) that support it, and to make it consistent with other DDL statements 
> such as `DROP TABLE (IF EXISTS)`.






[jira] [Updated] (SPARK-38939) Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax

2022-04-18 Thread Jackie Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jackie Zhang updated SPARK-38939:
-
Description: Currently the `ALTER TABLE ... DROP COLUMN(s) ...` syntax always 
throws an error if the column doesn't exist. We would like to provide an (IF 
EXISTS) syntax to give a better user experience, and to make it consistent 
with other DDL statements such as `DROP TABLE (IF EXISTS)`, etc.  (was: Current )

> Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax
> -
>
> Key: SPARK-38939
> URL: https://issues.apache.org/jira/browse/SPARK-38939
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.0, 3.2.1, 3.3.0
>Reporter: Jackie Zhang
>Priority: Major
>
> Currently the `ALTER TABLE ... DROP COLUMN(s) ...` syntax always throws an 
> error if the column doesn't exist. We would like to provide an (IF EXISTS) 
> syntax to give a better user experience, and to make it consistent with 
> other DDL statements such as `DROP TABLE (IF EXISTS)`, etc.






[jira] [Created] (SPARK-38939) Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax

2022-04-18 Thread Jackie Zhang (Jira)
Jackie Zhang created SPARK-38939:


 Summary: Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax
 Key: SPARK-38939
 URL: https://issues.apache.org/jira/browse/SPARK-38939
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.2.1, 3.2.0, 3.3.0
Reporter: Jackie Zhang


Current 






[jira] [Updated] (SPARK-38094) Parquet: enable matching schema columns by field id

2022-02-08 Thread Jackie Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jackie Zhang updated SPARK-38094:
-
Description: 
Field Id is a native field in the Parquet schema 
([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398])

After this PR, when the requested schema has field IDs, Parquet readers will 
first use the field ID to determine which Parquet columns to read, before 
falling back to using column names as before. It enables matching columns by 
field id for supported DWs like Iceberg and Delta.

This PR supports:
 * vectorized reader
 * Parquet-mr reader

  was:
Field Id is a native field in the Parquet schema 
([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398])

After this PR, when the requested schema has field IDs, Parquet readers will 
first use the field ID to determine which Parquet columns to read, before 
falling back to using column names as before. It enables matching columns by 
field id for supported DWs like Iceberg and Delta.

This PR supports:
 * vectorized reader

does not support:
 * Parquet-mr reader due to lack of field id support (needs a follow-up ticket)


> Parquet: enable matching schema columns by field id
> ---
>
> Key: SPARK-38094
> URL: https://issues.apache.org/jira/browse/SPARK-38094
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 3.3.0
>Reporter: Jackie Zhang
>Priority: Major
>
> Field Id is a native field in the Parquet schema 
> ([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398])
> After this PR, when the requested schema has field IDs, Parquet readers will 
> first use the field ID to determine which Parquet columns to read, before 
> falling back to using column names as before. It enables matching columns by 
> field id for supported DWs like Iceberg and Delta.
> This PR supports:
>  * vectorized reader
>  * Parquet-mr reader






[jira] [Updated] (SPARK-38094) Parquet: enable matching schema columns by field id

2022-02-03 Thread Jackie Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jackie Zhang updated SPARK-38094:
-
Description: 
Field Id is a native field in the Parquet schema 
([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398])

After this PR, when the requested schema has field IDs, Parquet readers will 
first use the field ID to determine which Parquet columns to read, before 
falling back to using column names as before. It enables matching columns by 
field id for supported DWs like Iceberg and Delta.

This PR supports:
 * vectorized reader

does not support:
 * Parquet-mr reader due to lack of field id support (needs a follow-up ticket)

  was:
Field Id is a native field in the Parquet schema 
([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398])

After this PR, when the requested schema has field IDs, Parquet readers will 
first use the field ID to determine which Parquet columns to read, before 
falling back to using column names as before. It enables matching columns by 
field id for supported DWs like Iceberg and Delta.

This PR supports:
 * OSS vectorized reader

does not support:
 * Parquet-mr reader due to lack of field id support (needs a follow-up ticket)


> Parquet: enable matching schema columns by field id
> ---
>
> Key: SPARK-38094
> URL: https://issues.apache.org/jira/browse/SPARK-38094
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 3.3
>Reporter: Jackie Zhang
>Priority: Major
>
> Field Id is a native field in the Parquet schema 
> ([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398])
> After this PR, when the requested schema has field IDs, Parquet readers will 
> first use the field ID to determine which Parquet columns to read, before 
> falling back to using column names as before. It enables matching columns by 
> field id for supported DWs like Iceberg and Delta.
> This PR supports:
>  * vectorized reader
> does not support:
>  * Parquet-mr reader due to lack of field id support (needs a follow-up 
> ticket)






[jira] [Created] (SPARK-38094) Parquet: enable matching schema columns by field id

2022-02-02 Thread Jackie Zhang (Jira)
Jackie Zhang created SPARK-38094:


 Summary: Parquet: enable matching schema columns by field id
 Key: SPARK-38094
 URL: https://issues.apache.org/jira/browse/SPARK-38094
 Project: Spark
  Issue Type: New Feature
  Components: Spark Core
Affects Versions: 3.3
Reporter: Jackie Zhang


Field Id is a native field in the Parquet schema 
([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398])

After this PR, when the requested schema has field IDs, Parquet readers will 
first use the field ID to determine which Parquet columns to read, before 
falling back to using column names as before. It enables matching columns by 
field id for supported DWs like Iceberg and Delta.

This PR supports:
 * OSS vectorized reader

does not support:
 * Parquet-mr reader due to lack of field id support (needs a follow-up ticket)
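
The matching order described above can be sketched roughly as follows. This 
is not the reader code from the PR: `Field` and `match_columns` are 
hypothetical names, and the sketch assumes that a requested field carrying an 
ID is matched by ID only, while a requested field without an ID falls back to 
its name. The actual readers may handle missing IDs differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Field:
    """Simplified schema field; `field_id` is None when no ID is set."""
    name: str
    field_id: Optional[int] = None


def match_columns(requested: List[Field],
                  file_fields: List[Field]) -> Dict[str, Optional[str]]:
    """Map each requested field name to the file column to read, or None."""
    by_id = {f.field_id: f for f in file_fields if f.field_id is not None}
    by_name = {f.name: f for f in file_fields}
    result: Dict[str, Optional[str]] = {}
    for req in requested:
        if req.field_id is not None:
            # Assumption: a field with an ID is resolved by ID only.
            hit = by_id.get(req.field_id)
        else:
            # No ID on the requested field: fall back to the column name.
            hit = by_name.get(req.name)
        result[req.name] = hit.name if hit is not None else None
    return result


# The file column was renamed, but its field ID (1) is unchanged.
file_fields = [Field("a_renamed", 1), Field("b")]
requested = [Field("a", 1), Field("b"), Field("c", 7)]
print(match_columns(requested, file_fields))
# {'a': 'a_renamed', 'b': 'b', 'c': None}
```

Matching by ID lets a reader follow a column across renames, which is the 
case the name-based lookup cannot handle.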


