[jira] [Created] (SPARK-46777) Refactor `StreamingDataSourceRelation` into `StreamingDataSourceRelation` and `StreamingDataSourceScanRelation` for parity with batch scan
Jackie Zhang created SPARK-46777:

Summary: Refactor `StreamingDataSourceRelation` into `StreamingDataSourceRelation` and `StreamingDataSourceScanRelation` for parity with batch scan
Key: SPARK-46777
URL: https://issues.apache.org/jira/browse/SPARK-46777
Project: Spark
Issue Type: Improvement
Components: SQL, Structured Streaming
Affects Versions: 4.0.0
Reporter: Jackie Zhang

To prepare for the incoming structured streaming operator pushdown, we'd like to refactor some catalyst object relationships first.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-39013) Parser changes to enforce `()` for creating table without any columns
[ https://issues.apache.org/jira/browse/SPARK-39013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackie Zhang resolved SPARK-39013.

Resolution: Won't Fix

> Parser changes to enforce `()` for creating table without any columns
>
> Key: SPARK-39013
> URL: https://issues.apache.org/jira/browse/SPARK-39013
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Jackie Zhang
> Priority: Major
>
> We would like to enforce the `()` for `CREATE TABLE` queries to explicitly indicate that a table without any columns will be created.
> E.g. `CREATE TABLE table () USING DELTA`.
> The existing behavior of CTAS and of CREATE external table at a location is not affected.
[jira] [Updated] (SPARK-39013) Parser changes to enforce `()` for creating table without any columns
[ https://issues.apache.org/jira/browse/SPARK-39013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackie Zhang updated SPARK-39013:

Summary: Parser changes to enforce `()` for creating table without any columns (was: Parse changes to enforce `()` for creating table without any columns)

> Parser changes to enforce `()` for creating table without any columns
>
> Key: SPARK-39013
> URL: https://issues.apache.org/jira/browse/SPARK-39013
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.4
> Reporter: Jackie Zhang
> Priority: Major
>
> We would like to enforce the `()` for `CREATE TABLE` queries to explicitly indicate that a table without any columns will be created.
> E.g. `CREATE TABLE table () USING DELTA`.
> The existing behavior of CTAS and of CREATE external table at a location is not affected.
[jira] [Created] (SPARK-39013) Parse changes to enforce `()` for creating table without any columns
Jackie Zhang created SPARK-39013:

Summary: Parse changes to enforce `()` for creating table without any columns
Key: SPARK-39013
URL: https://issues.apache.org/jira/browse/SPARK-39013
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4
Reporter: Jackie Zhang

We would like to enforce the `()` for `CREATE TABLE` queries to explicitly indicate that a table without any columns will be created.
E.g. `CREATE TABLE table () USING DELTA`.
The existing behavior of CTAS and of CREATE external table at a location is not affected.
[jira] [Updated] (SPARK-38939) Support ALTER TABLE ... DROP [IF EXISTS] COLUMN .. syntax
[ https://issues.apache.org/jira/browse/SPARK-38939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackie Zhang updated SPARK-38939:

Summary: Support ALTER TABLE ... DROP [IF EXISTS] COLUMN .. syntax (was: Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax)

> Support ALTER TABLE ... DROP [IF EXISTS] COLUMN .. syntax
>
> Key: SPARK-38939
> URL: https://issues.apache.org/jira/browse/SPARK-38939
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0, 3.2.1, 3.3.0
> Reporter: Jackie Zhang
> Priority: Major
>
> Currently, the `ALTER TABLE ... DROP COLUMN(s) ...` syntax always throws an error if the column doesn't exist. We would like to provide an (IF EXISTS) syntax to give a better user experience for downstream handlers (such as Delta) that support it, and to make it consistent with other DDL statements such as `DROP TABLE (IF EXISTS)`.
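The requested `IF EXISTS` behavior can be illustrated with a minimal Python sketch. This is not Spark's implementation; `drop_columns`, `ColumnNotFoundError`, and the list-of-names schema are hypothetical names chosen only to show the difference between the current behavior (always fail on a missing column) and the proposed one (skip missing columns when `IF EXISTS` is given).

```python
from typing import List


class ColumnNotFoundError(Exception):
    """Hypothetical error type standing in for the error Spark raises today."""


def drop_columns(schema: List[str], to_drop: List[str], if_exists: bool = False) -> List[str]:
    """Return a new schema with `to_drop` removed.

    Without `if_exists`, a missing column is an error (the current behavior
    described in the ticket). With `if_exists`, missing columns are ignored.
    """
    missing = [c for c in to_drop if c not in schema]
    if missing and not if_exists:
        raise ColumnNotFoundError(f"Missing columns: {missing}")
    drop_set = set(to_drop)
    # Preserve the original column order for the columns that remain.
    return [c for c in schema if c not in drop_set]


if __name__ == "__main__":
    schema = ["id", "name", "age"]
    print(drop_columns(schema, ["age"]))
    print(drop_columns(schema, ["age", "nickname"], if_exists=True))
```

Whether a statement that mixes existing and missing columns should drop the existing ones or do nothing is a design choice the ticket does not spell out; the sketch assumes the existing ones are dropped.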
[jira] [Updated] (SPARK-38939) Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax
[ https://issues.apache.org/jira/browse/SPARK-38939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackie Zhang updated SPARK-38939:

Description: Currently, the `ALTER TABLE ... DROP COLUMN(s) ...` syntax always throws an error if the column doesn't exist. We would like to provide an (IF EXISTS) syntax to give a better user experience for downstream handlers (such as Delta) that support it, and to make it consistent with other DDL statements such as `DROP TABLE (IF EXISTS)`.

(was: Currently `ALTER TABLE ... DROP COLUMN(s) ...` syntax will always throw error if the column doesn't exist. We would like to provide an (IF EXISTS) syntax to provide better user experience, and make consistent with some other DMLs such as `DROP TABLE (IF EXISTS)` etc.)

> Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax
>
> Key: SPARK-38939
> URL: https://issues.apache.org/jira/browse/SPARK-38939
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0, 3.2.1, 3.3.0
> Reporter: Jackie Zhang
> Priority: Major
>
> Currently, the `ALTER TABLE ... DROP COLUMN(s) ...` syntax always throws an error if the column doesn't exist. We would like to provide an (IF EXISTS) syntax to give a better user experience for downstream handlers (such as Delta) that support it, and to make it consistent with other DDL statements such as `DROP TABLE (IF EXISTS)`.
[jira] [Updated] (SPARK-38939) Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax
[ https://issues.apache.org/jira/browse/SPARK-38939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackie Zhang updated SPARK-38939:

Description: Currently, the `ALTER TABLE ... DROP COLUMN(s) ...` syntax always throws an error if the column doesn't exist. We would like to provide an (IF EXISTS) syntax to give a better user experience, and to make it consistent with other DDL statements such as `DROP TABLE (IF EXISTS)`.

(was: Current )

> Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax
>
> Key: SPARK-38939
> URL: https://issues.apache.org/jira/browse/SPARK-38939
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0, 3.2.1, 3.3.0
> Reporter: Jackie Zhang
> Priority: Major
>
> Currently, the `ALTER TABLE ... DROP COLUMN(s) ...` syntax always throws an error if the column doesn't exist. We would like to provide an (IF EXISTS) syntax to give a better user experience, and to make it consistent with other DDL statements such as `DROP TABLE (IF EXISTS)`.
[jira] [Created] (SPARK-38939) Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax
Jackie Zhang created SPARK-38939:

Summary: Support ALTER TABLE ... DROP COLUMN [IF EXISTS] .. syntax
Key: SPARK-38939
URL: https://issues.apache.org/jira/browse/SPARK-38939
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.2.1, 3.2.0, 3.3.0
Reporter: Jackie Zhang

Current
[jira] [Updated] (SPARK-38094) Parquet: enable matching schema columns by field id
[ https://issues.apache.org/jira/browse/SPARK-38094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackie Zhang updated SPARK-38094:

Description:
Field ID is a native field in the Parquet schema ([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398]).

After this PR, when the requested schema has field IDs, Parquet readers will first use the field ID to determine which Parquet columns to read, before falling back to using column names as before. It enables matching columns by field ID for supported DWs like Iceberg and Delta.

This PR supports:
* vectorized reader
* Parquet-mr reader

was:
Field ID is a native field in the Parquet schema ([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398]).

After this PR, when the requested schema has field IDs, Parquet readers will first use the field ID to determine which Parquet columns to read, before falling back to using column names as before. It enables matching columns by field ID for supported DWs like Iceberg and Delta.

This PR supports:
* vectorized reader

does not support:
* Parquet-mr reader due to lack of field ID support (needs a follow-up ticket)

> Parquet: enable matching schema columns by field id
>
> Key: SPARK-38094
> URL: https://issues.apache.org/jira/browse/SPARK-38094
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Affects Versions: 3.3.0
> Reporter: Jackie Zhang
> Priority: Major
>
> Field ID is a native field in the Parquet schema ([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398]).
>
> After this PR, when the requested schema has field IDs, Parquet readers will first use the field ID to determine which Parquet columns to read, before falling back to using column names as before. It enables matching columns by field ID for supported DWs like Iceberg and Delta.
>
> This PR supports:
> * vectorized reader
> * Parquet-mr reader
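The matching rule described above (field ID first, column name as the fallback) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Spark reader code: `Field` and `match_columns` are hypothetical names, and the sketch assumes that a requested field carrying an ID is matched by ID only, while a requested field without an ID is matched by name.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema column with an optional field ID."""
    name: str
    field_id: Optional[int] = None


def match_columns(requested: List[Field], file_fields: List[Field]) -> Dict[str, Optional[str]]:
    """Map each requested column name to the file column it reads from.

    Assumption: a requested field with an ID is resolved by ID; a requested
    field without an ID falls back to name matching. Unmatched columns map
    to None.
    """
    by_id = {f.field_id: f for f in file_fields if f.field_id is not None}
    by_name = {f.name: f for f in file_fields}
    result: Dict[str, Optional[str]] = {}
    for req in requested:
        if req.field_id is not None:
            match = by_id.get(req.field_id)   # ID takes precedence over the name
        else:
            match = by_name.get(req.name)     # fall back to the column name
        result[req.name] = match.name if match is not None else None
    return result


if __name__ == "__main__":
    file_fields = [Field("id", 1), Field("note", 2)]
    requested = [Field("customer_id", 1), Field("note"), Field("extra", 9)]
    print(match_columns(requested, file_fields))
```

Matching by ID is what lets a column be renamed in the table schema (here `id` to `customer_id`) while older files are still read correctly.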
[jira] [Updated] (SPARK-38094) Parquet: enable matching schema columns by field id
[ https://issues.apache.org/jira/browse/SPARK-38094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackie Zhang updated SPARK-38094:

Description:
Field ID is a native field in the Parquet schema ([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398]).

After this PR, when the requested schema has field IDs, Parquet readers will first use the field ID to determine which Parquet columns to read, before falling back to using column names as before. It enables matching columns by field ID for supported DWs like Iceberg and Delta.

This PR supports:
* vectorized reader

does not support:
* Parquet-mr reader due to lack of field ID support (needs a follow-up ticket)

was:
Field ID is a native field in the Parquet schema ([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398]).

After this PR, when the requested schema has field IDs, Parquet readers will first use the field ID to determine which Parquet columns to read, before falling back to using column names as before. It enables matching columns by field ID for supported DWs like Iceberg and Delta.

This PR supports:
* OSS vectorized reader

does not support:
* Parquet-mr reader due to lack of field ID support (needs a follow-up ticket)

> Parquet: enable matching schema columns by field id
>
> Key: SPARK-38094
> URL: https://issues.apache.org/jira/browse/SPARK-38094
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Affects Versions: 3.3
> Reporter: Jackie Zhang
> Priority: Major
>
> Field ID is a native field in the Parquet schema ([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398]).
>
> After this PR, when the requested schema has field IDs, Parquet readers will first use the field ID to determine which Parquet columns to read, before falling back to using column names as before. It enables matching columns by field ID for supported DWs like Iceberg and Delta.
>
> This PR supports:
> * vectorized reader
>
> does not support:
> * Parquet-mr reader due to lack of field ID support (needs a follow-up ticket)
[jira] [Created] (SPARK-38094) Parquet: enable matching schema columns by field id
Jackie Zhang created SPARK-38094:

Summary: Parquet: enable matching schema columns by field id
Key: SPARK-38094
URL: https://issues.apache.org/jira/browse/SPARK-38094
Project: Spark
Issue Type: New Feature
Components: Spark Core
Affects Versions: 3.3
Reporter: Jackie Zhang

Field ID is a native field in the Parquet schema ([https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L398]).

After this PR, when the requested schema has field IDs, Parquet readers will first use the field ID to determine which Parquet columns to read, before falling back to using column names as before. It enables matching columns by field ID for supported DWs like Iceberg and Delta.

This PR supports:
* OSS vectorized reader

does not support:
* Parquet-mr reader due to lack of field ID support (needs a follow-up ticket)