[jira] [Created] (SPARK-44313) Generated column expression validation fails if there is a char/varchar column anywhere in the schema
Allison Portis created SPARK-44313:
--

Summary: Generated column expression validation fails if there is a char/varchar column anywhere in the schema
Key: SPARK-44313
URL: https://issues.apache.org/jira/browse/SPARK-44313
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.4.1, 3.4.0
Reporter: Allison Portis

When validating generated column expressions, this call https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GeneratedColumn.scala#L123 to checkAnalysis fails when there are char or varchar columns anywhere in the schema. For example, this query will fail:
{code:java}
CREATE TABLE default.example (
  name VARCHAR(64),
  tstamp TIMESTAMP,
  tstamp_date DATE GENERATED ALWAYS AS (CAST(tstamp as DATE))
){code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
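[Editorial sketch, not part of the report: since the failure is triggered by the presence of a char/varchar column anywhere in the schema rather than by the generated column's expression, a hypothetical, untested workaround would be to declare the unrelated column as STRING instead of VARCHAR.]
{code:sql}
-- Hypothetical workaround sketch (untested against the affected versions):
-- the generated column is unchanged; only the unrelated VARCHAR column
-- is replaced with STRING so the char/varchar check is not triggered.
CREATE TABLE default.example (
  name STRING,  -- was VARCHAR(64)
  tstamp TIMESTAMP,
  tstamp_date DATE GENERATED ALWAYS AS (CAST(tstamp AS DATE))
){code}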
[jira] [Created] (SPARK-41806) Use AppendData.byName for SQL INSERT INTO by name for DSV2 and block ambiguous queries with static partition columns
Allison Portis created SPARK-41806:
--

Summary: Use AppendData.byName for SQL INSERT INTO by name for DSV2 and block ambiguous queries with static partition columns
Key: SPARK-41806
URL: https://issues.apache.org/jira/browse/SPARK-41806
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4.0
Reporter: Allison Portis

Currently for INSERT INTO by name we reorder the value list and convert it to INSERT INTO by ordinal. Since DSv2 logical nodes have the isByName flag, we don't need to do this. The current approach is limiting in that:
# Users must provide the full list of table columns (this limits the functionality for features like generated columns, see SPARK-41290).
# It allows ambiguous queries such as INSERT OVERWRITE t PARTITION (c='1') (c) VALUES ('2'), where the user provides both the static partition column 'c' and the column 'c' in the column list. We should check that the static partition column is not in the column list.
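[Editorial sketch, not part of the report: the table and column names below are illustrative. The contrast the issue proposes is between a rejected query that names a static partition column twice and an accepted by-name insert that lists only non-static columns.]
{code:sql}
-- Assume a hypothetical table t(a STRING, c STRING) partitioned by c.

-- Ambiguous: 'c' is given both as a static partition value and in the
-- column list; under this proposal it should be rejected.
INSERT OVERWRITE t PARTITION (c='1') (c) VALUES ('2');

-- Unambiguous by-name insert: the column list names only non-static columns.
INSERT OVERWRITE t PARTITION (c='1') (a) VALUES ('x');
{code}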
[jira] [Updated] (SPARK-41735) Any SparkThrowable (with an error class) not in error-classes.json is masked in SQLExecution.withNewExecutionId and end-user will see "org.apache.spark.SparkException: [INTERNAL_ERROR]"
[ https://issues.apache.org/jira/browse/SPARK-41735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allison Portis updated SPARK-41735:
---

Description:
This change [here|https://github.com/apache/spark/pull/38302/files#diff-fdd1e9e26aa1ba9d1cc923ee7c84a1935dcc285502330a471f1ade7f3ad08bf9] means that any seen error is passed to SparkThrowableHelper.getMessage(...). Any SparkThrowable with an error class (for example, if a connector uses the Spark error format, i.e. see ErrorClassesJsonReader) will be masked as
{code:java}
org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot find main error class 'SOME_ERROR_CLASS'{code}
in SparkThrowableHelper.getMessage, since errorReader.getMessageTemplate(errorClass) will fail for the error class not defined in error-classes.json.

was:
This change [here|https://github.com/apache/spark/pull/38302/files#diff-fdd1e9e26aa1ba9d1cc923ee7c84a1935dcc285502330a471f1ade7f3ad08bf9] means that any seen error is passed to SparkThrowableHelper.getMessage(...). Any SparkThrowable with an error class (for example, if a connector uses the Spark error format, i.e. see ErrorClassesJsonReader) will be masked as
{code:java}
org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot find main error class 'SOME_ERROR_CLASS'{code}
in SparkThrowableHelper.getMessage, since errorReader.getMessageTemplate(errorClass) will fail for the error class not defined in error-classes.json.
> Any SparkThrowable (with an error class) not in error-classes.json is masked
> in SQLExecution.withNewExecutionId and end-user will see
> "org.apache.spark.SparkException: [INTERNAL_ERROR]"
> --
>
> Key: SPARK-41735
> URL: https://issues.apache.org/jira/browse/SPARK-41735
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Allison Portis
> Priority: Major
>
> This change [here|https://github.com/apache/spark/pull/38302/files#diff-fdd1e9e26aa1ba9d1cc923ee7c84a1935dcc285502330a471f1ade7f3ad08bf9] means that any seen error is passed to SparkThrowableHelper.getMessage(...). Any SparkThrowable with an error class (for example, if a connector uses the spark error format i.e. see ErrorClassesJsonReader) will be masked as
> {code:java}
> org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot find main error class 'SOME_ERROR_CLASS'{code}
> in SparkThrowableHelper.getMessage since errorReader.getMessageTemplate(errorClass) will fail for the error class not defined in error-classes.json.
[jira] [Updated] (SPARK-41735) Any SparkThrowable (with an error class) not in error-classes.json is masked in SQLExecution.withNewExecutionId and end-user will see "org.apache.spark.SparkException: [INTERNAL_ERROR]"
[ https://issues.apache.org/jira/browse/SPARK-41735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allison Portis updated SPARK-41735:
---

Description:
This change [here|https://github.com/apache/spark/pull/38302/files#diff-fdd1e9e26aa1ba9d1cc923ee7c84a1935dcc285502330a471f1ade7f3ad08bf9] means that any seen error is passed to SparkThrowableHelper.getMessage(...). Any SparkThrowable with an error class (for example, if a connector uses the Spark error format, i.e. see ErrorClassesJsonReader) will be masked as
{code:java}
org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot find main error class 'SOME_ERROR_CLASS'{code}
in SparkThrowableHelper.getMessage, since errorReader.getMessageTemplate(errorClass) will fail for the error class not defined in error-classes.json.

was:
This change [here|https://github.com/apache/spark/pull/38302/files#diff-fdd1e9e26aa1ba9d1cc923ee7c84a1935dcc285502330a471f1ade7f3ad08bf9] means that any seen error is passed to `SparkThrowableHelper.getMessage(...)`. Any SparkThrowable with an error class (for example, if a connector uses the spark error format i.e. see `ErrorClassesJsonReader`) will be masked as
{code:java}
org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot find main error class 'SOME_ERROR_CLASS'{code}
in `SparkThrowableHelper.getMessage` since `errorReader.getMessageTemplate(errorClass)` will fail for the error class not defined in `error-classes.json`

> Any SparkThrowable (with an error class) not in error-classes.json is masked
> in SQLExecution.withNewExecutionId and end-user will see
> "org.apache.spark.SparkException: [INTERNAL_ERROR]"
> --
>
> Key: SPARK-41735
> URL: https://issues.apache.org/jira/browse/SPARK-41735
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Allison Portis
> Priority: Major
>
> This change [here|https://github.com/apache/spark/pull/38302/files#diff-fdd1e9e26aa1ba9d1cc923ee7c84a1935dcc285502330a471f1ade7f3ad08bf9] means that any seen error is passed to SparkThrowableHelper.getMessage(...). Any SparkThrowable with an error class (for example, if a connector uses the spark error format i.e. see ErrorClassesJsonReader) will be masked as
> {code:java}
> org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot find main error class 'SOME_ERROR_CLASS'{code}
> in SparkThrowableHelper.getMessage since errorReader.getMessageTemplate(errorClass) will fail for the error class not defined in error-classes.json.
[jira] [Updated] (SPARK-41735) Any SparkThrowable (with an error class) not in error-classes.json is masked in SQLExecution.withNewExecutionId and end-user will see "org.apache.spark.SparkException: [INTERNAL_ERROR]"
[ https://issues.apache.org/jira/browse/SPARK-41735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allison Portis updated SPARK-41735:
---

Summary: Any SparkThrowable (with an error class) not in error-classes.json is masked in SQLExecution.withNewExecutionId and end-user will see "org.apache.spark.SparkException: [INTERNAL_ERROR]" (was: Any SparkThrowable (with an error class) not in `error-classes.json` is masked in `SQLExecution.withNewExecutionId` and end-user will see `org.apache.spark.SparkException: [INTERNAL_ERROR]`)

> Any SparkThrowable (with an error class) not in error-classes.json is masked
> in SQLExecution.withNewExecutionId and end-user will see
> "org.apache.spark.SparkException: [INTERNAL_ERROR]"
> --
>
> Key: SPARK-41735
> URL: https://issues.apache.org/jira/browse/SPARK-41735
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Allison Portis
> Priority: Major
>
> This change [here|https://github.com/apache/spark/pull/38302/files#diff-fdd1e9e26aa1ba9d1cc923ee7c84a1935dcc285502330a471f1ade7f3ad08bf9] means that any seen error is passed to `SparkThrowableHelper.getMessage(...)`. Any SparkThrowable with an error class (for example, if a connector uses the spark error format i.e. see `ErrorClassesJsonReader`) will be masked as
> {code:java}
> org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot find main error class 'SOME_ERROR_CLASS'{code}
> in `SparkThrowableHelper.getMessage` since `errorReader.getMessageTemplate(errorClass)` will fail for the error class not defined in `error-classes.json`
[jira] [Created] (SPARK-41735) Any SparkThrowable (with an error class) not in `error-classes.json` is masked in `SQLExecution.withNewExecutionId` and end-user will see `org.apache.spark.SparkException: [INTERNAL_ERROR]`
Allison Portis created SPARK-41735:
--

Summary: Any SparkThrowable (with an error class) not in `error-classes.json` is masked in `SQLExecution.withNewExecutionId` and end-user will see `org.apache.spark.SparkException: [INTERNAL_ERROR]`
Key: SPARK-41735
URL: https://issues.apache.org/jira/browse/SPARK-41735
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.4.0
Reporter: Allison Portis

This change [here|https://github.com/apache/spark/pull/38302/files#diff-fdd1e9e26aa1ba9d1cc923ee7c84a1935dcc285502330a471f1ade7f3ad08bf9] means that any seen error is passed to `SparkThrowableHelper.getMessage(...)`. Any SparkThrowable with an error class (for example, if a connector uses the spark error format i.e. see `ErrorClassesJsonReader`) will be masked as
{code:java}
org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot find main error class 'SOME_ERROR_CLASS'{code}
in `SparkThrowableHelper.getMessage` since `errorReader.getMessageTemplate(errorClass)` will fail for the error class not defined in `error-classes.json`
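[Editorial sketch, not part of the report: the class name and error class below are hypothetical. This is the kind of connector-side throwable that triggers the masking: it implements the SparkThrowable interface with an error class that is not present in Spark's built-in error-classes.json, so SparkThrowableHelper.getMessage cannot look up a message template for it.]
{code:java}
// Hypothetical connector exception; "CONNECTOR_SOME_ERROR" is deliberately
// an error class that Spark's error-classes.json does not define, so the
// getMessage lookup on the Spark side fails with INTERNAL_ERROR.
class ConnectorException(message: String)
  extends Exception(message) with org.apache.spark.SparkThrowable {
  override def getErrorClass: String = "CONNECTOR_SOME_ERROR"
}
{code}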
[jira] [Updated] (SPARK-41290) Support GENERATED ALWAYS AS syntax in create/replace table to create a generated column
[ https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allison Portis updated SPARK-41290:
---

Description:
Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table. For example,
{code:java}
CREATE TABLE default.example (
  time TIMESTAMP,
  date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
)
{code}
This syntax is SQL standard and will enable defining generated columns in Spark SQL for data sources that support it.

was:
Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table. For example,
{code:java}
CREATE TABLE default.example (
  time TIMESTAMP,
  date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
)
{code}
This syntax is SQL standard and will enable defining generated columns in Spark SQL for data sources that support them.

> Support GENERATED ALWAYS AS syntax in create/replace table to create a
> generated column
> ---
>
> Key: SPARK-41290
> URL: https://issues.apache.org/jira/browse/SPARK-41290
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Allison Portis
> Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table.
> For example,
> {code:java}
> CREATE TABLE default.example (
>   time TIMESTAMP,
>   date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
> This syntax is SQL standard and will enable defining generated columns in Spark SQL for data sources that support it.
[jira] [Updated] (SPARK-41290) Support GENERATED ALWAYS AS syntax in create/replace table to create a generated column
[ https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allison Portis updated SPARK-41290:
---

Description:
Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table. For example,
{code:java}
CREATE TABLE default.example (
  time TIMESTAMP,
  date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
)
{code}
This syntax is SQL standard and will enable defining generated columns in Spark SQL for data sources that support them.

was:
Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table. For example,
{code:java}
CREATE TABLE default.example (
  time TIMESTAMP,
  date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
)
{code}

> Support GENERATED ALWAYS AS syntax in create/replace table to create a
> generated column
> ---
>
> Key: SPARK-41290
> URL: https://issues.apache.org/jira/browse/SPARK-41290
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Allison Portis
> Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table.
> For example,
> {code:java}
> CREATE TABLE default.example (
>   time TIMESTAMP,
>   date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
> This syntax is SQL standard and will enable defining generated columns in Spark SQL for data sources that support them.
[jira] [Updated] (SPARK-41290) Support GENERATED ALWAYS AS syntax in create/replace table to create a generated column
[ https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allison Portis updated SPARK-41290:
---

Summary: Support GENERATED ALWAYS AS syntax in create/replace table to create a generated column (was: Support GENERATED ALWAYS AS in create/replace table)

> Support GENERATED ALWAYS AS syntax in create/replace table to create a
> generated column
> ---
>
> Key: SPARK-41290
> URL: https://issues.apache.org/jira/browse/SPARK-41290
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Allison Portis
> Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table.
> For example,
> {code:java}
> CREATE TABLE default.example (
>   time TIMESTAMP,
>   date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
[jira] [Updated] (SPARK-41290) Support GENERATED ALWAYS AS in create/replace table
[ https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allison Portis updated SPARK-41290:
---

Description:
Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table. For example,
{code:java}
CREATE TABLE default.example (
  time TIMESTAMP,
  date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
)
{code}

was:
Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table. For example,
{code:java}
CREATE TABLE default.example (
  time TIMESTAMP,
  date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
)
{code}

> Support GENERATED ALWAYS AS in create/replace table
> ---
>
> Key: SPARK-41290
> URL: https://issues.apache.org/jira/browse/SPARK-41290
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Allison Portis
> Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table.
> For example,
> {code:java}
> CREATE TABLE default.example (
>   time TIMESTAMP,
>   date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
[jira] [Updated] (SPARK-41290) Support GENERATED ALWAYS AS in create/replace table
[ https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allison Portis updated SPARK-41290:
---

Description:
Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table. For example,
{code:java}
CREATE TABLE default.example (
  time TIMESTAMP,
  date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
)
{code}

was:
Support GENERATED ALWAYS AS syntax for defining generated columns in create table. For example,
{code:java}
CREATE TABLE default.example (
  time TIMESTAMP,
  date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
)
{code}

> Support GENERATED ALWAYS AS in create/replace table
> ---
>
> Key: SPARK-41290
> URL: https://issues.apache.org/jira/browse/SPARK-41290
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Allison Portis
> Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create table and replace table.
> For example,
> {code:java}
> CREATE TABLE default.example (
>   time TIMESTAMP,
>   date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
[jira] [Updated] (SPARK-41290) Support GENERATED ALWAYS AS in create/replace table
[ https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allison Portis updated SPARK-41290:
---

Summary: Support GENERATED ALWAYS AS in create/replace table (was: Support GENERATED ALWAYS AS in create and replace table)

> Support GENERATED ALWAYS AS in create/replace table
> ---
>
> Key: SPARK-41290
> URL: https://issues.apache.org/jira/browse/SPARK-41290
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Allison Portis
> Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create table.
> For example,
> {code:java}
> CREATE TABLE default.example (
>   time TIMESTAMP,
>   date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
[jira] [Updated] (SPARK-41290) Support GENERATED ALWAYS AS in create and replace table
[ https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allison Portis updated SPARK-41290:
---

Summary: Support GENERATED ALWAYS AS in create and replace table (was: Support GENERATED ALWAYS AS in create table)

> Support GENERATED ALWAYS AS in create and replace table
> ---
>
> Key: SPARK-41290
> URL: https://issues.apache.org/jira/browse/SPARK-41290
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Allison Portis
> Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create table.
> For example,
> {code:java}
> CREATE TABLE default.example (
>   time TIMESTAMP,
>   date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
[jira] [Created] (SPARK-41290) Support GENERATED ALWAYS AS in create table
Allison Portis created SPARK-41290:
--

Summary: Support GENERATED ALWAYS AS in create table
Key: SPARK-41290
URL: https://issues.apache.org/jira/browse/SPARK-41290
Project: Spark
Issue Type: New Feature
Components: SQL
Affects Versions: 3.4.0
Reporter: Allison Portis

Support GENERATED ALWAYS AS syntax for defining generated columns in create table. For example,
{code:java}
CREATE TABLE default.example (
  time TIMESTAMP,
  date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
)
{code}
[jira] [Created] (SPARK-41154) Incorrect relation caching for queries with time travel spec
Allison Portis created SPARK-41154:
--

Summary: Incorrect relation caching for queries with time travel spec
Key: SPARK-41154
URL: https://issues.apache.org/jira/browse/SPARK-41154
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.3.1, 3.3.0
Reporter: Allison Portis

[https://github.com/apache/spark/pull/34497] added AS OF syntax support to support time travel queries in SQL. When resolving these, [we cache the resolved relation|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L1250] with only the qualified table name as the key, ignoring the time travel spec. Thus any subsequent queries on that table are resolved using the first query's time travel spec. This affects subqueries, CTEs, and temporary views (when created with SQL).

Queries like this will be incorrectly resolved:
{code:sql}
select * from table version as of 1
union all
select * from table version as of 0 {code}
--->
{code:sql}
select * from table version as of 1
union all
select * from table version as of 1 {code}

This was originally reported here: https://github.com/delta-io/delta/issues/1479
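[Editorial sketch, not part of the report: the table name below is hypothetical. Since the issue states the bug also affects CTEs, the same table-name-keyed caching would make a query like this one return an empty diff: both reads of t resolve to whichever version was cached first, even though they name different versions.]
{code:sql}
-- Hypothetical example: the two reads should see different snapshots of t,
-- but with the relation cached under the table name alone, both resolve
-- to the same version and the EXCEPT yields no rows.
WITH old_snapshot AS (
  SELECT * FROM t VERSION AS OF 0
)
SELECT * FROM t VERSION AS OF 1
EXCEPT
SELECT * FROM old_snapshot;
{code}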