[jira] [Assigned] (SPARK-46890) CSV fails on a column with default and without enforcing schema

2024-02-02 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-46890:


Assignee: Daniel

> CSV fails on a column with default and without enforcing schema
> ---
>
> Key: SPARK-46890
> URL: https://issues.apache.org/jira/browse/SPARK-46890
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Max Gekk
> Assignee: Daniel
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2024-01-29-13-22-05-326.png
>
>
> A query fails when we create a table using CSV on an existing file with a header, where:
>  - a column has a default value, and
>  - enforceSchema is false, so the CSV header is taken into account,
> and then query a column with a default.
> The example below shows the issue:
> {code:sql}
> CREATE TABLE IF NOT EXISTS products (
>   product_id INT,
>   name STRING,
>   price FLOAT default 0.0,
>   quantity INT default 0
> )
> USING CSV
> OPTIONS (
>   header 'true',
>   inferSchema 'false',
>   enforceSchema 'false',
>   path '/Users/maximgekk/tmp/products.csv'
> );
> {code}
> The CSV file products.csv:
> {code}
> product_id,name,price,quantity
> 1,Apple,0.50,100
> 2,Banana,0.25,200
> 3,Orange,0.75,50
> {code}
> The query fails:
> {code:sql}
> spark-sql (default)> SELECT price FROM products;
> 24/01/28 11:43:09 ERROR Executor: Exception in task 0.0 in stage 8.0 (TID 6)
> java.lang.IllegalArgumentException: Number of column in CSV header is not equal to number of fields in the schema:
>  Header length: 4, schema size: 1
> CSV file: file:///Users/maximgekk/tmp/products.csv
> {code}
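
The reported numbers (header length 4, schema size 1) are consistent with the header being compared against a schema reduced to the single selected column, although the report itself does not state the root cause. The sketch below is a plain-Python illustration of that assumption, not Spark's code: `check_header`, `TABLE_SCHEMA`, and `pruned_schema` are hypothetical names, and the file contents are inlined from the report.

```python
import csv
import io

# Contents of products.csv as given in the report.
CSV_TEXT = """product_id,name,price,quantity
1,Apple,0.50,100
2,Banana,0.25,200
3,Orange,0.75,50
"""

# Column names from the CREATE TABLE statement in the report.
TABLE_SCHEMA = ["product_id", "name", "price", "quantity"]


def check_header(header, schema):
    """Hypothetical stand-in for a header check: raise on a length mismatch."""
    if len(header) != len(schema):
        raise ValueError(
            "Number of column in CSV header is not equal to number of fields "
            f"in the schema: Header length: {len(header)}, schema size: {len(schema)}"
        )


# Read only the header row of the file.
header = next(csv.reader(io.StringIO(CSV_TEXT)))

# Against the full table schema the check passes.
check_header(header, TABLE_SCHEMA)

# Assumption: SELECT price reduces the schema to one column before the check.
pruned_schema = [name for name in TABLE_SCHEMA if name == "price"]
try:
    check_header(header, pruned_schema)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

If this assumption holds, the check passes against the full four-column schema and fails only once the schema is narrowed to `price`, which matches the sizes in the error above.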



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-46890) CSV fails on a column with default and without enforcing schema

2024-01-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-46890:

Assignee: (was: Apache Spark)







[jira] [Assigned] (SPARK-46890) CSV fails on a column with default and without enforcing schema

2024-01-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-46890:

Assignee: Apache Spark







[jira] [Assigned] (SPARK-46890) CSV fails on a column with default and without enforcing schema

2024-01-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-46890:


Assignee: (was: Max Gekk)

> CSV fails on a column with default and without enforcing schema
> ---
>
> Key: SPARK-46890
> URL: https://issues.apache.org/jira/browse/SPARK-46890
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Max Gekk
> Priority: Major
>
> A query fails when we create a table using CSV on an existing file with a header and:
> - a column has a default value, and
> - enforceSchema is false, so the CSV header is taken into account.
> The example below shows the issue:
> {code:sql}
> CREATE TABLE IF NOT EXISTS products (
>   product_id INT,
>   name STRING,
>   price FLOAT default 0.0,
>   quantity INT default 0
> )
> USING CSV
> OPTIONS (
>   header 'true',
>   inferSchema 'false',
>   enforceSchema 'false',
>   path '/Users/maximgekk/tmp/products.csv'
> );
> {code}
> The CSV file products.csv:
> {code}
> product_id,name,price,quantity
> 1,Apple,0.50,100
> 2,Banana,0.25,200
> 3,Orange,0.75,50
> {code}
> The query fails:
> {code:sql}
> spark-sql (default)> SELECT price FROM products;
> 24/01/28 11:43:09 ERROR Executor: Exception in task 0.0 in stage 8.0 (TID 6)
> java.lang.IllegalArgumentException: Number of column in CSV header is not equal to number of fields in the schema:
>  Header length: 4, schema size: 1
> CSV file: file:///Users/maximgekk/tmp/products.csv
> {code}


