[ 
https://issues.apache.org/jira/browse/SPARK-39012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-39012:
-----------------------------
    Description: 
When Spark infers a schema, it has to parse strings into typed values. Not all 
data types are supported on this path yet. For example, binary is known to be 
unsupported. 

A string can potentially be converted to any type except ARRAY, MAP, STRUCT, 
etc. In addition, when converting from a string, a narrower type is not 
identified if a wider type also matches, for example short versus long.
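
As a rough illustration of the short/long point, here is a toy sketch in plain 
Python. It is not Spark's inference code: the function names, the candidate 
list, and the preference order are all assumptions made for this example. It 
only shows that a string such as "12" matches several numeric types at once, 
so an inference step that prefers the wider type never reports the narrower one.

```python
# Toy illustration only; this is NOT how Spark implements schema inference.
# All names below (CANDIDATES, matching_types, infer_type) are hypothetical.

SHORT_MIN, SHORT_MAX = -2**15, 2**15 - 1
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1


def _is_int_in(lo, hi):
    """Build a check that a string parses as an integer within [lo, hi]."""
    def check(s):
        try:
            return lo <= int(s) <= hi
        except ValueError:
            return False
    return check


def _is_boolean(s):
    return s.strip().lower() in ("true", "false")


def _is_double(s):
    try:
        float(s)
        return True
    except ValueError:
        return False


# Every type the toy parser knows how to recognise from a string.
CANDIDATES = [
    ("boolean", _is_boolean),
    ("short", _is_int_in(SHORT_MIN, SHORT_MAX)),
    ("long", _is_int_in(LONG_MIN, LONG_MAX)),
    ("double", _is_double),
]


def matching_types(s):
    """Return every candidate type the string could be parsed as."""
    return [name for name, ok in CANDIDATES if ok(s)]


def infer_type(s, order=("boolean", "long", "double")):
    """Pick the first type in `order` that matches; fall back to string.

    "short" is deliberately absent from the assumed order: since every short
    is also a valid long, the string alone cannot tell the two apart.
    """
    matches = matching_types(s)
    for name in order:
        if name in matches:
            return name
    return "string"


print(matching_types("12"))  # short, long and double all match
print(infer_type("12"))      # the wider integer type is chosen
```

In this sketch, supporting an additional type such as BOOLEAN amounts to adding 
one more entry to the candidate list and to the preference order.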

Based on the Spark SQL data types 
(https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support 
the following types:

BINARY
BOOLEAN

There are also two types that I am not sure Spark SQL supports:
YearMonthIntervalType
DayTimeIntervalType


  was:
When Spark needs to infer schema, it needs to parse string to a type. Not all 
data types are supported so far in this path. For example, binary is spotted to 
not supported.

string might be converted to all types except ARRAY, MAP, STRUCT, etc.

Spark SQL data types: 
https://spark.apache.org/docs/latest/sql-ref-datatypes.html


> SparkSQL infer schema does not support all data types
> -----------------------------------------------------
>
>                 Key: SPARK-39012
>                 URL: https://issues.apache.org/jira/browse/SPARK-39012
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Rui Wang
>            Priority: Major
>
> When Spark infers a schema, it has to parse strings into typed values. Not 
> all data types are supported on this path yet. For example, binary is known 
> to be unsupported. 
> A string can potentially be converted to any type except ARRAY, MAP, STRUCT, 
> etc. In addition, when converting from a string, a narrower type is not 
> identified if a wider type also matches, for example short versus long.
> Based on the Spark SQL data types 
> (https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can 
> support the following types:
> BINARY
> BOOLEAN
> There are also two types that I am not sure Spark SQL supports:
> YearMonthIntervalType
> DayTimeIntervalType



--
This message was sent by Atlassian Jira
(v8.20.7#820007)