David Moss created SPARK-52732:
----------------------------------
Summary: to_timestamp returns null if format is just numbers with
more than 1 fractional second
Key: SPARK-52732
URL: https://issues.apache.org/jira/browse/SPARK-52732
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.5.2
Environment: Azure Databricks 16.4 LTS (includes Apache Spark 3.5.2,
Scala 2.13)
Reporter: David Moss
When converting a string of digits to a timestamp with the to_timestamp
function, null is unexpectedly returned when the string contains no
non-numeric characters and the format has more than one fractional-second digit.
This can be seen in the following script:
{code:java}
select to_timestamp("20220101123456123456", "yyyyMMddHHmmssSSSSSS") as broken
      ,to_timestamp("202201011234561", "yyyyMMddHHmmssS") as works1
      ,to_timestamp("2022-01-01T12:34:56.123456", "yyyy-MM-dd'T'HH:mm:ss.SSSSSS") as works2
      ,to_timestamp("20220101T123456123456", "yyyyMMdd'T'HHmmssSSSSSS") as works3
      ,to_timestamp("2022-01-01T12-34-56-123456", "yyyy-MM-dd'T'HH-mm-ss-SSSSSS") as works4
      ,to_timestamp("2022-01-01-12-34-56-123456", "yyyy-MM-dd-HH-mm-ss-SSSSSS") as works5
{code}
which outputs:
||broken||works1||works2||works3||works4||works5||
|null|2022-01-01T12:34:56.1Z|2022-01-01T12:34:56.123456Z|2022-01-01T12:34:56.123456Z|2022-01-01T12:34:56.123456Z|2022-01-01T12:34:56.123456Z|
The output of the 'broken' column should be the same as 'works2' through
'works5'.
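Since 'works3' parses correctly once a literal 'T' separates the date and time digits, one possible interim workaround is to reshape the input into that form before calling to_timestamp. The following is only a sketch in plain Python to illustrate the reshaping; the helper name insert_date_time_separator is hypothetical, and the strptime call merely checks that the reshaped string still encodes the expected timestamp. It has not been run against Spark.

```python
from datetime import datetime


def insert_date_time_separator(digits: str) -> str:
    """Insert a literal 'T' between the 8 date digits and the 12 time digits.

    Hypothetical helper: converts input shaped like yyyyMMddHHmmssSSSSSS
    into the yyyyMMdd'T'HHmmssSSSSSS shape used by the 'works3' column.
    """
    if len(digits) != 20 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly 20 ASCII digits (yyyyMMddHHmmssSSSSSS)")
    return digits[:8] + "T" + digits[8:]


reshaped = insert_date_time_separator("20220101123456123456")

# Sanity check with Python's own parser (not Spark): %f accepts up to 6 digits.
parsed = datetime.strptime(reshaped, "%Y%m%dT%H%M%S%f")

print(reshaped)
print(parsed.isoformat())
```

Under the assumption that 'works3' behaves as reported above, the reshaped string would then be passed to to_timestamp with the format "yyyyMMdd'T'HHmmssSSSSSS".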