And to answer your question (sorry, read too fast): the string is not proper
ISO8601. The extended form must be used throughout, i.e.
2020-04-11T20:40:00-05:00; a colon (:) is missing from the UTC offset.
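
For example, with the colon added and a format pattern that actually
consumes the offset, something like this should keep the instant. A rough
sketch, not tested: the XXX pattern letter for a colon-separated offset is
an assumption based on the usual SimpleDateFormat/DateTimeFormatter
conventions, and the displayed value assumes spark.sql.session.timeZone is
UTC.

scala> import org.apache.spark.sql.functions.to_timestamp
scala> val fixedDF = Seq("2020-04-11T20:40:00-05:00").toDF("value")
scala> fixedDF.select(to_timestamp('value, "yyyy-MM-dd'T'HH:mm:ssXXX").as("ts")).show(false)
+-------------------+
|ts                 |
+-------------------+
|2020-04-12 01:40:00|
+-------------------+

Your original pattern yyyy-MM-dd'T'HH:mm:ss has no zone field at all, so
the trailing -0500 was simply ignored and the local clock reading was taken
as-is.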

br,

Magnus

On Tue, Mar 31, 2020 at 7:11 PM Magnus Nilsson <ma...@kth.se> wrote:

> Timestamps aren't timezoned. If you parse ISO8601 strings they will be
> converted to UTC automatically.
>
> If you parse timestamps without a timezone, they will be converted to the
> timezone the server running Spark uses. You can change the timezone Spark
> uses with spark.conf.set("spark.sql.session.timeZone", "UTC"). Timestamps
> represent a point in time; the clock representation of that instant
> depends on Spark's timezone settings, both for parsing (non-ISO8601)
> strings and for showing timestamps.
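>
> For example (a rough sketch, assuming a spark-shell session where
> spark.implicits._ is already in scope; the displayed value follows from
> the session timezone set on the first line):
>
> scala> spark.conf.set("spark.sql.session.timeZone", "UTC")
> scala> Seq("2020-04-11T20:40:00-05:00").toDF("value").select($"value".cast("timestamp").as("ts")).show(false)
> +-------------------+
> |ts                 |
> +-------------------+
> |2020-04-12 01:40:00|
> +-------------------+
>
> 20:40 at offset -05:00 is the instant 01:40 UTC the next day, which is the
> clock reading shown.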
>
> br,
>
> Magnus
>
> On Tue, Mar 31, 2020 at 6:14 PM Chetan Khatri <chetan.opensou...@gmail.com>
> wrote:
>
>> Hi Spark Users,
>>
>> I am losing the timezone value with the format below. I tried a couple of
>> formats but was not able to make it work. Can someone shed some light?
>>
>> scala> val sampleDF = Seq("2020-04-11T20:40:00-0500").toDF("value")
>> sampleDF: org.apache.spark.sql.DataFrame = [value: string]
>>
>> scala> sampleDF.select('value, to_timestamp('value, "yyyy-MM-dd\'T\'HH:mm:ss")).show(false)
>>
>> +------------------------+------------------------------------------------+
>> |value                   |to_timestamp(`value`, 'yyyy-MM-dd\'T\'HH:mm:ss')|
>> +------------------------+------------------------------------------------+
>> |2020-04-11T20:40:00-0500|2020-04-11 20:40:00                             |
>> +------------------------+------------------------------------------------+
>>
>> Thanks
>>
>
