And to answer your question (sorry, read too fast): the string is not in proper ISO 8601. The extended form must be used throughout, i.e. 2020-04-11T20:40:00-05:00; the UTC offset is missing a colon (:).
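For what it's worth, once the colon is in place Spark's default ISO 8601 handling should pick the offset up without any custom pattern. A minimal spark-shell sketch (a sketch only, assuming a session timezone of UTC):

```scala
spark.conf.set("spark.sql.session.timeZone", "UTC")

// With the extended-form offset (-05:00) a plain cast parses the string
// and converts the instant into the session timezone:
Seq("2020-04-11T20:40:00-05:00").toDF("value")
  .select('value.cast("timestamp"))
  .show(false)
// 20:40 at UTC-5 is 01:40 the next day in UTC
```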
br, Magnus

On Tue, Mar 31, 2020 at 7:11 PM Magnus Nilsson <ma...@kth.se> wrote:

> Timestamps aren't timezoned. If you parse ISO 8601 strings they will be
> converted to UTC automatically.
>
> If you parse timestamps without a timezone they will be converted to the
> timezone the server Spark is running on uses. You can change the timezone
> Spark uses with spark.conf.set("spark.sql.session.timeZone", "UTC").
> Timestamps represent a point in time; the clock representation of that
> instant depends on Spark's timezone settings, both for parsing
> (non-ISO 8601) strings and for showing timestamps.
>
> br,
>
> Magnus
>
> On Tue, Mar 31, 2020 at 6:14 PM Chetan Khatri <chetan.opensou...@gmail.com> wrote:
>
>> Hi Spark Users,
>>
>> I am losing the timezone value in the format below. I tried a couple of
>> formats but was not able to make it work. Can someone shed some light?
>>
>> scala> val sampleDF = Seq("2020-04-11T20:40:00-0500").toDF("value")
>> sampleDF: org.apache.spark.sql.DataFrame = [value: string]
>>
>> scala> sampleDF.select('value, to_timestamp('value, "yyyy-MM-dd'T'HH:mm:ss")).show(false)
>> +------------------------+------------------------------------------------+
>> |value                   |to_timestamp(`value`, 'yyyy-MM-dd\'T\'HH:mm:ss')|
>> +------------------------+------------------------------------------------+
>> |2020-04-11T20:40:00-0500|2020-04-11 20:40:00                             |
>> +------------------------+------------------------------------------------+
>>
>> Thanks
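If the input strings can't be changed, another option is a format pattern that actually consumes the offset instead of silently dropping it. A sketch, assuming Spark 2.4-style SimpleDateFormat pattern letters, where Z matches an RFC 822 offset such as -0500:

```scala
import org.apache.spark.sql.functions.to_timestamp

val sampleDF = Seq("2020-04-11T20:40:00-0500").toDF("value")

// 'Z' in the pattern consumes the -0500 offset, so the instant is kept
// and rendered in whatever spark.sql.session.timeZone is set to:
sampleDF
  .select('value, to_timestamp('value, "yyyy-MM-dd'T'HH:mm:ssZ"))
  .show(false)
```

With the session timezone set to UTC, the parsed column should then show the shifted clock time rather than the bare 20:40:00 from the original pattern.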