www.linkedin.com/in/bobwakefieldmba<http://www.linkedin.com/in/bobwakefieldmba>
Twitter: @BobLovesData<http://twitter.com/BobLovesData>
From: Steve Loughran [mailto:ste...@hortonworks.com]
Sent: Tuesday, October 3, 2017 2:19 PM
To: Adaryl Wakefield <adaryl.wakefi...@hotmail.com>
Cc: user@spark.apache.org
Subject:
From: Nicholas Hakobian [mailto:nicholas.hakob...@rallyhealth.com]
Sent: Tuesday, October 3, 2017 1:04 PM
To: Adaryl Wakefield <adaryl.wakefi...@hotmail.com>
Cc: user@spark.apache.org
On 3 Oct 2017, at 18:43, Adaryl Wakefield wrote:
I gave myself a project to start actually writing Spark programs. I'm using Scala and Spark 2.2.0. In my project, I had to do some grouping and filtering by dates. It was awful and took forever. I was trying to use dataframes and SQL as much as possible. I see that there are date functions in …
I'd suggest first converting the string containing your date/time to a TimestampType or a DateType. The built-in functions for year, month, day, etc. will then work as expected. If your date is in a "standard" format, you can perform the conversion just by casting the column to a date or timestamp.
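A minimal sketch of that approach (the column name `event_time` and the sample values are made up for illustration; this needs a Spark 2.x runtime):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Hypothetical local session for the example.
val spark = SparkSession.builder.appName("dates").master("local[*]").getOrCreate()
import spark.implicits._

// A string column in a "standard" (ISO-style) format casts directly.
val df = Seq("2017-10-03 14:19:00", "2017-09-15 08:00:00").toDF("event_time")

val withTs = df.withColumn("ts", col("event_time").cast("timestamp"))

// year/month/etc. now behave as expected, e.g. for grouping:
withTs.groupBy(year($"ts").as("yr"), month($"ts").as("mo")).count().show()
```

Once the column is a real TimestampType, grouping and filtering by date parts is a one-liner rather than string manipulation.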
I usually check the list of Hive UDFs, as Spark has implemented almost all of them:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions
And/or check `org.apache.spark.sql.functions` directly:
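For instance, `org.apache.spark.sql.functions` already covers most of the date work in the question; a short sketch (the `d_str` column is hypothetical, and this needs a Spark runtime):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{to_date, year, month, datediff, current_date}

val spark = SparkSession.builder.appName("fns").master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq("2017-10-03", "2017-01-15").toDF("d_str")

df.select(
  to_date($"d_str").as("d"),                               // string -> DateType
  year(to_date($"d_str")).as("yr"),                        // extract date parts
  month(to_date($"d_str")).as("mo"),
  datediff(current_date(), to_date($"d_str")).as("days_ago") // date arithmetic
).show()
```

The same functions are exposed in SQL, so `SELECT year(to_date(d_str)) FROM t` works identically through `spark.sql`.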