Hi, you can truncate datetimes like this (in pyspark), e.g. to 5 minutes:

    import pyspark.sql.functions as F

    # Cast to epoch seconds, floor to a multiple of 300 s, cast back to a timestamp.
    df.select((F.floor(F.col('myDateColumn').cast('long') / 300) * 300).cast('timestamp'))

Best,
Eike

David Hodefi <davidhodeffi.w...@gmail.com> wrote on Mon, 13 Nov 2017 at 12:27:

> I am familiar with those functions; none of them actually truncates a
> date. We can use those methods to help implement a truncate method. I think
> truncating to a day/hour should be as simple as "truncate(...,"DD") or
> truncate(...,"HH")".
>
> On Thu, Nov 9, 2017 at 8:23 PM, Gaspar Muñoz <gmu...@datiobd.com> wrote:
>
>> There are functions for day (called dayofmonth and dayofyear) and hour
>> (called hour). You can view them here:
>> https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.functions
>>
>> Example:
>>
>> import org.apache.spark.sql.functions._
>> val result = df.select(hour($"myDateColumn"), dayofmonth($"myDateColumn"),
>>   dayofyear($"myDateColumn"))
>>
>> 2017-11-09 12:05 GMT+01:00 David Hodefi <davidhodeffi.w...@gmail.com>:
>>
>>> I would like to truncate a date to its day or hour. Currently it is only
>>> possible to truncate to MONTH or YEAR.
>>> 1. How can I achieve that?
>>> 2. Is there any pull request about this issue?
>>> 3. If there is no open pull request about this issue, what are the
>>> implications that I should be aware of when coding/contributing it as a
>>> pull request?
>>>
>>> Last question: looking at the DateTimeUtils class code, it seems like the
>>> implementation is not using any open library for handling dates, e.g.
>>> Apache Commons. Why implement it instead of reusing open source?
>>>
>>> Thanks,
>>> David
>>
>> --
>> Gaspar Muñoz Soria
>>
>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>> 28224 Pozuelo de Alarcón, Madrid
>> Tel: +34 91 828 6473
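
P.S. The floor arithmetic in the reply at the top of this message can be checked without a Spark cluster. Below is a minimal plain-Python sketch of the same idea, assuming timezone-aware datetimes; the helper names `truncate_epoch` and `truncate_datetime` are made up for illustration and are not part of Spark or PySpark.

```python
import math
from datetime import datetime, timezone


def truncate_epoch(epoch_seconds: int, interval_seconds: int) -> int:
    """Round epoch seconds down to the nearest multiple of interval_seconds."""
    return (epoch_seconds // interval_seconds) * interval_seconds


def truncate_datetime(ts: datetime, interval_seconds: int) -> datetime:
    """Truncate a timezone-aware datetime to the start of its interval.

    Hypothetical helper: mirrors cast('long') -> floor -> cast('timestamp').
    """
    epoch = math.floor(ts.timestamp())  # whole seconds since 1970-01-01 UTC
    return datetime.fromtimestamp(
        truncate_epoch(epoch, interval_seconds), tz=timezone.utc
    )


ts = datetime(2017, 11, 13, 12, 27, 42, tzinfo=timezone.utc)

five_min = truncate_datetime(ts, 300)     # 5-minute bucket
hour = truncate_datetime(ts, 3600)        # start of the hour
day = truncate_datetime(ts, 86400)        # start of the day, in UTC

print(five_min)
print(hour)
print(day)
```

Changing 300 to 3600 or 86400 gives hour or day truncation, which is what David asked for. One caveat of this approach: the day boundary falls on UTC midnight, because epoch seconds count from midnight UTC, so it may not match the local or session time zone. As far as I know, Spark 2.3 and later also provide a built-in `date_trunc` function that handles day and hour directly.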