Github user youngbink commented on the issue: https://github.com/apache/spark/pull/20015

@HyukjinKwon I just took a look at PR #14788. My point in mentioning those databases was to give examples of a function that Spark doesn't support but other databases commonly do. (They all have a `date_trunc` that takes a `timestamp` and outputs a `timestamp`.) As you said, we could extend `trunc` and simply create an alias `date_trunc`, but it's actually not that simple. For example, PR #14788 would not handle the following commands correctly in PySpark:

```
df = spark.createDataFrame([('1997-02-28 05:02:11',)], ['d'])
df.select(functions.trunc(df.d, 'year').alias('year')).collect()
df.select(functions.trunc(df.d, 'SS').alias('SS')).collect()
```

This is because `trunc(string, string)` isn't handled correctly. We could find a way around this and get it working, but after a discussion with @cloud-fan, @gatorsmile, @rednaxelafx and Reynold, we decided to add `date_trunc` to be compatible with Postgres for now.
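To illustrate the timestamp-in/timestamp-out semantics being discussed, here is a minimal pure-Python sketch of what a `date_trunc`-style function does for a few field names. This is only an illustration of the semantics, not the Spark implementation, and the `date_trunc` helper below is a hypothetical stand-in:

```python
from datetime import datetime

def date_trunc(fmt, ts):
    # Hypothetical sketch: truncate a timestamp to the given field,
    # returning a timestamp (mirroring the Postgres-style contract).
    if fmt == 'year':
        return ts.replace(month=1, day=1, hour=0, minute=0,
                          second=0, microsecond=0)
    if fmt == 'month':
        return ts.replace(day=1, hour=0, minute=0,
                          second=0, microsecond=0)
    if fmt == 'second':
        return ts.replace(microsecond=0)
    raise ValueError("unsupported field: " + fmt)

print(date_trunc('year', datetime(1997, 2, 28, 5, 2, 11)))
# 1997-01-01 00:00:00
```

The key point of the contract shown here is that the result stays a timestamp, whereas Spark's pre-existing `trunc` returns a `date`, which is why a separate function was preferred over an alias.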