For the following input (I changed the format):

+------------+
|    Date    |
+------------+
| 2019-02-08 |
+------------+
| 2019-02-07 |
+------------+
| 2019-12-01 |
+------------+
| 2015-02-02 |
+------------+
| 2012-02-03 |
+------------+
| 2018-05-06 |
+------------+
| 2022-02-08 |
+------------+
the output was 2012-01-03 (the expected minimum is 2012-02-03).
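
One possible explanation for that value, offered as an assumption since the exact pattern used after the format change is not shown: if the pattern still had lowercase mm (for example yyyy-mm-dd), the month field is never set and falls back to January, which would turn 2012-02-03 into 2012-01-03. The sketch below uses the JDK's SimpleDateFormat as a stand-in for Spark's to_date; the class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;

class PatternLetterDemo {
    // Parses text with the given pattern and returns the parsed month as 1..12.
    // Hypothetical helper for illustration; not part of Spark.
    static int parsedMonth(String text, String pattern) {
        try {
            Calendar cal = Calendar.getInstance(Locale.ROOT);
            cal.setTime(new SimpleDateFormat(pattern, Locale.ROOT).parse(text));
            return cal.get(Calendar.MONTH) + 1; // Calendar months are 0-based
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Lowercase mm reads "02" as minutes, so the month stays at January.
        System.out.println(parsedMonth("2012-02-03", "yyyy-mm-dd")); // 1
        // Uppercase MM reads "02" as the month.
        System.out.println(parsedMonth("2012-02-03", "yyyy-MM-dd")); // 2
    }
}
```

With every month collapsed to January, the smallest date in the table above would be 2012-01-03, which matches the reported output.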

Note that for my code below to work, I cast the resulting min column to a
string.
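
For reference, the difference between taking the minimum of the raw strings and of the parsed dates, followed by the conversion back to a string, can be sketched in plain Java. This is only an illustration using java.time rather than Spark; the class and method names are hypothetical, and the second sample value is made up.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class MinDateDemo {
    // Uppercase MM is month-of-year; lowercase mm would mean minute-of-hour.
    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("MM-dd-yyyy");

    // Minimum by plain string comparison, which looks at the month digits first.
    static String minAsString(List<String> values) {
        return Collections.min(values);
    }

    // Minimum after parsing each value as a date, returned in ISO form (yyyy-MM-dd).
    static String minAsDate(List<String> values) {
        List<LocalDate> dates = values.stream()
                .map(v -> LocalDate.parse(v, FORMAT))
                .collect(Collectors.toList());
        return Collections.min(dates).toString();
    }

    public static void main(String[] args) {
        // "06-14-2022" comes from the thread; "01-01-2023" is an invented sample.
        List<String> values = Arrays.asList("06-14-2022", "01-01-2023");
        System.out.println(minAsString(values)); // 01-01-2023
        System.out.println(minAsDate(values));   // 2022-06-14
    }
}
```

The string comparison picks the later date because "01" sorts before "06", which is why the values need to be parsed before the minimum is taken.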

On Tue, Jun 14, 2022 at 21:12, Sean Owen <sro...@gmail.com> wrote:

> You haven't shown your input or the result
>
> On Tue, Jun 14, 2022 at 1:40 PM marc nicole <mk1853...@gmail.com> wrote:
>
>> Hi Sean,
>>
>> Even with MM for months, it gives an incorrect (though different this time)
>> min value.
>>
>> On Tue, Jun 14, 2022 at 20:18, Sean Owen <sro...@gmail.com> wrote:
>>
>>> Yes that is right. It has to be parsed as a date to correctly reason
>>> about ordering. Otherwise you are finding the minimum string
>>> alphabetically.
>>>
>>> Small note, MM is month. mm is minute. You have to fix that for this to
>>> work. These are Java format strings.
>>>
>>> On Tue, Jun 14, 2022, 12:32 PM marc nicole <mk1853...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I want to detect whether a column holds dates. The column contains
>>>> strings formatted like "06-14-2022" (the format being mm-dd-yyyy), and I
>>>> want to get the minimum of those dates.
>>>>
>>>> I tried in Java as follows:
>>>>
>>>>> if (dataset.filter(org.apache.spark.sql.functions.to_date(
>>>>>         dataset.col(colName), "mm-dd-yyyy").isNotNull()).select(colName).count() != 0) { ....
>>>>
>>>>
>>>> And to get the *min* of the column:
>>>>
>>>>> Object colMin =
>>>>>     dataset.agg(org.apache.spark.sql.functions.min(
>>>>>         org.apache.spark.sql.functions.to_date(dataset.col(colName), "mm-dd-yyyy"))).first().get(0);
>>>>
>>>> // then I cast *colMin* to a string.
>>>>
>>>> Note that if I don't apply *to_date*() to the target column, the
>>>> result is wrong (I think Spark treats the values as strings and
>>>> computes the minimum alphabetically).
>>>>
>>>> Is there a better approach to accomplish this?
>>>> Thanks.
>>>>
>>>
