Hi, I want to identify a column as containing dates. The column holds formatted strings like "06-14-2022" (the pattern being month-day-year), and I want to get the minimum of those dates.
I tried the following in Java:

    if (dataset.filter(org.apache.spark.sql.functions.to_date(dataset.col(colName), "MM-dd-yyyy").isNotNull()).select(colName).count() != 0) { ... }

And to get the min of the column:

    Object colMin = dataset.agg(org.apache.spark.sql.functions.min(org.apache.spark.sql.functions.to_date(dataset.col(colName), "MM-dd-yyyy"))).first().get(0);
    // then I cast colMin to a string

(Note the pattern has to be "MM-dd-yyyy": in Spark's datetime patterns, uppercase MM is month-of-year, while lowercase mm means minute-of-hour.) If I don't apply to_date() to the target column first, the result is erroneous: Spark treats the values as plain strings and computes the min lexicographically. Is there a better approach to accomplish this? Thanks.
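To make the pitfall concrete outside of Spark, here is a minimal plain-Java sketch (sample dates are made up for illustration) showing why taking the min of the raw strings goes wrong: lexicographic comparison of "MM-dd-yyyy" strings orders by month first, ignoring the year, so the "smallest" string can be the latest date. Parsing with java.time first gives the correct minimum.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Comparator;
import java.util.List;

public class DateMinDemo {
    public static void main(String[] args) {
        // Uppercase MM is month-of-year; lowercase mm would mean minute-of-hour
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("MM-dd-yyyy");
        List<String> raw = List.of("06-14-2022", "10-01-2021", "01-30-2023");

        // Lexicographic min on the raw strings: compares character by character,
        // so "01-30-2023" wins even though it is the *latest* date
        String stringMin = raw.stream().min(Comparator.naturalOrder()).get();

        // Parse first, then compare as dates: the true earliest is 2021-10-01
        LocalDate dateMin = raw.stream()
                .map(s -> LocalDate.parse(s, fmt))
                .min(Comparator.naturalOrder())
                .get();

        System.out.println(stringMin); // 01-30-2023
        System.out.println(dateMin);   // 2021-10-01
    }
}
```

The same ordering mismatch is what happens inside Spark when min() runs on an unconverted string column, which is why to_date() (with the correct "MM-dd-yyyy" pattern) must be applied before aggregating.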