Hello all, I am trying to get familiar with Spark SQL's partitioning support.
My data is partitioned by date, like this:

data/date=2015-01-01
data/date=2015-01-02
data/date=2015-01-03
...

Let's say I would like a batch process to read the data for the latest date only. How do I proceed? Generally the latest date will be yesterday, but it could be a day older, or maybe two. I understand that I will have to do something like:

df.filter(df("date") === some_date_string_here)

However, I do not know what some_date_string_here should be. I would like to inspect the available dates and pick the latest. Is there an efficient way to find out what the available partitions are?

thanks!
koert
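
p.s. in case it clarifies the question, here is a rough sketch of the brute-force workaround I have in mind: list the partition directories on the filesystem, pick the latest, and filter on it. It assumes the data sits on a Hadoop-compatible filesystem under data/ and that it happens to be parquet (the parquet part is only for illustration). I am hoping Spark SQL offers something more direct than this.

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SQLContext

// assumes an existing SparkContext named sc
val sqlContext = new SQLContext(sc)

// list the date=... partition directories directly on the filesystem
val fs = FileSystem.get(sc.hadoopConfiguration)
val dates = fs.listStatus(new Path("data"))
  .filter(_.isDirectory)
  .map(_.getPath.getName)          // e.g. "date=2015-01-03"
  .filter(_.startsWith("date="))
  .map(_.stripPrefix("date="))     // e.g. "2015-01-03"

// ISO dates sort lexicographically, so max is the latest available partition
val latestDate = dates.max

// read the partitioned dataset and filter on the partition column;
// partition pruning should mean only that one directory gets scanned
val df = sqlContext.read.parquet("data")
val latest = df.filter(df("date") === latestDate)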