hello all,
i am trying to get familiar with spark sql partitioning support.

my data is partitioned by date, so like this:
data/date=2015-01-01
data/date=2015-01-02
data/date=2015-01-03
...

let's say i would like a batch process to read data for the latest date
only. how do i proceed?
generally the latest date will be yesterday, but it could also be a day or
maybe two older than that.

i understand that i will have to do something like:
df.filter(df("date") === some_date_string_here)

however i do not know what some_date_string_here should be. i would like to
inspect the available dates and pick the latest. is there an efficient way
to find out what the available partitions are?
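
in case it helps to see what i had in mind, here is a rough sketch that just
lists the date=... directories under the base path and takes the latest one.
it assumes the data lives at data/ on a hadoop-compatible filesystem, that sc
is the SparkContext (e.g. in spark-shell), and that df was loaded from that
same base path:

import org.apache.hadoop.fs.{FileSystem, Path}

// list the date=... partition directories and keep only the date values
val fs = FileSystem.get(sc.hadoopConfiguration)
val latestDate = fs.listStatus(new Path("data"))
  .map(_.getPath.getName)            // e.g. "date=2015-01-03"
  .filter(_.startsWith("date="))
  .map(_.stripPrefix("date="))
  .max                               // yyyy-MM-dd strings sort chronologically

df.filter(df("date") === latestDate)

but this feels like it goes around spark sql's own partition discovery, so
maybe there is a more idiomatic way?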

thanks! koert
