hello all,
i am trying to get familiar with spark sql partitioning support.

my data is partitioned by date, so like this:
data/date=2015-01-01
data/date=2015-01-02
data/date=2015-01-03
...

let's say i would like a batch process to read data for the latest date
only. how do i proceed?
generally the latest date will be yesterday, but it could also be a day or
maybe two older than that.

i understand that i will have to do something like:
df.filter(df("date") === some_date_string_here)

however i do not know what some_date_string_here should be. i would like to
inspect the available dates and pick the latest. is there an efficient way
to find out what the available partitions are?
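
in case it helps to see what i had in mind, here is a rough sketch that just
lists the date=... directories under the base path and takes the latest one.
it assumes the data lives at data/ on a hadoop-compatible filesystem, that sc
is the SparkContext (e.g. in spark-shell), and that df was loaded from that
same base path:

import org.apache.hadoop.fs.{FileSystem, Path}

// list the date=... partition directories and keep only the date values
val fs = FileSystem.get(sc.hadoopConfiguration)
val latestDate = fs.listStatus(new Path("data"))
  .map(_.getPath.getName)            // e.g. "date=2015-01-03"
  .filter(_.startsWith("date="))
  .map(_.stripPrefix("date="))
  .max                               // yyyy-MM-dd strings sort chronologically

df.filter(df("date") === latestDate)

but this feels like it goes around spark sql's own partition discovery, so
maybe there is a more idiomatic way?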

thanks! koert
