Try filtering with the max date; in your case it could make more sense to represent the date as an int.
Sent from my iPhone

> On 01 Nov 2015, at 21:03, Koert Kuipers <ko...@tresata.com> wrote:
>
> hello all,
> i am trying to get familiar with spark sql partitioning support.
>
> my data is partitioned by date, so like this:
> data/date=2015-01-01
> data/date=2015-01-02
> data/date=2015-01-03
> ...
>
> lets say i would like a batch process to read data for the latest date only.
> how do i proceed?
> generally the latest date will be yesterday, but it could be a day older or
> maybe 2.
>
> i understand that i will have to do something like:
> df.filter(df("date") === some_date_string_here)
>
> however i do not know what some_date_string_here should be. i would like to
> inspect the available dates and pick the latest. is there an efficient way to
> find out what the available partitions are?
>
> thanks! koert
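A minimal sketch of the partition discovery the question asks about, assuming the data/date=YYYY-MM-DD layout described above. Since the partitions are just directory names, one simple approach is to list the base directory directly and pick the max date before building the Spark filter; the function name and base path here are illustrative, not part of any Spark API:

```python
import os

def latest_partition_date(base_dir):
    """Return the latest date string among date=YYYY-MM-DD subdirectories,
    or None if no such partitions exist."""
    dates = [
        name[len("date="):]
        for name in os.listdir(base_dir)
        if name.startswith("date=")
    ]
    # ISO-8601 date strings sort chronologically, so max() works on the strings.
    return max(dates) if dates else None
```

The resulting string can then be plugged into the filter from the email, e.g. `df.filter(df("date") === latest)`, which Spark can use for partition pruning so only that directory is read.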