Try filtering with the max date; in your case it could make more sense to represent the date as an int.
Sent from my iPhone

> On 01 Nov 2015, at 21:03, Koert Kuipers <ko...@tresata.com> wrote:
>
> hello all,
> i am trying to get familiar with spark sql partitioning support.
>
> my data is partitioned by date, so like this:
> data/date=2015-01-01
> data/date=2015-01-02
> data/date=2015-01-03
> ...
>
> lets say i would like a batch process to read data for the latest date only.
> how do i proceed?
> generally the latest date will be yesterday, but it could be a day older or
> maybe 2.
>
> i understand that i will have to do something like:
> df.filter(df("date") === some_date_string_here)
>
> however i do not know what some_date_string_here should be. i would like to
> inspect the available dates and pick the latest. is there an efficient way to
> find out what the available partitions are?
>
> thanks! koert
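A minimal sketch of the partition discovery the question asks about, assuming the data/date=YYYY-MM-DD layout described above. Since the partitions are just directory names, one simple approach is to list the base directory directly and pick the max date before building the Spark filter; the function name and base path here are illustrative, not part of any Spark API:

```python
import os

def latest_partition_date(base_dir):
    """Return the latest date string among date=YYYY-MM-DD subdirectories,
    or None if no such partitions exist."""
    dates = [
        name[len("date="):]
        for name in os.listdir(base_dir)
        if name.startswith("date=")
    ]
    # ISO-8601 date strings sort chronologically, so max() works on the strings.
    return max(dates) if dates else None
```

The resulting string can then be plugged into the filter from the email, e.g. `df.filter(df("date") === latest)`, which Spark can use for partition pruning so only that directory is read.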