Let's say I have my input data from the past 12 months organized into subdirs 
by date:

/data/2012-06-10
/data/2012-06-11
...
/data/2013-06-09

And now say that I want to run a Pig script to process data from a range of 
dates within the last 12 months, say 2012-11-07 through 2013-05-26. The regex 
that I could specify for this date range is going to get quite complicated. 

Is there a way that I can get my Pig script to load data from such a range 
without a regex? 

I could load all the data in /data/* and then FILTER by the date field in each 
record, but that is wasteful when the range of dates is small compared to the 
entire dataset, because every record still has to be read.
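
One possible approach, sketched here under assumptions rather than as a confirmed answer: generate the list of dates outside Pig and hand it to the script as a parameter. This assumes the LOAD path is expanded with Hadoop's glob syntax, which accepts {a,b,c} alternation. The helper name date_range_glob and the INPUT parameter name are hypothetical, and the /data root comes from the layout above.

```python
from datetime import date, timedelta


def date_range_glob(start, end, root="/data"):
    """Build a brace glob naming one subdirectory per day, start to end inclusive.

    Hypothetical helper: assumes one directory per day named YYYY-MM-DD
    under `root`, as in the layout described above.
    """
    if end < start:
        raise ValueError("end date precedes start date")
    # One ISO-formatted date string per day in the range.
    days = [(start + timedelta(days=i)).isoformat()
            for i in range((end - start).days + 1)]
    return "%s/{%s}" % (root, ",".join(days))


if __name__ == "__main__":
    # Prints the glob for the range mentioned in the question.
    print(date_range_glob(date(2012, 11, 7), date(2013, 5, 26)))
```

The printed string could then be passed on the command line, for example with pig -param INPUT='...' and a script that does LOAD '$INPUT', so that only the directories in the range are read. Any date missing from /data would need to be dropped from the list first if the loader rejects paths that do not exist.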
                                          
