Hi, I have some basic questions about how Hive handles dates and date arithmetic. I apologize if this has already been addressed. Per most samples on this site and elsewhere, I can define an access log table with a partition scheme that looks like this: ds='09-08-09'. This is obviously a good way to partition the data. However, how can this information be used later in queries? For example, if I want to select data for all dates between 08/15/09 and 09/15/09, how would I do that? The partition column ds cannot be used with >= and similar operators, right? Additionally, when the table is partitioned this way, how can I do counts by month, etc.? Obviously all of these queries need to be expressed in a way that lets Hive still take advantage of the partitioning scheme. I hope that makes sense.
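
To make this concrete, here is a rough sketch of the kind of queries I would like to write. The table name access_log, its columns, and the yyyy-MM-dd format for ds are placeholders I made up for illustration (I am assuming a format where string order matches date order), and I have not confirmed that Hive restricts either query to the matching partitions.

```sql
-- Hypothetical table: name, columns, and ds format are illustrative only.
CREATE TABLE access_log (ip STRING, url STRING)
PARTITIONED BY (ds STRING);

-- Range query: all rows between 2009-08-15 and 2009-09-15, inclusive.
-- This relies on plain string comparison of the partition column.
SELECT *
FROM access_log
WHERE ds >= '2009-08-15' AND ds <= '2009-09-15';

-- Monthly counts: group on the yyyy-MM prefix of ds.
SELECT substr(ds, 1, 7) AS month_key, count(1) AS hits
FROM access_log
GROUP BY substr(ds, 1, 7);
```

If ds were instead stored as MM-dd-yy, the string comparison in the range query would presumably not follow calendar order, which is part of what I am unsure about.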
Thanks, Vijay
