Hi Jacques,

Yes, the timestamps are set when the MR job runs, not when the HFiles
are loaded. So you'll see the values from the job that wrote its output
most recently, regardless of the order in which you load the files.

You can also specify timestamps explicitly for each KeyValue, if you prefer.
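
To illustrate the point, here is a rough sketch in plain Java. It is a toy model, not HBase code: the class BulkLoadTimestampSketch, its load/get methods, the "row1/cf:q" coordinate and the timestamp values are all made up for the example. It only models the rule that a default read returns the version with the highest timestamp, so the order of loading does not matter. In a real job you would pass the timestamp when building each KeyValue (I believe there is a constructor that takes a long timestamp, but check the API docs for your version).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of versioned cells. Hypothetical names; NOT the HBase API. */
public class BulkLoadTimestampSketch {
    // cell coordinate -> versions, sorted by ascending timestamp
    private final Map<String, TreeMap<Long, String>> cells = new HashMap<>();

    /** Simulates loading one cell that was written with the given timestamp. */
    public void load(String coordinate, long timestamp, String value) {
        cells.computeIfAbsent(coordinate, k -> new TreeMap<Long, String>())
             .put(timestamp, value);
    }

    /** Models a default read: the version with the highest timestamp wins. */
    public String get(String coordinate) {
        TreeMap<Long, String> versions = cells.get(coordinate);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    public static void main(String[] args) {
        long jobATimestamp = 1000L; // assumed: job A ran first (Monday)
        long jobBTimestamp = 2000L; // assumed: job B ran later (Tuesday)

        BulkLoadTimestampSketch table = new BulkLoadTimestampSketch();
        // Load B's output first, then A's, as in the question.
        table.load("row1/cf:q", jobBTimestamp, "from-B");
        table.load("row1/cf:q", jobATimestamp, "from-A");

        // B carries the newer timestamp, so B's value is returned.
        System.out.println(table.get("row1/cf:q")); // prints "from-B"
    }
}
```

If you assign the timestamps yourself, the same rule applies: whichever cell carries the larger timestamp is the one a default read returns.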

-Todd

On Fri, Aug 5, 2011 at 2:10 PM, Jacques <[email protected]> wrote:
> Can someone confirm that bulk loading HFiles keeps cell timestamps from
> overwriting each other?
>
> For example:
> I run mapreduce A job on Monday.
> I run mapreduce B job on Tuesday.
>
> I then run LoadIncrementalHFiles on job B first, followed by A.
>
> Please confirm that where the outputs of A & B intersect, the values will
> be those from B.
>
> Thanks,
> Jacques
>



-- 
Todd Lipcon
Software Engineer, Cloudera
