Hi Tim,
You could write a custom Partitioner (along the lines of HashPartitioner)
so that all key/value pairs denoting the actions of the same user end up in
the same reducer; then you need only one output file per reducer.
Btw, how large are the output files? Make sure you don't end up creating
a lot of small files, i.e., files much smaller than 64MB (the usual HDFS
block size).
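A rough sketch of the idea, written in plain Java so it runs without the Hadoop jars. I am assuming here that the map output key is a string of the form "<name>\t<CCYYMM>"; that key layout and the names UserPartitionSketch, userOf, partitionFor and outputName are made up for illustration. In a real job the partitionFor logic would go into a subclass of org.apache.hadoop.mapreduce.Partitioner, in its getPartition(key, value, numPartitions) method.

```java
// Illustrative sketch only: hypothetical class and key layout, not Hadoop code.
final class UserPartitionSketch {

    // Assumption: the composite key looks like "name1\t201109" (name, tab, CCYYMM).
    // Returns the part before the first tab, or the whole key if there is no tab.
    static String userOf(String compositeKey) {
        int tab = compositeKey.indexOf('\t');
        return tab < 0 ? compositeKey : compositeKey.substring(0, tab);
    }

    // Same arithmetic HashPartitioner uses, but applied to the user name only,
    // so every month of one user is routed to the same reducer.
    static int partitionFor(String compositeKey, int numReducers) {
        return (userOf(compositeKey).hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Builds a file name in the name_CCYYMM form from the same composite key.
    static String outputName(String compositeKey) {
        return compositeKey.replace('\t', '_');
    }
}
```

Since the file name can be derived from the key like this, the reducer has what it needs to pick the name at run time; how you hand that name to the output format depends on which output class you end up using.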
Best,
stan
On Thu, Sep 1, 2011 at 3:47 PM, modemide modem...@gmail.com wrote:
Hi all,
I was wondering if anyone was familiar with this class. I want to
create multiple output files during my reduce.
My input files will consist of
name1 action1 date1
name1 action2 date2
name1 action3 date3
name2 action1 date1
name2 action2 date2
name2 action3 date3
My goal is to create files with the following format
Filename:
name_Date:CCYYMM
File Contents:
action1
action2
action3
I.e., this will store all the actions of one person for any given month
in one file.
I just don't know how to decide the file name at run time. Can anyone
help?
Thanks,
Tim