Hi.

 

I'm writing streaming-based tasks that involve running thousands of
mappers. After that, I want to collect all of their output into a small
number (say 30) of output files, mainly so that disk space is used more
efficiently. The way I'm doing it right now is to use /bin/cat as the
reducer and set the number of reducers to the desired file count. This
involves two steps that are highly inefficient for this task: sorting and
fetching. Is there a way to get around that?
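
For reference, here is a minimal sketch of the current setup, written as a small Python helper that assembles the streaming command line. The jar location, input/output paths, and mapper name are placeholders assumed for illustration; only /bin/cat as the reducer and the reducer count of 30 come from the description above.

```python
import shlex


def build_streaming_cmd(streaming_jar, input_path, output_path, mapper,
                        num_reducers=30):
    """Assemble argv for a streaming job that uses /bin/cat as the reducer.

    All arguments are hypothetical placeholders. The reducer only copies
    records through, but the framework still sorts and fetches the map
    output before handing it to /bin/cat.
    """
    return [
        "hadoop", "jar", streaming_jar,
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", "/bin/cat",                # pass-through reducer
        "-numReduceTasks", str(num_reducers),  # one output file per reducer
    ]


if __name__ == "__main__":
    # Placeholder values, not taken from the real job.
    cmd = build_streaming_cmd(
        "hadoop-streaming.jar", "/data/in", "/data/out", "my_mapper.py")
    # Print a shell-quoted command line instead of launching the job.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The number of part files in the output directory equals the number of reduce tasks, which is why the reducer count is set to the desired file count.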

Ideally I'd want all mapper outputs to be written to one file, one record
per line.

 

Thanks. 

 

---

Dmitry Pushkarev

+1-650-644-8988

 
