To generalize my question: in an M/R job that reads from and writes to the
same HBase table, is there any way to use an arbitrary number of reducers
when importing data into a table whose regions are pre-split and
explicitly specified?
Thanks in advance and excuse me for re-asking for help.
-------- Original Message --------
Subject: Re: Bulk load - #Reducers different from #Regions
Date: Tue, 07 Aug 2012 20:02:27 +0300
From: Ioakim Perros <imper...@gmail.com>
To: user@hbase.apache.org
Excuse me for not defining the problem well.
I am bulk updating my HBase table programmatically, using the
configureIncrementalLoad function of HFileOutputFormat. The respective
documentation says that this function "Sets the number of reduce tasks to
match the current number of regions", but I was wondering whether I could
avoid that behavior, perhaps through some other way of bulk importing data.
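For context, here is a minimal driver sketch of the setup being described. This is an assumption about the surrounding code, not the poster's actual job; the class and table names are made up, and it targets the older pre-1.0 HBase API (HTable, HFileOutputFormat) in use at the time of the thread. It shows where configureIncrementalLoad fixes the reducer count:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;

public class BulkLoadDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "iterative-bulk-load");
        job.setJarByClass(BulkLoadDriver.class);
        // The mapper emits (row key, Put) pairs for the reducers to sort.
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(Put.class);

        HTable table = new HTable(conf, "my_table"); // hypothetical table name
        job.setNumReduceTasks(16); // this setting does not survive the next call

        // configureIncrementalLoad() resets the reducer count to the table's
        // current number of regions and installs a TotalOrderPartitioner over
        // the region boundaries, so each reducer produces HFiles covering
        // exactly one region -- which is why setNumReduceTasks() appears
        // to be ignored.
        HFileOutputFormat.configureIncrementalLoad(job, table);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```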
PS: I insist on bulk importing because, as I understand it (I hope
correctly), it is much more efficient than going through the traditional
HBase API. Since my job is iterative in nature, bulk loading would
hopefully give a good speed-up on each iteration compared to the HBase API.
Thank you for responding.
On 08/07/2012 07:53 PM, Subir S wrote:
Do you mean bulk loading using ImportTsv with pre-split regions for the
target table? That is, do you mean setting the number of reducers that
ImportTsv must use?
On 8/7/12, Ioakim Perros<imper...@gmail.com> wrote:
Hi,
I am bulk importing (updating) data iteratively, and I would like to be
able to set the number of reducers for an M/R task to be different from
the number of regions of the table into which I am loading data.
I tried job.setNumReduceTasks(#reducers), but the job ignored it.
Is there a way to avoid an intermediary job and set the number of
reducers explicitly?
I would be grateful if anyone could shed light on this.
Thanks and regards,
Ioakim