On Jun 3, 2008, at 4:56 AM, smallufo wrote:
What if my data comes from a DB or from memory?
I should implement a DatabaseInputFormat that implements
InputFormat<int rowIndex, MyData value>, right?
Yes
But how do I implement getSplits() and getRecordReader()?
I looked at the sample source code for a long time, but I still
don't know how to "split" the data.
For most tables, I would choose key ranges for the splits. For
example, if your primary key was name, choose split points that
divide the table into roughly equal parts.
name < 'b' -> mapper 0
'b' <= name < 'c' -> mapper 1
or whatever makes sense for your data.
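
The key-range idea above can be sketched in plain Java. This is only an illustration under stated assumptions: KeyRangeSplits, KeyRange, fromSplitPoints, and mapperFor are hypothetical names, not part of Hadoop, and the split points 'b' and 'c' are simply the ones from the example above. In a real InputFormat, getSplits() would presumably return one InputSplit per range, and getRecordReader() would read only the rows whose key falls inside that split's range.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: turns sorted split points into key ranges.
class KeyRangeSplits {

    // One half-open key range [start, end); a null bound means "unbounded".
    static final class KeyRange {
        final String start; // inclusive, or null for "from the first key"
        final String end;   // exclusive, or null for "through the last key"

        KeyRange(String start, String end) {
            this.start = start;
            this.end = end;
        }

        // True when the key sorts at or after start and strictly before end.
        boolean contains(String key) {
            return (start == null || key.compareTo(start) >= 0)
                && (end == null || key.compareTo(end) < 0);
        }

        @Override
        public String toString() {
            return "[" + (start == null ? "-inf" : start)
                + ", " + (end == null ? "+inf" : end) + ")";
        }
    }

    // n sorted split points produce n + 1 ranges, one per mapper.
    static List<KeyRange> fromSplitPoints(List<String> sortedSplitPoints) {
        List<KeyRange> ranges = new ArrayList<KeyRange>();
        String previous = null;
        for (String point : sortedSplitPoints) {
            ranges.add(new KeyRange(previous, point));
            previous = point;
        }
        ranges.add(new KeyRange(previous, null)); // open-ended last range
        return ranges;
    }

    // Index of the range (and therefore the mapper) that owns the given key.
    static int mapperFor(List<KeyRange> ranges, String key) {
        for (int i = 0; i < ranges.size(); i++) {
            if (ranges.get(i).contains(key)) {
                return i;
            }
        }
        throw new IllegalStateException("ranges do not cover key: " + key);
    }

    public static void main(String[] args) {
        List<KeyRange> ranges = fromSplitPoints(Arrays.asList("b", "c"));
        for (int i = 0; i < ranges.size(); i++) {
            System.out.println("mapper " + i + " -> " + ranges.get(i));
        }
    }
}
```

The ranges are half-open so that every key belongs to exactly one mapper, which matches the "name < 'b'" and "'b' <= name < 'c'" boundaries above. For a database table, each range would translate into a range condition on the primary key.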
Is there any example code demonstrating data that comes from a DB or
from objects in memory?
Take a look at the hbase table splitter:
http://tinyurl.com/48s76f
-- Owen