Yes, I have looked at them.

I was looking for examples where I can write my own split function that
overrides the behaviour of DBInputSplit. By default, it fetches the rows from
the table and splits them, with each split holding a different number of rows.
This is what I want to control: I need the splits to have a fixed size, say 'n',
so that each map task can then work on its own split independently.
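
To make the requirement concrete, here is a rough sketch of the split
arithmetic I have in mind. It is plain Java with no Hadoop dependency, and
FixedSizeSplitter and computeSplits are made-up names for illustration, not
Hadoop classes. The assumption is that a custom input format extending
DBInputFormat could override getSplits() and turn each [start, end) range
into a DBInputSplit, but I have not verified that part.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Hadoop): fixed-size row ranges. */
class FixedSizeSplitter {

    /**
     * Returns [start, end) row ranges of rowsPerSplit rows each.
     * Only the last range may be shorter, when rowCount is not a
     * multiple of rowsPerSplit.
     */
    static List<long[]> computeSplits(long rowCount, long rowsPerSplit) {
        if (rowCount < 0 || rowsPerSplit <= 0) {
            throw new IllegalArgumentException(
                "rowCount must be >= 0 and rowsPerSplit must be > 0");
        }
        List<long[]> splits = new ArrayList<>();
        for (long start = 0; start < rowCount; start += rowsPerSplit) {
            // Clamp the final range to the table size.
            long end = Math.min(start + rowsPerSplit, rowCount);
            splits.add(new long[] {start, end});
        }
        return splits;
    }

    public static void main(String[] args) {
        // Example: 10 rows, n = 4 rows per map task.
        for (long[] range : computeSplits(10, 4)) {
            System.out.println(range[0] + " - " + range[1]);
        }
    }
}
```

With 10 rows and n = 4 this yields the ranges 0-4, 4-8 and 8-10, so the
number of map tasks follows from the row count instead of being configured
up front. The last split can still be short; whether that is acceptable
depends on the job.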

-Gaurav



Amandeep Khurana wrote:
> 
> You can find examples on how to use DBInputFormat on the internet. And if
> you want a sample input format, just read any of the existing ones...
> 
> 
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
> 
> 
> On Fri, Feb 12, 2010 at 10:39 PM, Gaurav Vashishth
> <[email protected]> wrote:
> 
>>
>> Ok, thanks for the reply. Do you have any sample code which can
>> demonstrate how to do this?
>>
>> -Gaurav
>>
>>
>> Amandeep Khurana wrote:
>> >
>> > DBInputFormat splits the count() from the RDBMS table into the number of
>> > mappers. If you want to split using your own scheme, you'll have to write
>> > your own input format or tweak the existing one.
>> >
>> >
>> > Amandeep Khurana
>> > Computer Science Graduate Student
>> > University of California, Santa Cruz
>> >
>> >
>> > On Fri, Feb 12, 2010 at 12:08 PM, Stack <[email protected]> wrote:
>> >
>> >> On Fri, Feb 12, 2010 at 4:32 AM, Gaurav Vashishth <[email protected]>
>> >> wrote:
>> >> >
>> >> > I have a MapReduce function whose job is to process a MySQL database
>> >> > and give us some output. For this purpose, I have created the
>> >> > MapReduce function and used DBInputFormat, but I'm confused about how
>> >> > the JobTracker will produce the splits here.
>> >> >
>> >> > I want the first 'n' records from the database to be processed by a
>> >> > single map task, and so on. If the JobTracker splits the records and
>> >> > gives a task fewer than 'n' records, it would be a problem.
>> >> >
>> >> > Is there any API for getting this done, or am I missing something?
>> >> >
>> >>
>> >> Maybe you have to write your own splitter?  One that makes sure each
>> >> task has N rows?  Is there a splitter that is part of DBInputFormat?
>> >> Can you look at how it works?  Maybe you can specify rows per task
>> >> just with a configuration?
>> >> St.Ack
>> >>
>> >
>> >
>>
>> --
>> View this message in context:
>> http://old.nabble.com/DBInputFormat-tp27562875p27572830.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: 
http://old.nabble.com/DBInputFormat-tp27562875p27573008.html
Sent from the HBase User mailing list archive at Nabble.com.
