Ninad,

I'm not sure why you posted in this thread, as it doesn't seem
related, but to answer your question: a region only splits when a
family reaches 256MB (the default hbase.hregion.max.filesize), so I
guess your small number of records wasn't enough. To force a split,
open the shell and type " split 'tablename' " with the name of your
table. Alternatively, alter your table and set MAX_FILESIZE to
something smaller. Finally, searching this mailing list will turn up
many other tips.
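For example, from the HBase shell (a sketch from memory; 'mytable'
and the 64MB figure are placeholders, and the exact alter syntax may
differ on your version, so check the shell's help first):

```
split 'mytable'                                  # force an immediate split
disable 'mytable'                                # take the table offline before altering
alter 'mytable', {MAX_FILESIZE => '67108864'}    # lower the split threshold to 64MB
enable 'mytable'
```

With a smaller MAX_FILESIZE the table splits into more regions as data
arrives, which gives MapReduce more input splits to run in parallel.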

J-D

On Sat, Apr 11, 2009 at 1:46 AM, Ninad Raut <[email protected]> wrote:
> Map Tasks not splitting data for distributed processing
>
> I have some 17,000 records to process, and in non-distributed mode (without
> mapred) I can process 1 record in 1 min. To reduce this time I moved to
> HDFS/HBase and MapReduce. But to my frustration, HBase has kept all the
> records in one region and runs a single map task. This nullifies all my
> effort and I am back where I started (1 record in 1 min). Forced
> splitting of records corrupts data. Can someone tell me a way out of this
> mess?
>
> On Fri, Apr 10, 2009 at 11:02 PM, Rakhi Khatwani <[email protected]> wrote:
>
>>
>>
>> ---------- Forwarded message ----------
>> From: Vaibhav Puranik <[email protected]>
>> Date: Fri, Apr 10, 2009 at 7:43 PM
>> Subject: Re: Region Servers going down frequently
>> To: [email protected]
>>
>>
>> Rakhi,
>>
>> Erick Holstad has written an Import Export map reduce job. This job is
>> compatible with 0.19.1.
>>
>> I have tried it myself and it works fine.
>> Here is the code -
>> https://issues.apache.org/jira/browse/HBASE-974
>>
>> Regards,
>> Vaibhav
>>
>>
>> On Thu, Apr 9, 2009 at 3:10 AM, Rakhi Khatwani <[email protected]> wrote:
>>
>> > Hi,
>> > The backup tool is written for an older version, not for 0.19. I tried
>> > changing the code, taking a backup and restoring it, but the
>> > restore fails: no data is restored, though the MapReduce job runs without
>> > errors.
>> >
>> > Any help?
>> >
>> > On Thu, Apr 9, 2009 at 12:57 PM, Amandeep Khurana <[email protected]> wrote:
>> >
>> > > When you specify the column foo:, it picks up all the columns under the
>> > > family foo:. You don't have to list individual column names.
>> > >
>> > >
>> > > Amandeep Khurana
>> > > Computer Science Graduate Student
>> > > University of California, Santa Cruz
>> > >
>> > >
>> > > On Thu, Apr 9, 2009 at 12:25 AM, Rakhi Khatwani <[email protected]> wrote:
>> > >
>> > > > Thanks Amandeep.
>> > > >
>> > > > the usage of the code is
>> > > >
>> > > > "bin/hadoop com.mahalo.hadoop.hbase.Exporter -output mybackup
>> > > > -table test -columns foo:"
>> > > >
>> > > > but my columns are like, for example,
>> > > > URL:http://www.yahoo.com/ (column name) = 3 (some int value), and there
>> > > > are thousands of rows.
>> > > > It's not feasible to list them all from the command prompt. Is there
>> > > > another way?
>> > > >
>> > > >
>> > > >
>> > > > On Thu, Apr 9, 2009 at 12:33 PM, Amandeep Khurana <[email protected]> wrote:
>> > > >
>> > > > > You can use this...
>> https://issues.apache.org/jira/browse/HBASE-897
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > Amandeep Khurana
>> > > > > Computer Science Graduate Student
>> > > > > University of California, Santa Cruz
>> > > > >
>> > > > >
>> > > > > On Wed, Apr 8, 2009 at 11:59 PM, Rakhi Khatwani <[email protected]> wrote:
>> > > > >
>> > > > > > Hi Andy,
>> > > > > >
>> > > > > > I want to back up my HBase and move to a more powerful machine. I am
>> > > > > > trying distcp, but it does not back up the hbase folder properly. When
>> > > > > > I try restoring the hbase folder I don't get all the records; some
>> > > > > > tables come back blank. What could be the reason?
>> > > > > >
>> > > > > > On Wed, Apr 8, 2009 at 11:23 PM, Andrew Purtell <[email protected]> wrote:
>> > > > > >
>> > > > > > >
>> > > > > > > I updated the Troubleshooting page on the wiki with a section
>> > > > > > > about EC2. Please feel free to extend/enhance/revise.
>> > > > > > >
>> > > > > > >   - Andy
>> > > > > > >
>> > > > > > >
