The import was run as an M/R job on the same configuration as the export (15 
nodes, 5 tasks per node). Each node has 8 cores and 23GB of total RAM (6GB for 
the hbase RS). As far as I could tell, everything was running pretty balanced 
and hbase was the bottleneck due to all of the compaction.

Actually, an hbase export to "bulk load" facility sounds like a great idea. We 
have been using bulk loads to migrate data from an older data store and they 
have worked awesome for us. It also doesn't seem like it would be that hard to 
implement. So what am I missing?

Paul

-----Original Message-----
From: saint....@gmail.com [mailto:saint....@gmail.com] On Behalf Of Stack
Sent: Monday, February 20, 2012 4:29 PM
To: user@hbase.apache.org
Subject: Re: export/import for backup

On Mon, Feb 20, 2012 at 1:20 PM, Paul Mackles <pmack...@adobe.com> wrote:
> We are on hbase 0.90.4 (cdh3u2). We are using the standard hbase export/import 
> for backups. In a test run, our imports ran extremely slowly. While a full 
> export of our dataset took about an hour, the corresponding import took 20+ 
> hours (for 216 regions across 15 servers). While it finished, I am a little 
> uncomfortable with that sort of recovery time should disaster strike. Are 
> there any recommendations for speeding up imports in a recovery scenario? One 
> thing I noticed while watching the region-server logs was that there were a 
> lot of compactions happening during the import (both major and minor). Should 
> we disable compactions while the import is running and then do it all at the 
> end? We have our region-size set to 100GB right now so we can manage 
> splitting. Thanks in advance for any recommendations.
>

Can you tell where it was spending the time, Paul?  Upping the config so
there is less flushing sounds like it might be a good way to go.  You
might want to do stuff like setting large flush sizes when importing so
flushes are larger.  How did you import?  A MR job?  It was running full
on?  HBase was what was keeping it slow?
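
To put rough numbers on the flush-size suggestion, here is a back-of-envelope
sketch. It assumes each memstore flush writes one store file of roughly the
configured flush size (hbase.hregion.memstore.flush.size), so fewer, larger
flushes leave fewer files for compaction to merge. The 10 GB per region figure
is hypothetical (the thread does not give the dataset size), and the model
ignores global memstore pressure, which can force smaller flushes on a 6GB
region server heap.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3


def flushes_needed(region_bytes: int, flush_bytes: int) -> int:
    """Approximate number of memstore flushes (one store file each)
    needed to absorb region_bytes of imported data."""
    return math.ceil(region_bytes / flush_bytes)


# Hypothetical amount of data imported into a single region.
region_bytes = 10 * GB

# 64 MB is assumed here as the stock flush size; the others are candidates.
for flush_mb in (64, 256, 512):
    n = flushes_needed(region_bytes, flush_mb * MB)
    print(f"flush size {flush_mb:>3} MB -> ~{n} store files written per region")
```

Under these assumptions, going from 64 MB to 256 MB flushes cuts the number of
store files written per region by a factor of four, which should reduce how
often minor compactions are triggered during the import.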

Anyone played with going from an export to a bulk load?  I wonder if
this would run faster?
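
A sketch of what an export-to-bulk-load restore could look like. This assumes
an HBase release whose Import tool accepts the import.bulk.output option
(added after 0.90.x, so not available on the version discussed here), and it
uses LoadIncrementalHFiles for the final step. The table name and HDFS paths
are hypothetical, and the target table is assumed to already exist. The code
only builds and prints the command lines; it does not run them.

```python
from typing import List

IMPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Import"
BULK_LOAD_CLASS = "org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles"


def import_to_hfiles_cmd(table: str, export_dir: str, hfile_dir: str) -> List[str]:
    """Run Import so that it writes HFiles to hfile_dir instead of
    sending Puts through the region servers."""
    return [
        "hbase", IMPORT_CLASS,
        f"-Dimport.bulk.output={hfile_dir}",
        table, export_dir,
    ]


def bulk_load_cmd(table: str, hfile_dir: str) -> List[str]:
    """Hand the generated HFiles over to the regions of the table."""
    return ["hbase", BULK_LOAD_CLASS, hfile_dir, table]


# Hypothetical table name and HDFS paths, for illustration only.
steps = [
    import_to_hfiles_cmd("mytable", "/backup/mytable-export", "/tmp/mytable-hfiles"),
    bulk_load_cmd("mytable", "/tmp/mytable-hfiles"),
]
for argv in steps:
    print(" ".join(argv))
```

Because the HFiles are written by the M/R job and then moved into place, this
path bypasses the memstore entirely, so it should avoid the flush and
compaction load seen with a Put-based import.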

St.Ack
