Re: Online snapshots are very slow?

Vladimir Rodionov Tue, 03 Mar 2015 09:39:35 -0800

Matteo,

For a large cluster/table, this one:


 - the master will aggregate the result and verify the integrity

looks like a real bottleneck.

Any other hidden serialized parts of the implementation?

-Vlad


On Tue, Mar 3, 2015 at 9:25 AM, Matteo Bertozzi <[email protected]>
wrote:

> the high-level overview of a snapshot is:
>  - the client asks the master to take a snapshot
>  - the master looks up the RSs that are hosting the regions for the
> specified table
>  - the master creates a znode to notify the RSs to take a snapshot
>  - each RS involved will get notified and take the snapshot, which is a
> flush + writing a manifest
>  - each RS involved will respond to the master
>  - the master will aggregate the result and verify the integrity
>  - snapshot complete
>
> So the time required to take a snapshot is bounded by the slowest region
> to flush/respond.
> You can try with SKIP_FLUSH = true.
> Also, if you grep for "Snapshot" in the master log, you can see what is
> taking so long.
>
> Matteo
>
>
> On Tue, Mar 3, 2015 at 5:18 PM, Vladimir Rodionov <[email protected]>
> wrote:
>
> > Some discussions:
> > http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/43616
> >
> > Any ideas why? It should not take tens of seconds (unless we flush
> > several GBs per server).
> > I got info from my coworker that it is indeed slow (20+ sec on an almost
> > empty table).
> >
> > I have not started testing myself yet, but before I start digging into
> > it I would like to collect opinions from HBase folks.
> >
> > -Vlad
> >
>
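
To make the "bounded by the slowest region" point in the quoted overview concrete, here is a minimal back-of-the-envelope model in Python. It is not HBase code: the class name, function name, and all timings are made up for illustration, and treating the master's aggregate/verify step as a single serial cost is an assumption taken from the question above, not something confirmed in the thread.

```python
from dataclasses import dataclass


@dataclass
class RegionServerTask:
    """Hypothetical per-RS cost of taking part in one snapshot."""
    name: str
    flush_seconds: float      # time to flush the memstore (illustrative)
    manifest_seconds: float   # time to write the manifest (illustrative)


def snapshot_duration(tasks, verify_seconds, skip_flush=False):
    """Estimate wall-clock snapshot time under a simple model.

    Region servers are assumed to work in parallel, so the distributed
    part costs as much as the slowest RS. The master's aggregate/verify
    step is assumed to run serially afterwards.
    """
    per_rs = [
        (0.0 if skip_flush else t.flush_seconds) + t.manifest_seconds
        for t in tasks
    ]
    slowest = max(per_rs, default=0.0)
    return slowest + verify_seconds


if __name__ == "__main__":
    # Made-up numbers: one RS has a large memstore to flush.
    tasks = [
        RegionServerTask("rs1", flush_seconds=2.0, manifest_seconds=0.5),
        RegionServerTask("rs2", flush_seconds=12.0, manifest_seconds=0.5),
        RegionServerTask("rs3", flush_seconds=0.25, manifest_seconds=0.75),
    ]
    print("with flush:", snapshot_duration(tasks, verify_seconds=1.0))
    print("skip flush:", snapshot_duration(tasks, verify_seconds=1.0,
                                           skip_flush=True))
```

In this model, skipping the flush removes the dominant term only when a flush is actually the slow part. If an almost empty table still takes 20+ seconds, the model suggests looking at the remaining terms (notification, manifest writing, master-side verification), which is what grepping the master log should reveal.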
