It swallows exceptions by design: the idea is that you first run a diff, then use the output of that diff as the input to the Backup job. So if an HFile fails to copy, the job will still finish, copying most of the HFiles. When you run the diff again, it will see that an HFile was missed, and the next run will copy only that file and any other new files.
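Roughly, one iteration of that loop looks like the following. This is only a sketch: the exact lsr_diff.py and Backup arguments are approximations (see the sources linked below for the real usage), and the hosts, paths, and jar name are placeholders.

    # One pass of the diff-then-copy loop (sketch only; lsr_diff.py and
    # Backup arguments are approximations, hosts/paths/jar are placeholders)

    # Recursive listing of the HBase root dir, taken on each cluster
    hdfs dfs -ls -R /hbase > /tmp/source_lsr.txt   # run against the source cluster
    hdfs dfs -ls -R /hbase > /tmp/target_lsr.txt   # run against the target cluster

    # Diff the listings (compares file name and length) to find what the
    # target is still missing
    python lsr_diff.py /tmp/source_lsr.txt /tmp/target_lsr.txt

    # Copy the missing HFiles with the Backup MR job, then repeat the loop
    # until the remaining diff is small
    hadoop jar akela.jar com.mozilla.hadoop.Backup \
        hdfs://source-nn:8020/hbase hdfs://target-nn:8020/hbase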
You run the diff before each job run. The diff basically amounts to running `hdfs dfs -ls -R /hbase` on each cluster and passing the output of those commands into https://github.com/mozilla-metrics/akela/blob/master/src/main/python/lsr_diff.py . The way we've used this job (modified internally, like I said, but very much the same concept) is:

1) Run the job; it usually takes a while.
2) Run the job again. This takes much less time because most of the files were already copied and we're just copying the new or failed files since the last job began.
3) Depending on how long the job is taking, i.e. how fast the data ingestion is, run it as many times as needed to get the window small.
4) Stop the source cluster.
5) Run the job one more time.
6) If there was an exception, run it again. I've had to do this maybe once.
7) Start the target cluster, and bounce applications so they connect to the new cluster.

The idea is that you can run the job over and over to get closer to being in sync. This is how we handled errors too: in the times I've used it (10+ on multiple production clusters), I've never seen an exception that couldn't be resolved by running it again. So in the past we've been able to work with the leniency of a short (<10 min) downtime window.

There are some challenges with doing this as an online migration with replication, but it may still be doable with some thought around the diffing. For example, compactions will never stop in a running source cluster, so currently we might have to keep running the Backup job over and over, never reaching parity. So maybe the diff function should be changed to inspect the actual HFiles instead of just comparing file name and length.

On Fri, Aug 15, 2014 at 1:47 PM, Ted Yu <[email protected]> wrote:

> Bryan:
> From javadoc of Backup.java:
> bq. it favors swallowing exceptions and incrementing counters as opposed to
> failing
>
> Can you share some experience how you handled the errors reported by Backup?
>
> Thanks
>
>
> On Fri, Aug 15, 2014 at 10:38 AM, Bryan Beaudreault
> <[email protected]> wrote:
>
> > I agree it would be nice if this was provided by HBase, but it's already
> > possible to work straight with the HFiles. All you need is a custom hadoop
> > job. A good starting point is
> > https://github.com/mozilla-metrics/akela/blob/master/src/main/java/com/mozilla/hadoop/Backup.java
> > and modify it to your needs. We've used our own modification of this job
> > many times when we do our own cluster migrations. The idea is that it is
> > incremental, so as HFiles get compacted, deleted, etc, you can just run it
> > again and move smaller and smaller amounts of data.
> >
> > Working at the hdfs level should be faster, as you can use more mappers.
> > You will still be taxing the IO of the source cluster, but not adding load
> > to the actual regionserver processes (ipc queue, memory, etc).
> >
> > If you upgrade to CDH5 (or the equivalent hdfs version), you can use hdfs
> > snapshots to minimize the need to re-run the above Backup job (since you
> > are already using replication to keep data up-to-date).
> >
> >
> > On Fri, Aug 15, 2014 at 1:11 PM, Esteban Gutierrez <[email protected]>
> > wrote:
> >
> > > 1.8TB in a day is not terrible slow if that number comes from the
> > > CopyTable counters and you are moving data across data centers using
> > > public networks, that should be about 20MB/sec. Also, CopyTable won't
> > > compress anything on the wire so the network overhead should be a lot.
> > > If you use anything like snappy for block compression and/or fast_diff
> > > for block encoding the HFiles, then using snapshots and export them
> > > using the ExportSnapshot tool should be the way to go.
> > >
> > > cheers,
> > > esteban.
> > >
> > > --
> > > Cloudera, Inc.
> > >
> > >
> > > On Thu, Aug 14, 2014 at 11:24 PM, tobe <[email protected]> wrote:
> > >
> > > > Thank @lars.
> > > >
> > > > We're using HBase 0.94.11 and follow the instruction to run `./bin/hbase
> > > > org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=hbase://cluster_name
> > > > table_name`. We have namespace service to find the ZooKeeper with
> > > > "hbase://cluster_name". And the job ran on a shared yarn cluster.
> > > >
> > > > The performance is affected by many factors, but we haven't found out
> > > > the reason. It would be great to see your suggestions.
> > > >
> > > >
> > > > On Fri, Aug 15, 2014 at 1:34 PM, lars hofhansl <[email protected]> wrote:
> > > >
> > > > > What version of HBase? How are you running CopyTable? A day for 1.8T
> > > > > is not what we would expect.
> > > > > You can definitely take a snapshot and then export the snapshot to
> > > > > another cluster, which will move the actual files; but CopyTable
> > > > > should not be so slow.
> > > > >
> > > > > -- Lars
> > > > >
> > > > >
> > > > > ________________________________
> > > > > From: tobe <[email protected]>
> > > > > To: "[email protected]" <[email protected]>
> > > > > Cc: [email protected]
> > > > > Sent: Thursday, August 14, 2014 8:18 PM
> > > > > Subject: A better way to migrate the whole cluster?
> > > > >
> > > > >
> > > > > Sometimes our users want to upgrade their servers or move to a new
> > > > > datacenter, then we have to migrate the data from HBase. Currently we
> > > > > enable the replication from the old cluster to the new cluster, and
> > > > > run CopyTable to move the older data.
> > > > >
> > > > > It's a little inefficient. It takes more than one day to migrate 1.8T
> > > > > data and more time to verify. Can we have a better way to do that,
> > > > > like snapshot or purely HDFS files?
> > > > >
> > > > > And what's the best practise or your valuable experience?
