bq. HMerge can merge multiple regions by going over the list of regions and checking their sizes.
bq. But both of these tools (Merge and HMerge) are very dangerous
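
For reference, here is a rough sketch of how I read the size check described
in the thread: a merged region qualifies if it is less than or equal to half
the configured max file size. This is not the HMerge source. The class name
MergeSketch, the method plan(), and the greedy left-to-right folding are my
own assumptions for illustration; only the "half of max file size" threshold
comes from Lars's first email.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a hypothetical model of the size check, not HMerge itself. */
final class MergeSketch {

    /**
     * Walks region sizes in key order and folds each region into its
     * predecessor while the combined size stays at or below maxFileSize / 2.
     * Returns the sizes of the regions that would remain afterwards.
     */
    static List<Long> plan(List<Long> regionSizes, long maxFileSize) {
        long limit = maxFileSize / 2; // threshold taken from the thread
        List<Long> result = new ArrayList<>();
        for (long size : regionSizes) {
            int last = result.size() - 1;
            if (last >= 0 && result.get(last) + size <= limit) {
                // Adjacent regions qualify: model the merge by summing sizes.
                result.set(last, result.get(last) + size);
            } else {
                // Too large to merge with the previous region: keep it separate.
                result.add(size);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sizes in arbitrary units; 128 stands in for the configured max file size.
        System.out.println(plan(Arrays.asList(10L, 20L, 100L, 5L, 5L), 128L));
    }
}
```

With a limit of 64 (half of 128), the first two regions fold together, the
100-unit region stays alone, and the two trailing 5-unit regions fold together.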

I came across HMerge and it looks like dead code. It isn't referenced from
anywhere except one test (which is what Lars also pointed out in the first
email).
It would make perfect sense if it were a tool or were referenced from
somewhere, but since it is neither, I am a bit confused here.
@Enis, you seem to know everything about them, please educate me.
Thanks
- Appy



On Thu, Sep 29, 2016 at 12:43 AM, Enis Söztutar <enis....@gmail.com> wrote:

> Merge has very limited usability since it can do only a single merge and can
> only run when HBase is offline.
> HMerge can merge multiple regions by going over the list of regions and
> checking their sizes.
> And of course we have the "supported" online merge which is the shell
> command.
>
> But both of these tools (Merge and HMerge) are very dangerous I think. I
> would say we should deprecate both to be replaced by the online merger
> tool. We should not allow offline merge at all. I fail to see a use case
> that requires an offline merge.
>
> Enis
>
> On Wed, Sep 28, 2016 at 7:32 AM, Lars George <lars.geo...@gmail.com>
> wrote:
>
> > Hey,
> >
> > Sorry to resurrect this old thread, but working on the book update, I
> > came across the same today, i.e. we have Merge and HMerge. I tried and
> > Merge works fine now. It is also the only one of the two flagged as
> > being a tool. Should HMerge be removed? At least deprecated?
> >
> > Cheers,
> > Lars
> >
> >
> > On Thu, Jul 7, 2011 at 2:03 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > >>> there is already an issue to do this but not revamp of these Merge
> > > classes
> > > I guess the issue is HBASE-1621
> > >
> > > On Wed, Jul 6, 2011 at 2:28 PM, Stack <st...@duboce.net> wrote:
> > >
> > >> Yeah, can you file an issue Lars.  This stuff is ancient and needs to
> > >> be redone AND redone so we can do merging while table is online (there
> > >> is already an issue to do this but not revamp of these Merge classes).
> > >>  The unit tests for Merge are also all junit3 and do whacky stuff to
> > >> put up multiple regions.  This should be redone too (they are often
> > >> first thing broke when major change and putting them back together is
> > >> a headache since they do not follow the usual pattern).
> > >>
> > >> St.Ack
> > >>
> > >> On Sun, Jul 3, 2011 at 12:38 AM, Lars George <lars.geo...@gmail.com>
> > >> wrote:
> > >> > Hi Ted,
> > >> >
> > >> > The log is from an earlier attempt; I tried this a few times. This is
> > >> > all local, after rm'ing the /hbase. So the files are all pretty empty,
> > >> > but since I put data in I was assuming it should work. Once you have
> > >> > gotten into this state, you also get funny error messages in the shell:
> > >> >
> > >> > hbase(main):001:0> list
> > >> > TABLE
> > >> > 11/07/03 09:36:21 INFO ipc.HBaseRPC: Using org.apache.hadoop.hbase.ipc.WritableRpcEngine for org.apache.hadoop.hbase.ipc.HMasterInterface
> > >> >
> > >> > ERROR: undefined method `map' for nil:NilClass
> > >> >
> > >> > Here is some help for this command:
> > >> > List all tables in hbase. Optional regular expression parameter could
> > >> > be used to filter the output. Examples:
> > >> >
> > >> >  hbase> list
> > >> >  hbase> list 'abc.*'
> > >> >
> > >> >
> > >> > hbase(main):002:0>
> > >> >
> > >> > I am assuming this is collateral, but why? The UI works but the table is
> > >> > gone too.
> > >> >
> > >> > Lars
> > >> >
> > >> > On Jul 2, 2011, at 10:55 PM, Ted Yu wrote:
> > >> >
> > >> >> There is TestMergeTool which tests Merge.
> > >> >>
> > >> >> From the log you provided, I got a little confused as to why
> > >> >> 'testtable,row-20,1309613053987.23a35ac696bdf4a8023dcc4c5b8419e0.' didn't
> > >> >> appear in your command line or the output from .META. scanning.
> > >> >>
> > >> >> On Sat, Jul 2, 2011 at 10:36 AM, Lars George <lars.geo...@gmail.com> wrote:
> > >> >>
> > >> >>> Hi,
> > >> >>>
> > >> >>> These two both seem to be in a bit of a weird state: HMerge is scoped package
> > >> >>> local, therefore no one but the package can call the merge() functions...
> > >> >>> and no one does that but the unit test. But it would be good to have this on
> > >> >>> the CLI and shell as a command (and in the shell maybe with a confirmation
> > >> >>> message?), but it is not available AFAIK.
> > >> >>>
> > >> >>> HMerge can merge regions of tables that are disabled. It also merges all
> > >> >>> that qualify, i.e. where the merged region is less than or equal to half the
> > >> >>> configured max file size.
> > >> >>>
> > >> >>> Merge on the other hand does have a main(), so can be invoked:
> > >> >>>
> > >> >>> $ hbase org.apache.hadoop.hbase.util.Merge
> > >> >>> Usage: bin/hbase merge <table-name> <region-1> <region-2>
> > >> >>>
> > >> >>> Note how the help insinuates that you can use it as a tool, but that is not
> > >> >>> correct. Also, it only merges two given regions, and the cluster must be
> > >> >>> shut down (only the HBase daemons). So that is a step back.
> > >> >>>
> > >> >>> What is worse is that I cannot get it to work. I tried in the shell:
> > >> >>>
> > >> >>> hbase(main):001:0> create 'testtable', 'colfam1', {SPLITS => ['row-10','row-20','row-30','row-40','row-50']}
> > >> >>> 0 row(s) in 0.2640 seconds
> > >> >>>
> > >> >>> hbase(main):002:0> for i in '0'..'9' do for j in '0'..'9' do put 'testtable', "row-#{i}#{j}", "colfam1:#{j}", "#{j}" end end
> > >> >>> 0 row(s) in 1.0450 seconds
> > >> >>>
> > >> >>> hbase(main):003:0> flush 'testtable'
> > >> >>> 0 row(s) in 0.2000 seconds
> > >> >>>
> > >> >>> hbase(main):004:0> scan '.META.', { COLUMNS => ['info:regioninfo']}
> > >> >>> ROW                                  COLUMN+CELL
> > >> >>> testtable,,1309614509037.612d1e0112 column=info:regioninfo, timestamp=130...
> > >> >>> 406e6c2bb482eeaec57322.             STARTKEY => '', ENDKEY => 'row-10'
> > >> >>> testtable,row-10,1309614509040.2fba column=info:regioninfo, timestamp=130...
> > >> >>> fcc9bc6afac94c465ce5dcabc5d1.       STARTKEY => 'row-10', ENDKEY => 'row-20'
> > >> >>> testtable,row-20,1309614509041.e7c1 column=info:regioninfo, timestamp=130...
> > >> >>> 6267eb30e147e5d988c63d40f982.       STARTKEY => 'row-20', ENDKEY => 'row-30'
> > >> >>> testtable,row-30,1309614509041.a9cd column=info:regioninfo, timestamp=130...
> > >> >>> e1cbc7d1a21b1aca2ac7fda30ad8.       STARTKEY => 'row-30', ENDKEY => 'row-40'
> > >> >>> testtable,row-40,1309614509041.d458 column=info:regioninfo, timestamp=130...
> > >> >>> 236feae097efcf33477e7acc51d4.       STARTKEY => 'row-40', ENDKEY => 'row-50'
> > >> >>> testtable,row-50,1309614509041.74a5 column=info:regioninfo, timestamp=130...
> > >> >>> 7dc7e3e9602d9229b15d4c0357d1.       STARTKEY => 'row-50', ENDKEY => ''
> > >> >>> 6 row(s) in 0.0440 seconds
> > >> >>>
> > >> >>> hbase(main):005:0> exit
> > >> >>>
> > >> >>> $ ./bin/stop-hbase.sh
> > >> >>>
> > >> >>> $ hbase org.apache.hadoop.hbase.util.Merge testtable \
> > >> >>> testtable,row-20,1309614509041.e7c16267eb30e147e5d988c63d40f982. \
> > >> >>> testtable,row-30,1309614509041.a9cde1cbc7d1a21b1aca2ac7fda30ad8.
> > >> >>> But I consistently get errors:
> > >> >>>
> > >> >>> 11/07/02 07:20:49 INFO util.Merge: Merging regions testtable,row-20,1309613053987.23a35ac696bdf4a8023dcc4c5b8419e0. and testtable,row-30,1309613053987.3664920956c30ac5ff2a7726e4e6 in table testtable
> > >> >>> 11/07/02 07:20:49 INFO wal.HLog: HLog configuration: blocksize=32 MB, rollsize=30.4 MB, enabled=true, optionallogflushinternal=1000ms
> > >> >>> 11/07/02 07:20:49 INFO wal.HLog: New hlog /Volumes/Macintosh-HD/Users/larsgeorge/.logs_1309616449171/hlog.1309616449181
> > >> >>> 11/07/02 07:20:49 INFO wal.HLog: getNumCurrentReplicas--HDFS-826 not available; hdfs_out=org.apache.hadoop.fs.FSDataOutputStream@25961581, exception=org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.getNumCurrentReplicas()
> > >> >>> 11/07/02 07:20:49 INFO regionserver.HRegion: Setting up tabledescriptor config now ...
> > >> >>> 11/07/02 07:20:49 INFO regionserver.HRegion: Onlined -ROOT-,,0.70236052; next sequenceid=1
> > >> >>> info: null
> > >> >>> region1: [B@48fd918a
> > >> >>> region2: [B@7f5e2075
> > >> >>> 11/07/02 07:20:49 FATAL util.Merge: Merge failed
> > >> >>> java.io.IOException: Could not find meta region for testtable,row-20,1309613053987.23a35ac696bdf4a8023dcc4c5b8419e0.
> > >> >>>       at org.apache.hadoop.hbase.util.Merge.mergeTwoRegions(Merge.java:211)
> > >> >>>       at org.apache.hadoop.hbase.util.Merge.run(Merge.java:111)
> > >> >>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >> >>>       at org.apache.hadoop.hbase.util.Merge.main(Merge.java:386)
> > >> >>> 11/07/02 07:20:49 INFO regionserver.HRegion: Setting up tabledescriptor config now ...
> > >> >>> 11/07/02 07:20:49 INFO regionserver.HRegion: Onlined .META.,,1.1028785192; next sequenceid=1
> > >> >>> 11/07/02 07:20:49 INFO regionserver.HRegion: Closed -ROOT-,,0.70236052
> > >> >>> 11/07/02 07:20:49 INFO wal.HLog: main.logSyncer exiting
> > >> >>> 11/07/02 07:20:49 ERROR util.Merge: exiting due to error
> > >> >>> java.lang.NullPointerException
> > >> >>>       at org.apache.hadoop.hbase.util.Merge$1.processRow(Merge.java:119)
> > >> >>>       at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:229)
> > >> >>>       at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:258)
> > >> >>>       at org.apache.hadoop.hbase.util.Merge.run(Merge.java:116)
> > >> >>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >> >>>       at org.apache.hadoop.hbase.util.Merge.main(Merge.java:386)
> > >> >>>
> > >> >>> After this, most of the time I have shot .META., with an error:
> > >> >>>
> > >> >>> 2011-07-02 06:42:10,763 WARN org.apache.hadoop.hbase.master.HMaster: Failed getting all descriptors
> > >> >>> java.io.FileNotFoundException: No status for hdfs://localhost:8020/hbase/.corrupt
> > >> >>>       at org.apache.hadoop.hbase.util.FSUtils.getTableInfoModtime(FSUtils.java:888)
> > >> >>>       at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:122)
> > >> >>>       at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:149)
> > >> >>>       at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1429)
> > >> >>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >> >>>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >> >>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >> >>>       at java.lang.reflect.Method.invoke(Method.java:597)
> > >> >>>       at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:312)
> > >> >>>       at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1065)
> > >> >>>
> > >> >>> Lars
> > >> >
> > >> >
> > >>
> >
>



-- 

-- Appy
