On Sat, Jan 14, 2017 at 9:50 PM, Lars George <lars.geo...@gmail.com> wrote:

> I think that makes sense. The tool with its custom code dates back to when
> we had no built-in version. I am all for removing all of the tools and
> leaving only the API call. For an admin that is then the same as calling
> flush or split.
>
> No?
>
>
Sounds good to me.
St.Ack



> Lars
>
> Sent from my iPhone
>
> On 15 Jan 2017, at 04:25, Stephen Jiang <syuanjiang...@gmail.com> wrote:
>
> >> If you remove the util.Merge tool, how then does an operator ask for a
> >> merge in its absence?
> >
> > We have a shell command to merge regions. In the past, it called the same
> > RS-side code. I don't think there is a need to keep util.Merge (even if we
> > really wanted to, we could have this utility call HBaseAdmin.mergeRegions,
> > which is the same path the merge command takes through 'hbase shell').
> >
> > Thanks
> > Stephen
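For reference, the online merge path mentioned above can be driven from the
Java client as well as from the shell. Below is a minimal sketch, not the
project's own example: it assumes the HBase 1.x client API (HBaseAdmin
implements the Admin interface used here), the class name is made up, and the
encoded region names are just the ones from the scan output later in this
thread.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class OnlineMergeSketch {
  public static void main(String[] args) throws Exception {
    // Uses the hbase-site.xml found on the classpath.
    try (Connection connection =
             ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = connection.getAdmin()) {
      // Shell equivalent: merge_region 'ENCODED_REGION_A', 'ENCODED_REGION_B'
      byte[] regionA = Bytes.toBytes("e7c16267eb30e147e5d988c63d40f982");
      byte[] regionB = Bytes.toBytes("a9cde1cbc7d1a21b1aca2ac7fda30ad8");
      // forcible=false: only adjacent regions may be merged.
      admin.mergeRegions(regionA, regionB, false);
    }
  }
}

Both the shell command and the Admin call go through the same server-side
merge path, which is the point being made: the standalone util.Merge tool is
not needed for an operator to request a merge.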
> >
> >> On Fri, Jan 13, 2017 at 11:29 PM, Stack <st...@duboce.net> wrote:
> >>
> >> On Fri, Jan 13, 2017 at 7:16 PM, Stephen Jiang <syuanjiang...@gmail.com
> >
> >> wrote:
> >>
> >>> Reviving this thread.
> >>>
> >>> I am in the process of removing the Region Server-side merge (and split)
> >>> transaction code in the master branch, as we now have merge (and split)
> >>> procedures on the master doing the same thing.
> >>>
> >>>
> >> Good (Issue?)
> >>
> >>
> >>> The Merge tool depends on the RS-side merge code. I'd like to use this
> >>> chance to remove the util.Merge tool. This is for 2.0 and up releases
> >>> only. Deprecation does not work here, as keeping the RS-side merge code
> >>> would duplicate logic in the source and make the new AssignmentManager
> >>> code more complicated.
> >>>
> >>>
> >> Could util.Merge be changed to ask the Master to run the merge (via AMv2)?
> >>
> >> If you remove the util.Merge tool, how then does an operator ask for a
> >> merge in its absence?
> >>
> >> Thanks Stephen
> >>
> >> S
> >>
> >>
> >>> Please let me know whether you have any objections.
> >>>
> >>> Thanks
> >>> Stephen
> >>>
> >>> PS. I could deprecate the HMerge code if anyone is really using it. It
> >>> has its own logic and is standalone (it is supposed to work offline, which
> >>> is dangerous, and can merge more than 2 regions - util.Merge and the shell
> >>> do not support that functionality for now).
> >>>
> >>> On Wed, Nov 16, 2016 at 11:04 AM, Enis Söztutar <enis....@gmail.com>
> >>> wrote:
> >>>
> >>>> @Appy what is not clear from above?
> >>>>
> >>>> I think we should get rid of both Merge and HMerge.
> >>>>
> >>>> We should not have any tool that works in offline mode by going over the
> >>>> HDFS data. It seems very brittle and likely to break when things change.
> >>>> The only use case I can think of is that you somehow end up with a lot of
> >>>> regions and cannot bring the cluster back up because of OOMs, etc., and
> >>>> you have to reduce the number of regions in offline mode. However, we
> >>>> have not seen this kind of thing with any of our customers over the last
> >>>> couple of years.
> >>>>
> >>>> I think we should seriously look into improving the normalizer and
> >>>> enabling it by default for all tables. Ideally, the normalizer should run
> >>>> much more frequently and be configured with higher-level goals and
> >>>> heuristics, like the average number of regions per node, and it should
> >>>> look at the global state (like the balancer does) to decide on split /
> >>>> merge points.
> >>>>
> >>>> Enis
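As a concrete illustration of where the normalizer already hooks in today,
here is a minimal sketch assuming the 1.2+ Admin API and shell; the class name
is made up, 'testtable' is just a placeholder, and the higher-level goals Enis
describes would sit behind these same switches.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class NormalizerSketch {
  public static void main(String[] args) throws Exception {
    try (Connection connection =
             ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = connection.getAdmin()) {
      // Turn the normalizer chore on cluster-wide (shell: normalizer_switch true).
      admin.setNormalizerRunning(true);

      // Opt one table in to normalization
      // (shell: alter 'testtable', {NORMALIZATION_ENABLED => 'true'}).
      TableName table = TableName.valueOf("testtable");
      HTableDescriptor descriptor = admin.getTableDescriptor(table);
      descriptor.setNormalizationEnabled(true);
      admin.modifyTable(table, descriptor);

      // Ask for a normalization pass right away instead of waiting for the
      // chore (shell: normalize).
      admin.normalize();
    }
  }
}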
> >>>>
> >>>> On Wed, Nov 16, 2016 at 1:17 AM, Apekshit Sharma <a...@cloudera.com>
> >>>> wrote:
> >>>>
> >>>>> bq. HMerge can merge multiple regions by going over the list of regions
> >>>>> and checking their sizes.
> >>>>> bq. But both of these tools (Merge and HMerge) are very dangerous
> >>>>>
> >>>>> I came across HMerge and it looks like dead code. It isn't referenced
> >>>>> from anywhere except one test. (This is what Lars also pointed out in
> >>>>> the first email.)
> >>>>> It would make perfect sense if it were a tool or were referenced from
> >>>>> somewhere, but lacking either of those, I am a bit confused here.
> >>>>> @Enis, you seem to know everything about them, please educate me.
> >>>>> Thanks
> >>>>> - Appy
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, Sep 29, 2016 at 12:43 AM, Enis Söztutar <enis....@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Merge has very limited usability since it can do only a single merge
> >>>>>> and can only run when HBase is offline.
> >>>>>> HMerge can merge multiple regions by going over the list of regions and
> >>>>>> checking their sizes.
> >>>>>> And of course we have the "supported" online merge, which is the shell
> >>>>>> command.
> >>>>>>
> >>>>>> But both of these tools (Merge and HMerge) are very dangerous, I think.
> >>>>>> I would say we should deprecate both, to be replaced by the online
> >>>>>> merge tool. We should not allow offline merge at all. I fail to see the
> >>>>>> use case where you would have to use an offline merge.
> >>>>>>
> >>>>>> Enis
> >>>>>>
> >>>>>> On Wed, Sep 28, 2016 at 7:32 AM, Lars George <
> >> lars.geo...@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hey,
> >>>>>>>
> >>>>>>> Sorry to resurrect this old thread, but working on the book update, I
> >>>>>>> came across the same thing today, i.e. we have Merge and HMerge. I
> >>>>>>> tried it, and Merge works fine now. It is also the only one of the two
> >>>>>>> flagged as being a tool. Should HMerge be removed? Or at least
> >>>>>>> deprecated?
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Lars
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, Jul 7, 2011 at 2:03 AM, Ted Yu <yuzhih...@gmail.com>
> >>> wrote:
> >>>>>>>>>> there is already an issue to do this but not revamp of these
> >>>> Merge
> >>>>>>>> classes
> >>>>>>>> I guess the issue is HBASE-1621
> >>>>>>>>
> >>>>>>>> On Wed, Jul 6, 2011 at 2:28 PM, Stack <st...@duboce.net>
> >> wrote:
> >>>>>>>>
> >>>>>>>>> Yeah, can you file an issue, Lars? This stuff is ancient and needs
> >>>>>>>>> to be redone, AND redone so we can do merging while the table is
> >>>>>>>>> online (there is already an issue to do this but not a revamp of
> >>>>>>>>> these Merge classes). The unit tests for Merge are also all JUnit 3
> >>>>>>>>> and do whacky stuff to put up multiple regions. These should be
> >>>>>>>>> redone too (they are often the first thing to break on a major
> >>>>>>>>> change, and putting them back together is a headache since they do
> >>>>>>>>> not follow the usual pattern).
> >>>>>>>>>
> >>>>>>>>> St.Ack
> >>>>>>>>>
> >>>>>>>>> On Sun, Jul 3, 2011 at 12:38 AM, Lars George <
> >>>> lars.geo...@gmail.com
> >>>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>> Hi Ted,
> >>>>>>>>>>
> >>>>>>>>>> The log is from an earlier attempt; I tried this a few times. This
> >>>>>>>>>> is all local, after rm'ing the /hbase directory. So the files are
> >>>>>>>>>> all pretty empty, but since I put data in I was assuming it should
> >>>>>>>>>> work. Once you have gotten into this state, you also get funny
> >>>>>>>>>> error messages in the shell:
> >>>>>>>>>>
> >>>>>>>>>> hbase(main):001:0> list
> >>>>>>>>>> TABLE
> >>>>>>>>>> 11/07/03 09:36:21 INFO ipc.HBaseRPC: Using org.apache.hadoop.hbase.ipc.WritableRpcEngine for org.apache.hadoop.hbase.ipc.HMasterInterface
> >>>>>>>>>>
> >>>>>>>>>> ERROR: undefined method `map' for nil:NilClass
> >>>>>>>>>>
> >>>>>>>>>> Here is some help for this command:
> >>>>>>>>>> List all tables in hbase. Optional regular expression parameter
> >>>>>>>>>> could be used to filter the output. Examples:
> >>>>>>>>>>
> >>>>>>>>>> hbase> list
> >>>>>>>>>> hbase> list 'abc.*'
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> hbase(main):002:0>
> >>>>>>>>>>
> >>>>>>>>>> I am assuming this is collateral damage, but why? The UI works,
> >>>>>>>>>> but the table is gone too.
> >>>>>>>>>>
> >>>>>>>>>> Lars
> >>>>>>>>>>
> >>>>>>>>>>> On Jul 2, 2011, at 10:55 PM, Ted Yu wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> There is TestMergeTool which tests Merge.
> >>>>>>>>>>>
> >>>>>>>>>>> From the log you provided, I got a little confused as to why
> >>>>>>>>>>> 'testtable,row-20,1309613053987.23a35ac696bdf4a8023dcc4c5b8419e0.'
> >>>>>>>>>>> didn't appear in your command line or in the output from the
> >>>>>>>>>>> .META. scan.
> >>>>>>>>>>>
> >>>>>>>>>>> On Sat, Jul 2, 2011 at 10:36 AM, Lars George <
> >>>>>> lars.geo...@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> These two both seem to be in a bit of a weird state: HMerge is
> >>>>>>>>>>>> scoped package-local, therefore no one but the package can call
> >>>>>>>>>>>> the merge() functions... and nothing does that but the unit test.
> >>>>>>>>>>>> It would be good to have this on the CLI and in the shell as a
> >>>>>>>>>>>> command (and in the shell maybe with a confirmation message?),
> >>>>>>>>>>>> but it is not available AFAIK.
> >>>>>>>>>>>>
> >>>>>>>>>>>> HMerge can merge regions of tables that are disabled. It also
> >>>>>>>>>>>> merges all regions that qualify, i.e. where the merged region
> >>>>>>>>>>>> would be less than or equal to half the configured maximum file
> >>>>>>>>>>>> size.
> >>>>>>>>>>>>
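The qualification rule described above boils down to a size check along these
lines. This is only an illustrative sketch, not HMerge's actual code; the
class and helper names are made up.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;

public class HMergeRuleSketch {
  // Illustration only: two neighbouring regions qualify for an HMerge-style
  // merge when their combined size is at most half the configured maximum
  // region size (hbase.hregion.max.filesize).
  static boolean qualifiesForMerge(Configuration conf, long regionSizeA, long regionSizeB) {
    long maxFileSize = conf.getLong(HConstants.HREGION_MAX_FILESIZE,
        HConstants.DEFAULT_MAX_FILE_SIZE);
    return regionSizeA + regionSizeB <= maxFileSize / 2;
  }

  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Example: a 512 MB region and a 256 MB region.
    System.out.println(qualifiesForMerge(conf, 512L << 20, 256L << 20));
  }
}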
> >>>>>>>>>>>> Merge, on the other hand, does have a main(), so it can be invoked:
> >>>>>>>>>>>>
> >>>>>>>>>>>> $ hbase org.apache.hadoop.hbase.util.Merge
> >>>>>>>>>>>> Usage: bin/hbase merge <table-name> <region-1> <region-2>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Note how the help insinuates that you can use it as a tool, but
> >>>>>>>>>>>> that is not correct. Also, it only merges the two given regions,
> >>>>>>>>>>>> and the cluster must be shut down (only the HBase daemons). So
> >>>>>>>>>>>> that is a step back.
> >>>>>>>>>>>>
> >>>>>>>>>>>> What is worse is that I cannot get it to work. I tried it in the
> >>>>>>>>>>>> shell:
> >>>>>>>>>>>>
> >>>>>>>>>>>> hbase(main):001:0> create 'testtable', 'colfam1', {SPLITS => ['row-10','row-20','row-30','row-40','row-50']}
> >>>>>>>>>>>> 0 row(s) in 0.2640 seconds
> >>>>>>>>>>>>
> >>>>>>>>>>>> hbase(main):002:0> for i in '0'..'9' do for j in '0'..'9' do put 'testtable', "row-#{i}#{j}", "colfam1:#{j}", "#{j}" end end
> >>>>>>>>>>>> 0 row(s) in 1.0450 seconds
> >>>>>>>>>>>>
> >>>>>>>>>>>> hbase(main):003:0> flush 'testtable'
> >>>>>>>>>>>> 0 row(s) in 0.2000 seconds
> >>>>>>>>>>>>
> >>>>>>>>>>>> hbase(main):004:0> scan '.META.', { COLUMNS => ['info:regioninfo']}
> >>>>>>>>>>>> ROW                                  COLUMN+CELL
> >>>>>>>>>>>> testtable,,1309614509037.612d1e0112 column=info:regioninfo, timestamp=130...
> >>>>>>>>>>>> 406e6c2bb482eeaec57322.             STARTKEY => '', ENDKEY => 'row-10'
> >>>>>>>>>>>> testtable,row-10,1309614509040.2fba column=info:regioninfo, timestamp=130...
> >>>>>>>>>>>> fcc9bc6afac94c465ce5dcabc5d1.       STARTKEY => 'row-10', ENDKEY => 'row-20'
> >>>>>>>>>>>> testtable,row-20,1309614509041.e7c1 column=info:regioninfo, timestamp=130...
> >>>>>>>>>>>> 6267eb30e147e5d988c63d40f982.       STARTKEY => 'row-20', ENDKEY => 'row-30'
> >>>>>>>>>>>> testtable,row-30,1309614509041.a9cd column=info:regioninfo, timestamp=130...
> >>>>>>>>>>>> e1cbc7d1a21b1aca2ac7fda30ad8.       STARTKEY => 'row-30', ENDKEY => 'row-40'
> >>>>>>>>>>>> testtable,row-40,1309614509041.d458 column=info:regioninfo, timestamp=130...
> >>>>>>>>>>>> 236feae097efcf33477e7acc51d4.       STARTKEY => 'row-40', ENDKEY => 'row-50'
> >>>>>>>>>>>> testtable,row-50,1309614509041.74a5 column=info:regioninfo, timestamp=130...
> >>>>>>>>>>>> 7dc7e3e9602d9229b15d4c0357d1.       STARTKEY => 'row-50', ENDKEY => ''
> >>>>>>>>>>>> 6 row(s) in 0.0440 seconds
> >>>>>>>>>>>>
> >>>>>>>>>>>> hbase(main):005:0> exit
> >>>>>>>>>>>>
> >>>>>>>>>>>> $ ./bin/stop-hbase.sh
> >>>>>>>>>>>>
> >>>>>>>>>>>> $ hbase org.apache.hadoop.hbase.util.Merge testtable \
> >>>>>>>>>>>>     testtable,row-20,1309614509041.e7c16267eb30e147e5d988c63d40f982. \
> >>>>>>>>>>>>     testtable,row-30,1309614509041.a9cde1cbc7d1a21b1aca2ac7fda30ad8.
> >>>>>>>>>>>>
> >>>>>>>>>>>> But I consistently get errors:
> >>>>>>>>>>>>
> >>>>>>>>>>>> 11/07/02 07:20:49 INFO util.Merge: Merging regions testtable,row-20,1309613053987.23a35ac696bdf4a8023dcc4c5b8419e0. and testtable,row-30,1309613053987.3664920956c30ac5ff2a7726e4e6 in table testtable
> >>>>>>>>>>>> 11/07/02 07:20:49 INFO wal.HLog: HLog configuration: blocksize=32 MB, rollsize=30.4 MB, enabled=true, optionallogflushinternal=1000ms
> >>>>>>>>>>>> 11/07/02 07:20:49 INFO wal.HLog: New hlog /Volumes/Macintosh-HD/Users/larsgeorge/.logs_1309616449171/hlog.1309616449181
> >>>>>>>>>>>> 11/07/02 07:20:49 INFO wal.HLog: getNumCurrentReplicas--HDFS-826 not available; hdfs_out=org.apache.hadoop.fs.FSDataOutputStream@25961581, exception=org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.getNumCurrentReplicas()
> >>>>>>>>>>>> 11/07/02 07:20:49 INFO regionserver.HRegion: Setting up tabledescriptor config now ...
> >>>>>>>>>>>> 11/07/02 07:20:49 INFO regionserver.HRegion: Onlined -ROOT-,,0.70236052; next sequenceid=1
> >>>>>>>>>>>> info: null
> >>>>>>>>>>>> region1: [B@48fd918a
> >>>>>>>>>>>> region2: [B@7f5e2075
> >>>>>>>>>>>> 11/07/02 07:20:49 FATAL util.Merge: Merge failed
> >>>>>>>>>>>> java.io.IOException: Could not find meta region for testtable,row-20,1309613053987.23a35ac696bdf4a8023dcc4c5b8419e0.
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.util.Merge.mergeTwoRegions(Merge.java:211)
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.util.Merge.run(Merge.java:111)
> >>>>>>>>>>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.util.Merge.main(Merge.java:386)
> >>>>>>>>>>>> 11/07/02 07:20:49 INFO regionserver.HRegion: Setting up tabledescriptor config now ...
> >>>>>>>>>>>> 11/07/02 07:20:49 INFO regionserver.HRegion: Onlined .META.,,1.1028785192; next sequenceid=1
> >>>>>>>>>>>> 11/07/02 07:20:49 INFO regionserver.HRegion: Closed -ROOT-,,0.70236052
> >>>>>>>>>>>> 11/07/02 07:20:49 INFO wal.HLog: main.logSyncer exiting
> >>>>>>>>>>>> 11/07/02 07:20:49 ERROR util.Merge: exiting due to error
> >>>>>>>>>>>> java.lang.NullPointerException
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.util.Merge$1.processRow(Merge.java:119)
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:229)
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:258)
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.util.Merge.run(Merge.java:116)
> >>>>>>>>>>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.util.Merge.main(Merge.java:386)
> >>>>>>>>>>>>
> >>>>>>>>>>>> After which, most of the time, .META. is shot, with an error like
> >>>>>>>>>>>> this:
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2011-07-02 06:42:10,763 WARN org.apache.hadoop.hbase.master.HMaster: Failed getting all descriptors
> >>>>>>>>>>>> java.io.FileNotFoundException: No status for hdfs://localhost:8020/hbase/.corrupt
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.util.FSUtils.getTableInfoModtime(FSUtils.java:888)
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:122)
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:149)
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1429)
> >>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>>>>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>>>>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:312)
> >>>>>>>>>>>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1065)
> >>>>>>>>>>>> Lars
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>>
> >>>>> -- Appy
> >>>>>
> >>>>
> >>>
> >>
>
