Logging lines at millisecond frequency could certainly cause a production problem, for example log files filling a volume and taking a master node out of service. If possible, can we fix it and spin a new RC?
> On Feb 4, 2016, at 12:18 PM, Stack <st...@duboce.net> wrote:
>
> Meant to report back here. I've been running loadings with monkeys trying
> to repro 'hung balancer'. Now I am of the opinion that there is no 'hung
> balancer', just balancer spew. The spew came in on HBASE-13376 which would
> be new to 1.2. Want to wait on at least this big cycle to finish before
> voting (I've done a bunch of small itbll loops... and all looks good).
>
> The spew comes after Master joins an up cluster. It doesn't always
> happen... seems to be some balance arrangement that brings it on... haven't
> spent the time on it. It's kinda ugly when it happens but maybe not a
> blocker? If we want to roll a new RC, I can apply HBASE-15210 (and there
> are some other log cleans I'd do).
>
> St.Ack
>
>> On Thu, Feb 4, 2016 at 5:59 AM, Sean Busbey <bus...@cloudera.com> wrote:
>>
>> Friendly reminder that this vote closes tomorrow.
>>
>> Stack, do you have a better idea about the severity of HBASE-15207 and
>> whether the root cause is new to 1.2.0?
>>
>> -Sean
>>
>>> On Tue, Feb 2, 2016 at 1:37 PM, Stack <st...@duboce.net> wrote:
>>>
>>> It looks like balancer got stuck after Master joined running cluster (it
>>> had been killed by Monkey). We then log at a rate of about 10 lines per
>>> millisecond. HBASE-15207
>>> St.Ack
>>>
>>>> On Tue, Feb 2, 2016 at 9:51 AM, Stack <st...@duboce.net> wrote:
>>>>
>>>> I've been running some cluster loadings on the RC. Last night my logs
>>>> filled with this (10x256MB log files):
>>>>
>>>> ....
>>>> 2016-02-01 11:25:26,958 DEBUG [B.defaultRpcServer.handler=9,queue=0,port=16000] balancer.BaseLoadBalancer: Lowest locality region server with non zero regions is ve0542.halxg.cloudera.com with locality 0.0
>>>> 2016-02-01 11:25:26,958 DEBUG [B.defaultRpcServer.handler=9,queue=0,port=16000] balancer.BaseLoadBalancer: Lowest locality region index is 0 and its region server contains 1 regions
>>>> ...
>>>>
>>>>
>>>> Added by this:
>>>>
>>>> commit 54028140f4f19a6af81c8c8f29dda0c52491a0c9
>>>> Author: tedyu <yuzhih...@gmail.com>
>>>> Date: Thu Aug 13 09:11:59 2015 -0700
>>>>
>>>> HBASE-13376 Improvements to Stochastic load balancer (Vandana
>>>> Ayyalasomayajula)
>>>>
>>>> Folks think this a blocker?
>>>>
>>>> Let me see if it happens always or if it's just stuck balancer.
>>>>
>>>> St.Ack
>>>>
>>>>
>>>> On Fri, Jan 29, 2016 at 7:29 AM, Sean Busbey <bus...@apache.org> wrote:
>>>>
>>>>> Hi Folks!
>>>>>
>>>>> I'm pleased to announce the second release candidate for HBase 1.2.0.
>>>>>
>>>>> Artifacts are available here:
>>>>>
>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.2.0RC1/
>>>>>
>>>>> As of this vote, the relevant md5 hashes are:
>>>>>
>>>>> a338ca93cd4c495f03bcff2d457222ef hbase-1.2.0-bin.tar.gz
>>>>> 955cf9908ae7fef12e3b1447ce8dd035 hbase-1.2.0-src.tar.gz
>>>>>
>>>>> Maven artifacts are available in this staging repository:
>>>>>
>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1127/
>>>>>
>>>>> All artifacts are signed with my code signing key 0D80DB7C, available in
>>>>> the project KEYS file:
>>>>>
>>>>> http://www.apache.org/dist/hbase/KEYS
>>>>>
>>>>> these artifacts correspond to commit hash
>>>>>
>>>>> 46fc1d876bd604f2f71f8692d79978055a095a7a
>>>>>
>>>>> which signed tag 1.2.0RC1 currently points to
>>>>>
>>>>> https://git1-us-west.apache.org/repos/asf?p=hbase.git;a=tag;h=5696635f2f87da6777878b3755a17e0fa639a5c4
>>>>>
>>>>> HBase 1.2.0 is the second minor release in the HBase 1.x line, continuing
>>>>> on
>>>>> the theme of bringing a stable, reliable database to the Hadoop and NoSQL
>>>>> communities. This release includes roughly 250 resolved issues not covered
>>>>> by previous 1.x releases.
>>>>>
>>>>> Notable new features include:
>>>>> - JDK8 is now supported
>>>>> - Hadoop 2.6.1+ and Hadoop 2.7.1+ are now supported
>>>>> - per column-family time ranges for scan (HBASE-14355)
>>>>> - daemons respond to SIGHUP to reload configs (HBASE-14529)
>>>>> - region location methods added to thrift2 proxy (HBASE-13698)
>>>>> - table-level sync that sends deltas (HBASE-13639)
>>>>> - client side metrics via JMX (HBASE-12911)
>>>>>
>>>>> The full list of issues can be found at:
>>>>>
>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12332062
>>>>>
>>>>> To see the changes since the prior release candidate, you can use the
>>>>> following git command on
>>>>> an up-to-date checkout of the hbase repository:
>>>>>
>>>>> git log 1.2.0RC0..1.2.0RC1
>>>>>
>>>>> Please take a few minutes to verify the release[1] and vote on releasing
>>>>> it:
>>>>>
>>>>> [ ] +1 Release this package as Apache HBase 1.2.0
>>>>> [ ] +0 no opinion
>>>>> [ ] -1 Do not release this package because...
>>>>>
>>>>> Vote will be subject to Majority Approval[2] and will close at 4:00PM UTC
>>>>> on Friday, Feb 5th, 2016[3].
>>>>>
>>>>> [1]: http://www.apache.org/info/verification.html
>>>>> [2]: https://www.apache.org/foundation/glossary.html#MajorityApproval
>>>>> [3]: to find this in your local timezone see:
>>>>> http://s.apache.org/hbase-1.2.0-rc1-close
>>
>>
>>
>> --
>> Sean
>>
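
For anyone checking the RC1 artifacts against the md5 hashes quoted in the vote message, here is a minimal sketch of that comparison in Python. It assumes the two tarballs have already been downloaded from the dist URL into the current directory. The helper names `md5_of` and `check_artifacts` are made up for illustration and are not part of any HBase tooling. It only covers the checksum step, not the signature check against the KEYS file.

```python
import hashlib
from pathlib import Path

# Hashes copied from the RC1 vote message quoted in this thread.
EXPECTED_MD5 = {
    "hbase-1.2.0-bin.tar.gz": "a338ca93cd4c495f03bcff2d457222ef",
    "hbase-1.2.0-src.tar.gz": "955cf9908ae7fef12e3b1447ce8dd035",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_artifacts(directory=".", expected=EXPECTED_MD5):
    """Map each artifact name to True (match), False (mismatch) or None (missing).

    The directory layout is an assumption: files sit directly in `directory`.
    """
    results = {}
    for name, want in expected.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == want) if path.is_file() else None
    return results


if __name__ == "__main__":
    for name, ok in check_artifacts().items():
        status = {True: "OK", False: "MISMATCH", None: "not found"}[ok]
        print(f"{name}: {status}")
```

Run from the download directory, this prints one status line per tarball. A mismatch or a missing file would be a reason to re-download before voting.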