Is an extension here a reasonable ask? Putting the vote up right before what is a long New Year weekend for many folks doesn't give a lot of opportunity for thorough review.

Mike

On Mon, Jan 1, 2018 at 1:30 PM, stack <[email protected]> wrote:

> This is great stuff jms. Thank you. Away from computer at mo but will dig
> in.
>
> Is it possible old files were left over, written with old hbase with an old
> hfile version? Can you see on source? They should have been updated by a
> compaction if a long time idle, I agree.
>
> Yeah. If region assign fails, and goes into assignable state, we need
> intervention. We've been shutting down all the ways in which this could
> happen but you seem to have stumbled on a new one. I will take a look at
> your logs.
>
> What are you going to vote? Does it basically work?
>
> Thanks again for the try out.
> S
>
> On Dec 31, 2017 12:43 PM, "Jean-Marc Spaggiari" <[email protected]>
> wrote:
>
> Sorry to spam the list :(
>
> Another interesting thing.
>
> Now most of my tables are online. For a few I'm getting this:
> Caused by: java.lang.IllegalArgumentException: Invalid HFile version: major=2, minor=1: expected at least major=2 and minor=3
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.checkFileVersion(HFileReaderImpl.java:332)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.<init>(HFileReaderImpl.java:199)
>   at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:538)
>   ... 13 more
>
> What is interesting is that I haven't done anything on the source cluster for
> weeks/months. So all tables are all major compacted the same way. I will
> major compact them all under HFiles v3 format and retry.
>
> 2017-12-31 13:33 GMT-05:00 Jean-Marc Spaggiari <[email protected]>:
>
> > Ok. With a brand new DistCp from source cluster, regions are getting
> > assigned correctly. So it sounds like if they get stuck initially for any
> > reason, then even if the reason is fixed they can not get assigned
> > anymore. Will keep playing.
> >
> > I kept the previous /hbase just in case we need something from it.
> >
> > Thanks,
> >
> > JMS
> >
> > 2017-12-31 10:23 GMT-05:00 Jean-Marc Spaggiari <[email protected]>:
> >
> >> Nothing bad that I can see. Here is a region server log:
> >> https://pastebin.com/0r76Y6ap
> >>
> >> Disabling the table makes the regions leave the transition mode. I'm
> >> trying to disable all tables one by one (because it gets stuck after each
> >> disable) and will see if re-enabling them helps...
> >>
> >> On the master side, I now have errors all over:
> >> 2017-12-31 10:06:26,511 WARN [ProcExecWrkr-89] assignment.RegionTransitionProcedure: Retryable error trying to transition: pid=511, ppid=398, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=work_proposed, region=d0a58b76ad9376b12b3e763660049d3d, server=node3.com,16020,1514693337210; rit=OPENING, location=node3.com,16020,1514693337210
> >> org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING
> >>   at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:155)
> >>   at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1530)
> >>   at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179)
> >>   at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309)
> >>   at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85)
> >>   at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845)
> >>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1456)
> >>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1225)
> >>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
> >>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1735)
> >>
> >> Non-stop showing on the logs. Probably because I disabled the table.
> >> Restarting HBase to see if it clears that a bit...
> >>
> >> After restart there isn't any
> >> org.apache.hadoop.hbase.exceptions.UnexpectedStateException on the logs.
> >> Only INFO level. And nothing bad. But still, regions are stuck in
> >> transition even for the disabled tables.
> >>
> >> Master logs are here. I removed some sections because it always says the
> >> same thing, for each and every single region:
> >> https://pastebin.com/K6SQ7DXP
> >>
> >> JMS
> >>
> >> 2017-12-31 9:58 GMT-05:00 stack <[email protected]>:
> >>
> >>> There is nothing further up in the master log from regionservers or on
> >>> regionservers side on open?
> >>>
> >>> Thanks,
> >>> S
> >>>
> >>> On Dec 31, 2017 8:37 AM, "stack" <[email protected]> wrote:
> >>>
> >>> > Good questions. If you disable snappy does it work? If you start over
> >>> > fresh does it work? It should be picking up native libs. Make an issue
> >>> > please jms. Thanks for giving it a go.
> >>> >
> >>> > S
> >>> >
> >>> > On Dec 30, 2017 11:49 PM, "Jean-Marc Spaggiari" <[email protected]>
> >>> > wrote:
> >>> >
> >>> >> Hi Stack,
> >>> >>
> >>> >> I just tried to give it a try... Wiped out all HDFS content and code, all
> >>> >> HBase content and code, and all ZK. Re-built a brand new cluster with 7
> >>> >> physical worker nodes. I'm able to get HBase to start, however I'm not able
> >>> >> to get my regions online.
> >>> >>
> >>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=pageMini, region=a778eb67898dfd378e426f2e7700faea
> >>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node6.16020,1514693336563, table=work_proposed, region=4a1d86197ace3f4c8b1c8de28dbe1d34
> >>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page_crc, region=86b3912a09a5676b6851636ed22c2abc
> >>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node7.16020,1514693337406, table=pageAvro, region=391784c43c87bdea6df05f96accad0ff
> >>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=page, region=5850d782a3beea18872769bf8fd70fc7
> >>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.16020,1514693330961, table=work_proposed, region=1d892c9b54b66f802b82c2f9fe847f1f
> >>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.16020,1514693330961, table=pageAvro, region=e9de2c68cc01883e959d7953a4251687
> >>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=page, region=e2e5fc1c262273893f10e92f24817d1b
> >>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=page, region=89c443c09f10bd1584b1bb86a637e1a8
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.16020,1514693330961, table=page, region=8ca93e9285233ca7b31992f194056bc1
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node4.16020,1514693339685, table=work_proposed, region=9afcf06c4d0d21d7e04b0223edcfc40a
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node6.16020,1514693336563, table=page, region=3457b3237c576eecd550eccee3f584cd
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page, region=dd5fb1dbd41945a9ccbc110b8d4a51b5
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node7.16020,1514693337406, table=work_proposed, region=480bb37af54d9fa57c727da9e8a33578
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=page_crc, region=56b18d470a569c5474ea084f0d995726
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node6.16020,1514693336563, table=page_duplicate, region=e744a9af161de965c70c7d1a08b07660
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page_proposed, region=1c75e53308acac6313db4be63c2b48fe
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=work_proposed, region=45a25ba85f6341a177db7b15554259f9
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=work_proposed, region=d0a58b76ad9376b12b3e763660049d3d
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=page, region=599a4b7b21b1d93fa232ebbbef37a31b
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page_proposed, region=55c07269cc907b8e8875c2a1c4ec27d5
> >>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.,16020,1514693330961, table=page_crc, region=fa3a3d7ebc64ce2a5494cae01477d8d8
> >>> >>
> >>> >> I'm 99% confident this is because of SNAPPY. I'm fighting to get it working
> >>> >> but it's such a pain! My concern here is I don't see any exception anywhere
> >>> >> on any logs. Nothing on the RS side, nothing on the master side (except
> >>> >> the extract above).
> >>> >>
> >>> >> I suspect it's snappy because of this:
> >>> >>
> >>> >> hbase@node2:~/hbase-2.0.0-beta-1$ bin/hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://node2/tmp/snappy snappy
> >>> >> 2017-12-31 00:45:31,006 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> >>> >> 2017-12-31 00:45:33,283 INFO [main] metrics.MetricRegistries: Loaded MetricRegistries class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
> >>> >> 2017-12-31 00:45:33,366 INFO [main] hfile.CacheConfig: Created cacheConfig: CacheConfig:disabled
> >>> >> Exception in thread "main" java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
> >>> >>   at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
> >>> >>   at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:134)
> >>> >>   at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:150)
> >>> >>   at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:168)
> >>> >>   at org.apache.hadoop.hbase.io.compress.Compression$Algorithm.getCompressor(Compression.java:355)
> >>> >>   at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.<init>(HFileBlockDefaultEncodingContext.java:90)
> >>> >>   at org.apache.hadoop.hbase.io.hfile.NoOpDataBlockEncoder.newDataBlockEncodingContext(NoOpDataBlockEncoder.java:85)
> >>> >>   at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.<init>(HFileBlock.java:923)
> >>> >>   at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishInit(HFileWriterImpl.java:296)
> >>> >>   at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.<init>(HFileWriterImpl.java:186)
> >>> >>   at org.apache.hadoop.hbase.io.hfile.HFile$WriterFactory.create(HFile.java:339)
> >>> >>   at org.apache.hadoop.hbase.util.CompressionTest.doSmokeTest(CompressionTest.java:129)
> >>> >>   at org.apache.hadoop.hbase.util.CompressionTest.main(CompressionTest.java:167)
> >>> >>
> >>> >> But I think my installation is fine:
> >>> >> hbase@node2:~/hbase-2.0.0-beta-1$ ll native-build/
> >>> >> total 308
> >>> >> lrwxrwxrwx 1 hbase hbase     24 déc 31 00:29 libhadoopsnappy.so -> libhadoopsnappy.so.0.0.1
> >>> >> lrwxrwxrwx 1 hbase hbase     24 déc 31 00:29 libhadoopsnappy.so.0 -> libhadoopsnappy.so.0.0.1
> >>> >> -rwxr-xr-x 1 hbase hbase 120144 déc 31 00:29 libhadoopsnappy.so.0.0.1
> >>> >> lrwxrwxrwx 1 hbase hbase     18 déc  1  2012 libsnappy.so -> libsnappy.so.1.1.3
> >>> >> lrwxrwxrwx 1 hbase hbase     18 déc  1  2012 libsnappy.so.1 -> libsnappy.so.1.1.3
> >>> >> -rwxr-xr-x 1 hbase hbase 178210 déc  1  2012 libsnappy.so.1.1.3
> >>> >> drwxr-xr-x 3 hbase hbase   4096 déc 30 15:44 python2.6
> >>> >> drwxr-xr-x 4 hbase hbase   4096 déc 30 23:35 python2.7
> >>> >> drwxr-xr-x 3 hbase hbase   4096 déc 30 23:29 python3.5
> >>> >>
> >>> >> and in hbase-env.sh:
> >>> >> export JAVA_HOME=/usr/local/jdk1.8.0_151
> >>> >> export HBASE_LIBRARY_PATH=/home/hbase/hbase-2.0.0-beta-1/native-build
> >>> >>
> >>> >>
> >>> >> So there are 2 things here.
> >>> >> 1) Why are the region servers not reporting any error when they are not
> >>> >> able to open a region because of the compression codec not being loaded?
> >>> >> 2) Why is HBase not picking up the Snappy codec?
> >>> >>
> >>> >> Thanks,
> >>> >>
> >>> >> JMS
> >>> >>
> >>> >>
> >>> >> 2017-12-29 13:15 GMT-05:00 Stack <[email protected]>:
> >>> >>
> >>> >> > The first release candidate for HBase 2.0.0-beta-1 is up at:
> >>> >> >
> >>> >> > https://dist.apache.org/repos/dist/dev/hbase/hbase-2.0.0-beta-1-RC0/
> >>> >> >
> >>> >> > Maven artifacts are available from a staging directory here:
> >>> >> >
> >>> >> > https://repository.apache.org/content/repositories/orgapachehbase-1188
> >>> >> >
> >>> >> > All was signed with my key at 8ACC93D2 [1]
> >>> >> >
> >>> >> > I tagged the RC as 2.0.0-beta-1-RC0
> >>> >> > (0907563eb72697b394b8b960fe54887d6ff304fd)
> >>> >> >
> >>> >> > hbase-2.0.0-beta-1 is our first beta release. It includes all that was in
> >>> >> > previous alphas (new assignment manager, offheap read/write path, in-memory
> >>> >> > compactions, etc.). The APIs and feature-set are sealed.
> >>> >> >
> >>> >> > hbase-2.0.0-beta-1 is a not-for-production preview of hbase-2.0.0. It is
> >>> >> > meant for devs and downstreamers to test drive and flag us if we messed up
> >>> >> > on anything ahead of our rolling GAs. We are particularly interested in
> >>> >> > hearing from Coprocessor developers.
> >>> >> >
> >>> >> > The list of features addressed in 2.0.0 so far can be found here [3]. There
> >>> >> > are thousands. The list of ~2k+ fixes in 2.0.0 exclusively can be found
> >>> >> > here [4] (My JIRA JQL foo is a bit dodgy -- forgive me if mistakes).
> >>> >> >
> >>> >> > I've updated our overview doc. on the state of 2.0.0 [6].
> >>> >> > We'll do one more
> >>> >> > beta before we put up our first 2.0.0 Release Candidate by the end of
> >>> >> > January, 2.0.0-beta-2. Its focus will be making it so users can do a
> >>> >> > rolling upgrade on to hbase-2.x from hbase-1.x (and any bug fixes found
> >>> >> > running beta-1). Here is the list of what we have targeted so far for
> >>> >> > beta-2 [5]. Check it out.
> >>> >> >
> >>> >> > One known issue is that the User API has not been properly filtered so it
> >>> >> > shows more than just InterfaceAudience Public content (HBASE-19663, to be
> >>> >> > fixed by beta-2).
> >>> >> >
> >>> >> > Please take this beta for a spin. Please vote on whether it is ok to put out
> >>> >> > this RC as our first beta (Note CHANGES has not yet been updated). Let the
> >>> >> > VOTE be open for 72 hours (Monday)
> >>> >> >
> >>> >> > Thanks,
> >>> >> > Your 2.0.0 Release Manager
> >>> >> >
> >>> >> > 1. http://pgp.mit.edu/pks/lookup?op=get&search=0x9816C7FC8ACC93D2
> >>> >> > 3. https://goo.gl/scYjJr
> >>> >> > 4. https://goo.gl/dFFT8b
> >>> >> > 5. https://issues.apache.org/jira/projects/HBASE/versions/12340862
> >>> >> > 6. https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/
