Sorry to spam the list :( Another interesting thing.

Now most of my tables are online. For a few I'm getting this:

Caused by: java.lang.IllegalArgumentException: Invalid HFile version: major=2, minor=1: expected at least major=2 and minor=3
        at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.checkFileVersion(HFileReaderImpl.java:332)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.<init>(HFileReaderImpl.java:199)
        at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:538)
        ... 13 more

What is interesting is that I haven't done anything on the source cluster for weeks/months, so all tables are major compacted the same way. I will major compact them all under the HFile v3 format and retry.

2017-12-31 13:33 GMT-05:00 Jean-Marc Spaggiari <jean-m...@spaggiari.org>:

> Ok. With a brand new DistCp from the source cluster, regions are getting
> assigned correctly. So it sounds like if they get stuck initially for any
> reason, then even if the reason is fixed they cannot get assigned anymore.
> Will keep playing.
>
> I kept the previous /hbase just in case we need something from it.
>
> Thanks,
>
> JMS
>
> 2017-12-31 10:23 GMT-05:00 Jean-Marc Spaggiari <jean-m...@spaggiari.org>:
>
>> Nothing bad that I can see. Here is a region server log:
>> https://pastebin.com/0r76Y6ap
>>
>> Disabling the table makes the regions leave the transition mode. I'm
>> trying to disable all tables one by one (because it gets stuck after each
>> disable) and will see if re-enabling them helps...
>>
>> On the master side, I now have errors all over:
>> 2017-12-31 10:06:26,511 WARN [ProcExecWrkr-89] assignment.RegionTransitionProcedure: Retryable error trying to transition: pid=511, ppid=398, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=work_proposed, region=d0a58b76ad9376b12b3e763660049d3d, server=node3.com,16020,1514693337210; rit=OPENING, location=node3.com,16020,1514693337210
>> org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING
>>         at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:155)
>>         at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1530)
>>         at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179)
>>         at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309)
>>         at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85)
>>         at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845)
>>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1456)
>>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1225)
>>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
>>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1735)
>>
>> Non-stop showing in the logs. Probably because I disabled the table.
>> Restarting HBase to see if it clears that a bit...
>>
>> After restart there isn't any
>> org.apache.hadoop.hbase.exceptions.UnexpectedStateException in the logs.
>> Only INFO level. And nothing bad. But still, regions are stuck in
>> transition even for the disabled tables.
>>
>> Master logs are here. I removed some sections because it always says the
>> same thing, for each and every single region:
>> https://pastebin.com/K6SQ7DXP
>>
>> JMS
>>
>> 2017-12-31 9:58 GMT-05:00 stack <saint....@gmail.com>:
>>
>>> There is nothing further up in the master log from regionservers or on
>>> regionservers side on open?
>>>
>>> Thanks,
>>> S
>>>
>>> On Dec 31, 2017 8:37 AM, "stack" <saint....@gmail.com> wrote:
>>>
>>> > Good questions. If you disable snappy does it work? If you start over
>>> > fresh does it work? It should be picking up native libs. Make an issue
>>> > please jms. Thanks for giving it a go.
>>> >
>>> > S
>>> >
>>> > On Dec 30, 2017 11:49 PM, "Jean-Marc Spaggiari" <jean-m...@spaggiari.org>
>>> > wrote:
>>> >
>>> >> Hi Stack,
>>> >>
>>> >> I just tried to give it a try... Wiped out all HDFS content and code, all
>>> >> HBase content and code, and all ZK. Re-built a brand new cluster with 7
>>> >> physical worker nodes. I'm able to get HBase to start, however I'm not able
>>> >> to get my regions online.
>>> >>
>>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=pageMini, region=a778eb67898dfd378e426f2e7700faea
>>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node6.16020,1514693336563, table=work_proposed, region=4a1d86197ace3f4c8b1c8de28dbe1d34
>>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page_crc, region=86b3912a09a5676b6851636ed22c2abc
>>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node7.16020,1514693337406, table=pageAvro, region=391784c43c87bdea6df05f96accad0ff
>>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=page, region=5850d782a3beea18872769bf8fd70fc7
>>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.16020,1514693330961, table=work_proposed, region=1d892c9b54b66f802b82c2f9fe847f1f
>>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.16020,1514693330961, table=pageAvro, region=e9de2c68cc01883e959d7953a4251687
>>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=page, region=e2e5fc1c262273893f10e92f24817d1b
>>> >> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=page, region=89c443c09f10bd1584b1bb86a637e1a8
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.16020,1514693330961, table=page, region=8ca93e9285233ca7b31992f194056bc1
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node4.16020,1514693339685, table=work_proposed, region=9afcf06c4d0d21d7e04b0223edcfc40a
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node6.16020,1514693336563, table=page, region=3457b3237c576eecd550eccee3f584cd
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page, region=dd5fb1dbd41945a9ccbc110b8d4a51b5
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node7.16020,1514693337406, table=work_proposed, region=480bb37af54d9fa57c727da9e8a33578
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=page_crc, region=56b18d470a569c5474ea084f0d995726
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node6.16020,1514693336563, table=page_duplicate, region=e744a9af161de965c70c7d1a08b07660
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page_proposed, region=1c75e53308acac6313db4be63c2b48fe
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=work_proposed, region=45a25ba85f6341a177db7b15554259f9
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=work_proposed, region=d0a58b76ad9376b12b3e763660049d3d
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=page, region=599a4b7b21b1d93fa232ebbbef37a31b
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page_proposed, region=55c07269cc907b8e8875c2a1c4ec27d5
>>> >> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.,16020,1514693330961, table=page_crc, region=fa3a3d7ebc64ce2a5494cae01477d8d8
>>> >>
>>> >> I'm 99% confident this is because of SNAPPY. I'm fighting to get it working
>>> >> but it's such a pain! My concern here is I don't see any exception anywhere
>>> >> on any logs. Nothing on the RS side, nothing on the master side (Except
>>> >> extract above).
>>> >>
>>> >> I suspect it's snappy because of this:
>>> >>
>>> >> hbase@node2:~/hbase-2.0.0-beta-1$ bin/hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://node2/tmp/snappy snappy
>>> >> 2017-12-31 00:45:31,006 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> >> 2017-12-31 00:45:33,283 INFO [main] metrics.MetricRegistries: Loaded MetricRegistries class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
>>> >> 2017-12-31 00:45:33,366 INFO [main] hfile.CacheConfig: Created cacheConfig: CacheConfig:disabled
>>> >> Exception in thread "main" java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
>>> >>         at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
>>> >>         at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:134)
>>> >>         at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:150)
>>> >>         at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:168)
>>> >>         at org.apache.hadoop.hbase.io.compress.Compression$Algorithm.getCompressor(Compression.java:355)
>>> >>         at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.<init>(HFileBlockDefaultEncodingContext.java:90)
>>> >>         at org.apache.hadoop.hbase.io.hfile.NoOpDataBlockEncoder.newDataBlockEncodingContext(NoOpDataBlockEncoder.java:85)
>>> >>         at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.<init>(HFileBlock.java:923)
>>> >>         at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishInit(HFileWriterImpl.java:296)
>>> >>         at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.<init>(HFileWriterImpl.java:186)
>>> >>         at org.apache.hadoop.hbase.io.hfile.HFile$WriterFactory.create(HFile.java:339)
>>> >>         at org.apache.hadoop.hbase.util.CompressionTest.doSmokeTest(CompressionTest.java:129)
>>> >>         at org.apache.hadoop.hbase.util.CompressionTest.main(CompressionTest.java:167)
>>> >>
>>> >> But I think my installation is fine:
>>> >> hbase@node2:~/hbase-2.0.0-beta-1$ ll native-build/
>>> >> total 308
>>> >> lrwxrwxrwx 1 hbase hbase     24 déc 31 00:29 libhadoopsnappy.so -> libhadoopsnappy.so.0.0.1
>>> >> lrwxrwxrwx 1 hbase hbase     24 déc 31 00:29 libhadoopsnappy.so.0 -> libhadoopsnappy.so.0.0.1
>>> >> -rwxr-xr-x 1 hbase hbase 120144 déc 31 00:29 libhadoopsnappy.so.0.0.1
>>> >> lrwxrwxrwx 1 hbase hbase     18 déc 1 2012 libsnappy.so -> libsnappy.so.1.1.3
>>> >> lrwxrwxrwx 1 hbase hbase     18 déc 1 2012 libsnappy.so.1 -> libsnappy.so.1.1.3
>>> >> -rwxr-xr-x 1 hbase hbase 178210 déc 1 2012 libsnappy.so.1.1.3
>>> >> drwxr-xr-x 3 hbase hbase   4096 déc 30 15:44 python2.6
>>> >> drwxr-xr-x 4 hbase hbase   4096 déc 30 23:35 python2.7
>>> >> drwxr-xr-x 3 hbase hbase   4096 déc 30 23:29 python3.5
>>> >>
>>> >> and in hbase-env.sh:
>>> >> export JAVA_HOME=/usr/local/jdk1.8.0_151
>>> >> export HBASE_LIBRARY_PATH=/home/hbase/hbase-2.0.0-beta-1/native-build
>>> >>
>>> >>
>>> >> So there are 2 things here.
>>> >> 1) Why are the region servers not reporting any error when they are not
>>> >> able to open a region because of the compression codec not being loaded?
>>> >> 2) Why is HBase not picking up the Snappy codec?
>>> >>
>>> >> Thanks,
>>> >>
>>> >> JMS
>>> >>
>>> >>
>>> >> 2017-12-29 13:15 GMT-05:00 Stack <st...@duboce.net>:
>>> >>
>>> >> > The first release candidate for HBase 2.0.0-beta-1 is up at:
>>> >> >
>>> >> > https://dist.apache.org/repos/dist/dev/hbase/hbase-2.0.0-beta-1-RC0/
>>> >> >
>>> >> > Maven artifacts are available from a staging directory here:
>>> >> >
>>> >> > https://repository.apache.org/content/repositories/orgapachehbase-1188
>>> >> >
>>> >> > All was signed with my key at 8ACC93D2 [1]
>>> >> >
>>> >> > I tagged the RC as 2.0.0-beta-1-RC0
>>> >> > (0907563eb72697b394b8b960fe54887d6ff304fd)
>>> >> >
>>> >> > hbase-2.0.0-beta-1 is our first beta release. It includes all that was in
>>> >> > previous alphas (new assignment manager, offheap read/write path, in-memory
>>> >> > compactions, etc.). The APIs and feature-set are sealed.
>>> >> >
>>> >> > hbase-2.0.0-beta-1 is a not-for-production preview of hbase-2.0.0. It is
>>> >> > meant for devs and downstreamers to test drive and flag us if we messed up
>>> >> > on anything ahead of our rolling GAs. We are particularly interested in
>>> >> > hearing from Coprocessor developers.
>>> >> >
>>> >> > The list of features addressed in 2.0.0 so far can be found here [3]. There
>>> >> > are thousands. The list of ~2k+ fixes in 2.0.0 exclusively can be found
>>> >> > here [4] (My JIRA JQL foo is a bit dodgy -- forgive me if mistakes).
>>> >> >
>>> >> > I've updated our overview doc. on the state of 2.0.0 [6]. We'll do one more
>>> >> > beta before we put up our first 2.0.0 Release Candidate by the end of
>>> >> > January, 2.0.0-beta-2. Its focus will be making it so users can do a
>>> >> > rolling upgrade on to hbase-2.x from hbase-1.x (and any bug fixes found
>>> >> > running beta-1). Here is the list of what we have targeted so far for
>>> >> > beta-2 [5]. Check it out.
>>> >> >
>>> >> > One known issue is that the User API has not been properly filtered so it
>>> >> > shows more than just InterfaceAudience Public content (HBASE-19663, to be
>>> >> > fixed by beta-2).
>>> >> >
>>> >> > Please take this beta for a spin. Please vote on whether it is ok to put out
>>> >> > this RC as our first beta (Note CHANGES has not yet been updated). Let the
>>> >> > VOTE be open for 72 hours (Monday)
>>> >> >
>>> >> > Thanks,
>>> >> > Your 2.0.0 Release Manager
>>> >> >
>>> >> > 1. http://pgp.mit.edu/pks/lookup?op=get&search=0x9816C7FC8ACC93D2
>>> >> > 3. https://goo.gl/scYjJr
>>> >> > 4. https://goo.gl/dFFT8b
>>> >> > 5. https://issues.apache.org/jira/projects/HBASE/versions/12340862
>>> >> > 6. https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/
>>> >> >