Seconded. I'll be back later this week and can try it out then.
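(For anyone else picking up the RC when the vote reopens, a minimal smoke-test sketch -- editorial, not from the thread; the artifact names under the dist URL are assumptions, so check the listing first:

    # Fetch and verify the RC artifacts (file names are illustrative).
    wget https://dist.apache.org/repos/dist/dev/hbase/hbase-2.0.0-beta-1-RC0/hbase-2.0.0-beta-1-bin.tar.gz
    wget https://dist.apache.org/repos/dist/dev/hbase/hbase-2.0.0-beta-1-RC0/hbase-2.0.0-beta-1-bin.tar.gz.asc
    gpg --recv-keys 0x9816C7FC8ACC93D2    # RM signing key, from the vote mail below
    gpg --verify hbase-2.0.0-beta-1-bin.tar.gz.asc hbase-2.0.0-beta-1-bin.tar.gz
    # Unpack and bring it up standalone for a first look.
    tar xzf hbase-2.0.0-beta-1-bin.tar.gz && cd hbase-2.0.0-beta-1
    bin/start-hbase.sh
    echo "status" | bin/hbase shell -n    # confirm the master is up

)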
> On Jan 1, 2018, at 12:13 PM, Mike Drob <[email protected]> wrote:
>
> Is an extension here a reasonable ask? Putting the vote up right before
> what is a long New Year weekend for many folks doesn't give a lot of
> opportunity for thorough review.
>
> Mike
>
>> On Mon, Jan 1, 2018 at 1:30 PM, stack <[email protected]> wrote:
>>
>> This is great stuff, JMS. Thank you. Away from my computer at the moment
>> but will dig in.
>>
>> Is it possible there are old files left over, written by an old HBase
>> with an old HFile version? Can you check on the source cluster? They
>> should have been updated by a compaction even after a long time idle, I
>> agree.
>>
>> Yeah. If a region assign fails and the region goes into an unassignable
>> state, we need intervention. We've been shutting down all the ways in
>> which this could happen, but you seem to have stumbled on a new one. I
>> will take a look at your logs.
>>
>> What are you going to vote? Does it basically work?
>>
>> Thanks again for the try out.
>> S
>>
>> On Dec 31, 2017 12:43 PM, "Jean-Marc Spaggiari" <[email protected]>
>> wrote:
>>
>> Sorry to spam the list :(
>>
>> Another interesting thing.
>>
>> Now most of my tables are online. For a few I'm getting this:
>> Caused by: java.lang.IllegalArgumentException: Invalid HFile version:
>> major=2, minor=1: expected at least major=2 and minor=3
>>         at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.checkFileVersion(HFileReaderImpl.java:332)
>>         at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.<init>(HFileReaderImpl.java:199)
>>         at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:538)
>>         ... 13 more
>>
>> What is interesting is that I haven't done anything on the source cluster
>> for weeks/months, so all tables were major compacted the same way. I will
>> major compact them all under HFile v3 format and retry.
>>
>> 2017-12-31 13:33 GMT-05:00 Jean-Marc Spaggiari <[email protected]>:
>>
>>> Ok. With a brand new DistCp from the source cluster, regions are getting
>>> assigned correctly. So it sounds like if they get stuck initially for
>>> any reason, then even if the reason is fixed they cannot get assigned
>>> again. Will keep playing.
>>>
>>> I kept the previous /hbase just in case we need something from it.
>>>
>>> Thanks,
>>>
>>> JMS
>>>
>>> 2017-12-31 10:23 GMT-05:00 Jean-Marc Spaggiari <[email protected]>:
>>>
>>>> Nothing bad that I can see. Here is a region server log:
>>>> https://pastebin.com/0r76Y6ap
>>>>
>>>> Disabling the table makes the regions leave the transition mode. I'm
>>>> trying to disable all tables one by one (because it gets stuck after
>>>> each disable) and will see if re-enabling them helps...
>>>>
>>>> On the master side, I now have errors all over:
>>>> 2017-12-31 10:06:26,511 WARN [ProcExecWrkr-89] assignment.RegionTransitionProcedure: Retryable error trying to transition: pid=511, ppid=398, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=work_proposed, region=d0a58b76ad9376b12b3e763660049d3d, server=node3.com,16020,1514693337210; rit=OPENING, location=node3.com,16020,1514693337210
>>>> org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING
>>>>         at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:155)
>>>>         at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1530)
>>>>         at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179)
>>>>         at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309)
>>>>         at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85)
>>>>         at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845)
>>>>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1456)
>>>>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1225)
>>>>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
>>>>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1735)
>>>>
>>>> Non-stop showing in the logs. Probably because I disabled the table.
>>>> Restarting HBase to see if it clears that a bit...
>>>>
>>>> After the restart there isn't any
>>>> org.apache.hadoop.hbase.exceptions.UnexpectedStateException in the
>>>> logs. Only INFO level. And nothing bad. But still, regions are stuck in
>>>> transition, even for the disabled tables.
>>>>
>>>> Master logs are here. I removed some sections because it always says
>>>> the same thing, for each and every single region:
>>>> https://pastebin.com/K6SQ7DXP
>>>>
>>>> JMS
>>>>
>>>> 2017-12-31 9:58 GMT-05:00 stack <[email protected]>:
>>>>
>>>>> Is there nothing further up in the master log from the regionservers,
>>>>> or on the regionserver side on open?
>>>>>
>>>>> Thanks,
>>>>> S
>>>>>
>>>>>> On Dec 31, 2017 8:37 AM, "stack" <[email protected]> wrote:
>>>>>>
>>>>>> Good questions. If you disable snappy, does it work? If you start
>>>>>> over fresh, does it work? It should be picking up the native libs.
>>>>>> Please file an issue, JMS. Thanks for giving it a go.
>>>>>>
>>>>>> S
>>>>>>
>>>>>> On Dec 30, 2017 11:49 PM, "Jean-Marc Spaggiari" <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Stack,
>>>>>>>
>>>>>>> I just gave it a try... Wiped out all HDFS content and code, all
>>>>>>> HBase content and code, and all ZK. Re-built a brand new cluster
>>>>>>> with 7 physical worker nodes. I'm able to get HBase to start;
>>>>>>> however, I'm not able to get my regions online.
>>>>>>>
>>>>>>> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=pageMini, region=a778eb67898dfd378e426f2e7700faea
>>>>>>> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node6.16020,1514693336563, table=work_proposed, region=4a1d86197ace3f4c8b1c8de28dbe1d34
>>>>>>> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page_crc, region=86b3912a09a5676b6851636ed22c2abc
>>>>>>> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node7.16020,1514693337406, table=pageAvro, region=391784c43c87bdea6df05f96accad0ff
>>>>>>> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=page, region=5850d782a3beea18872769bf8fd70fc7
>>>>>>> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.16020,1514693330961, table=work_proposed, region=1d892c9b54b66f802b82c2f9fe847f1f
>>>>>>> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.16020,1514693330961, table=pageAvro, region=e9de2c68cc01883e959d7953a4251687
>>>>>>> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=page, region=e2e5fc1c262273893f10e92f24817d1b
>>>>>>> 2017-12-31 00:42:03,187 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=page, region=89c443c09f10bd1584b1bb86a637e1a8
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.16020,1514693330961, table=page, region=8ca93e9285233ca7b31992f194056bc1
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node4.16020,1514693339685, table=work_proposed, region=9afcf06c4d0d21d7e04b0223edcfc40a
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node6.16020,1514693336563, table=page, region=3457b3237c576eecd550eccee3f584cd
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page, region=dd5fb1dbd41945a9ccbc110b8d4a51b5
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node7.16020,1514693337406, table=work_proposed, region=480bb37af54d9fa57c727da9e8a33578
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=page_crc, region=56b18d470a569c5474ea084f0d995726
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node6.16020,1514693336563, table=page_duplicate, region=e744a9af161de965c70c7d1a08b07660
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page_proposed, region=1c75e53308acac6313db4be63c2b48fe
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node8.16020,1514693333206, table=work_proposed, region=45a25ba85f6341a177db7b15554259f9
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=work_proposed, region=d0a58b76ad9376b12b3e763660049d3d
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node3.16020,1514693337210, table=page, region=599a4b7b21b1d93fa232ebbbef37a31b
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node1.16020,1514693336898, table=page_proposed, region=55c07269cc907b8e8875c2a1c4ec27d5
>>>>>>> 2017-12-31 00:42:03,188 WARN [ProcExecTimeout] assignment.AssignmentManager: TODO Handle stuck in transition: rit=OPENING, location=node5.,16020,1514693330961, table=page_crc, region=fa3a3d7ebc64ce2a5494cae01477d8d8
>>>>>>>
>>>>>>> I'm 99% confident this is because of SNAPPY. I'm fighting to get it
>>>>>>> working but it's such a pain! My concern here is that I don't see any
>>>>>>> exception anywhere in any logs. Nothing on the RS side, nothing on
>>>>>>> the master side (except the extract above).
>>>>>>>
>>>>>>> I suspect it's snappy because of this:
>>>>>>>
>>>>>>> hbase@node2:~/hbase-2.0.0-beta-1$ bin/hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://node2/tmp/snappy snappy
>>>>>>> 2017-12-31 00:45:31,006 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>>>>> 2017-12-31 00:45:33,283 INFO [main] metrics.MetricRegistries: Loaded MetricRegistries class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
>>>>>>> 2017-12-31 00:45:33,366 INFO [main] hfile.CacheConfig: Created cacheConfig: CacheConfig:disabled
>>>>>>> Exception in thread "main" java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
>>>>>>>         at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
>>>>>>>         at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:134)
>>>>>>>         at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:150)
>>>>>>>         at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:168)
>>>>>>>         at org.apache.hadoop.hbase.io.compress.Compression$Algorithm.getCompressor(Compression.java:355)
>>>>>>>         at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.<init>(HFileBlockDefaultEncodingContext.java:90)
>>>>>>>         at org.apache.hadoop.hbase.io.hfile.NoOpDataBlockEncoder.newDataBlockEncodingContext(NoOpDataBlockEncoder.java:85)
>>>>>>>         at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.<init>(HFileBlock.java:923)
>>>>>>>         at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishInit(HFileWriterImpl.java:296)
>>>>>>>         at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.<init>(HFileWriterImpl.java:186)
>>>>>>>         at org.apache.hadoop.hbase.io.hfile.HFile$WriterFactory.create(HFile.java:339)
>>>>>>>         at org.apache.hadoop.hbase.util.CompressionTest.doSmokeTest(CompressionTest.java:129)
>>>>>>>         at org.apache.hadoop.hbase.util.CompressionTest.main(CompressionTest.java:167)
>>>>>>>
>>>>>>> But I think my installation is fine:
>>>>>>> hbase@node2:~/hbase-2.0.0-beta-1$ ll native-build/
>>>>>>> total 308
>>>>>>> lrwxrwxrwx 1 hbase hbase     24 déc 31 00:29 libhadoopsnappy.so -> libhadoopsnappy.so.0.0.1
>>>>>>> lrwxrwxrwx 1 hbase hbase     24 déc 31 00:29 libhadoopsnappy.so.0 -> libhadoopsnappy.so.0.0.1
>>>>>>> -rwxr-xr-x 1 hbase hbase 120144 déc 31 00:29 libhadoopsnappy.so.0.0.1
>>>>>>> lrwxrwxrwx 1 hbase hbase     18 déc  1  2012 libsnappy.so -> libsnappy.so.1.1.3
>>>>>>> lrwxrwxrwx 1 hbase hbase     18 déc  1  2012 libsnappy.so.1 -> libsnappy.so.1.1.3
>>>>>>> -rwxr-xr-x 1 hbase hbase 178210 déc  1  2012 libsnappy.so.1.1.3
>>>>>>> drwxr-xr-x 3 hbase hbase   4096 déc 30 15:44 python2.6
>>>>>>> drwxr-xr-x 4 hbase hbase   4096 déc 30 23:35 python2.7
>>>>>>> drwxr-xr-x 3 hbase hbase   4096 déc 30 23:29 python3.5
>>>>>>>
>>>>>>> and in hbase-env.sh:
>>>>>>> export JAVA_HOME=/usr/local/jdk1.8.0_151
>>>>>>> export HBASE_LIBRARY_PATH=/home/hbase/hbase-2.0.0-beta-1/native-build
>>>>>>>
>>>>>>> So there are two things here:
>>>>>>> 1) Why are the region servers not reporting any error when they are
>>>>>>> not able to open a region because the compression codec isn't loaded?
>>>>>>> 2) Why is HBase not picking up the Snappy codec?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> JMS
>>>>>>>
>>>>>>> 2017-12-29 13:15 GMT-05:00 Stack <[email protected]>:
>>>>>>>
>>>>>>>> The first release candidate for HBase 2.0.0-beta-1 is up at:
>>>>>>>>
>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-2.0.0-beta-1-RC0/
>>>>>>>>
>>>>>>>> Maven artifacts are available from a staging directory here:
>>>>>>>>
>>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1188
>>>>>>>>
>>>>>>>> All was signed with my key at 8ACC93D2 [1]
>>>>>>>>
>>>>>>>> I tagged the RC as 2.0.0-beta-1-RC0
>>>>>>>> (0907563eb72697b394b8b960fe54887d6ff304fd)
>>>>>>>>
>>>>>>>> hbase-2.0.0-beta-1 is our first beta release.
>>>>>>>> It includes all that was in previous alphas (new assignment manager,
>>>>>>>> offheap read/write path, in-memory compactions, etc.). The APIs and
>>>>>>>> feature set are sealed.
>>>>>>>>
>>>>>>>> hbase-2.0.0-beta-1 is a not-for-production preview of hbase-2.0.0.
>>>>>>>> It is meant for devs and downstreamers to test drive and flag us if
>>>>>>>> we messed up on anything ahead of our rolling GAs. We are
>>>>>>>> particularly interested in hearing from Coprocessor developers.
>>>>>>>>
>>>>>>>> The list of features addressed in 2.0.0 so far can be found here
>>>>>>>> [3]. There are thousands. The list of ~2k+ fixes exclusive to 2.0.0
>>>>>>>> can be found here [4] (my JIRA JQL foo is a bit dodgy -- forgive me
>>>>>>>> if there are mistakes).
>>>>>>>>
>>>>>>>> I've updated our overview doc on the state of 2.0.0 [6]. We'll do
>>>>>>>> one more beta, 2.0.0-beta-2, before we put up our first 2.0.0
>>>>>>>> release candidate by the end of January. Its focus will be making it
>>>>>>>> so users can do a rolling upgrade onto hbase-2.x from hbase-1.x (and
>>>>>>>> any bug fixes found running beta-1). Here is the list of what we
>>>>>>>> have targeted so far for beta-2 [5]. Check it out.
>>>>>>>>
>>>>>>>> One known issue is that the User API has not been properly filtered,
>>>>>>>> so it shows more than just InterfaceAudience Public content
>>>>>>>> (HBASE-19663, to be fixed by beta-2).
>>>>>>>>
>>>>>>>> Please take this beta for a spin. Please vote on whether it is ok to
>>>>>>>> put out this RC as our first beta (note CHANGES has not yet been
>>>>>>>> updated). Let the VOTE be open for 72 hours (Monday).
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Your 2.0.0 Release Manager
>>>>>>>>
>>>>>>>> 1. http://pgp.mit.edu/pks/lookup?op=get&search=0x9816C7FC8ACC93D2
>>>>>>>> 3. https://goo.gl/scYjJr
>>>>>>>> 4. https://goo.gl/dFFT8b
>>>>>>>> 5. https://issues.apache.org/jira/projects/HBASE/versions/12340862
>>>>>>>> 6. https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/
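(An editorial aside on the Snappy diagnosis upthread: the failure can be narrowed down with stock tooling before touching HBase itself. A sketch only, assuming the hadoop CLI is available on the same node with the same environment:

    # 1) Ask Hadoop which native codecs its libhadoop.so was built with.
    #    "snappy: false" here matches the "built without snappy support" error.
    hadoop checknative -a

    # 2) Point the JVM at the native libs; bin/hbase feeds
    #    HBASE_LIBRARY_PATH into java.library.path.
    export HBASE_LIBRARY_PATH=/home/hbase/hbase-2.0.0-beta-1/native-build

    # 3) Re-run the smoke test per codec to isolate the failure to snappy.
    bin/hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://node2/tmp/comptest gz
    bin/hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://node2/tmp/comptest snappy

Note the exception above comes from libhadoop, not libsnappy: a libhadoop.so built without snappy support fails regardless of HBASE_LIBRARY_PATH, which would explain why the native-build directory looks fine yet the codec never loads.)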

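(Similarly, for the "Invalid HFile version: major=2, minor=1" failure upthread, rewriting the old store files means major-compacting each table on the source cluster. A sketch, run from the HBase install directory; the table names are the ones seen in the thread, and the column family 'f' in the alter example is hypothetical:

    # Force a major compaction so store files are rewritten in the
    # current HFile format.
    for t in page pageMini pageAvro page_crc page_duplicate page_proposed work_proposed; do
      echo "major_compact '$t'" | bin/hbase shell -n
    done

    # To take snappy out of the picture while testing, compression can be
    # switched off per column family and the table recompacted:
    #   alter 'page', {NAME => 'f', COMPRESSION => 'NONE'}
    #   major_compact 'page'

)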