[ https://issues.apache.org/jira/browse/HBASE-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410948#comment-13410948 ]
Jean-Daniel Cryans commented on HBASE-6310:
-------------------------------------------

A client would write the .META. location to ROOT? Unless you use the shell to do it yourself, I can't see that happening.

> -ROOT- corruption when .META. is using the old encoding scheme
> --------------------------------------------------------------
>
>                 Key: HBASE-6310
>                 URL: https://issues.apache.org/jira/browse/HBASE-6310
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.94.0
>            Reporter: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.96.0, 0.94.1
>
>
> We're still working on the root cause here, but after the leap second
> armageddon we had a hard time getting our 0.94 cluster back up. This is what
> we saw in the logs until the master died by itself:
> {noformat}
> 2012-07-01 23:01:52,149 DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> locateRegionInMeta parentTable=-ROOT-,
> metaLocation={region=-ROOT-,,0.70236052, hostname=sfor3s28,
> port=10304}, attempt=16 of 100 failed; retrying after sleep of 32000
> because: HRegionInfo was null or empty in -ROOT-,
> row=keyvalues={.META.,,1259448304806/info:server/1341124914705/Put/vlen=14/ts=0,
> .META.,,1259448304806/info:serverstartcode/1341124914705/Put/vlen=8/ts=0}
> {noformat}
> (it's strange that we retry this)
> This was really misleading because I could see the regioninfo in a scan:
> {noformat}
> hbase(main):002:0> scan '-ROOT-'
> ROW                     COLUMN+CELL
>  .META.,,1              column=info:regioninfo, timestamp=1331755381142, value={NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
>  .META.,,1              column=info:server, timestamp=1341183448693, value=sfor3s40:10304
>  .META.,,1              column=info:serverstartcode, timestamp=1341183448693, value=1341183444689
>  .META.,,1              column=info:v, timestamp=1331755419291, value=\x00\x00
>  .META.,,1259448304806  column=info:server, timestamp=1341124914705, value=sfor3s24:10304
>  .META.,,1259448304806  column=info:serverstartcode, timestamp=1341124914705, value=1341124455863
> {noformat}
> Except that the devil is in the details: ".META.,,1" is not
> ".META.,,1259448304806". Basically something writes to .META. by directly
> creating the row key without caring if the row is in the old format. I did a
> deleteall in the shell and it fixed the issue... until some time later it was
> stuck again because the edits reappeared (still not sure why). This time the
> PostOpenDeployTasksThread threads were stuck in the RS trying to update .META.,
> but there was no logging (saw it with a jstack). I deleted the row again to
> make it work.
> I'm marking this as a blocker against 0.94.2 since we're trying to get 0.94.1
> out, but I wouldn't recommend upgrading to 0.94 if your cluster was created
> before 0.89
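
For readers following along, the mismatch described in the quoted report can be illustrated with a small toy model. This is only a sketch in plain Python, not HBase client code: the `root_rows` dictionary copies the row keys and column names from the scan quoted above, and the helper names `split_region_name` and `rows_missing_regioninfo` are hypothetical, introduced here for illustration. It assumes a row key of the form `table,startkey,regionid`, as the two keys in the scan suggest.

```python
# Toy model of the -ROOT- scan quoted above: row key -> columns present.
# This is plain Python for illustration; it does not talk to HBase.
root_rows = {
    ".META.,,1": {
        "info:regioninfo",
        "info:server",
        "info:serverstartcode",
        "info:v",
    },
    ".META.,,1259448304806": {
        "info:server",
        "info:serverstartcode",
    },
}


def split_region_name(row_key):
    """Split a 'table,startkey,regionid' row key into its three parts.

    Hypothetical helper; assumes the comma-separated layout seen in the scan.
    """
    table, start_key, region_id = row_key.split(",", 2)
    return table, start_key, region_id


def rows_missing_regioninfo(rows):
    """Return the row keys that carry no info:regioninfo cell."""
    return sorted(key for key, cols in rows.items() if "info:regioninfo" not in cols)


if __name__ == "__main__":
    for key in root_rows:
        print(key, "->", split_region_name(key))
    print("rows without info:regioninfo:", rows_missing_regioninfo(root_rows))
```

Both keys name the same table and start key and differ only in the region id, so they are distinct rows. Only the second one lacks `info:regioninfo`, which matches the "HRegionInfo was null or empty" message in the log.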