Ted,

This is the second or third time I have seen this. I think it should apply fairly cleanly. What do you think?

On Sat, Mar 2, 2013 at 10:47 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> HBASE-5837 is only in 0.95 and later.
>
> Do you want HBASE-5837 to be backported?
>
> Thanks
>
> On Thu, Feb 7, 2013 at 3:17 PM, Brandon Peskin <bpes...@adobe.com> wrote:
>
> > Thanks Kevin.
> >
> > Before I tried your advice, I tried this:
> >
> > scan '.META.', { FILTER =>
> > org.apache.hadoop.hbase.filter.SingleColumnValueFilter.new(
> > org.apache.hadoop.hbase.util.Bytes.toBytes('info'),
> > org.apache.hadoop.hbase.util.Bytes.toBytes('regioninfo'),
> > org.apache.hadoop.hbase.filter.CompareFilter::CompareOp.valueOf('NOT_EQUAL'),
> > org.apache.hadoop.hbase.filter.SubstringComparator.new('algol'))}
> >
> > deleteall '.META.', '<row_key>'
> >
> > The problem is at some point I fat-fingered a row key and believe I hit
> > HBASE-5837
> >
> > https://issues.apache.org/jira/browse/HBASE-5837
> >
> > I'm getting java.io.IOException: java.io.IOException:
> > java.lang.IllegalArgumentException: No 44 in
> > <2e40c841-af5b-4a5e-be0f-e06a953f05cc,1359958540596>, length=13, offset=37
> >
> > Caused my master to die, can't restart it.
> >
> > Is there any way around this or have I completely hosed my hbase
> > installation?
> >
> > On Jan 31, 2013, at 6:23 AM, Kevin O'dell <kevin.od...@cloudera.com> wrote:
> >
> > > I am going to disagree with ignoring the error. You will encounter
> > > failures when doing other operations such as import/exports. The first
> > > thing I would do is, like JM said, focus on the region that is not in
> > > META (we at least want 0 inconsistencies). Can you please run hbck
> > > -repair and then run another -details and let us know if you are still
> > > seeing errors? After that, if you are still getting the NULL errors for
> > > hregion:info in META, can you please run
> > > echo "scan '.META.'" | hbase shell > meta.out
> > > and attach the meta.out file? I would like to take a look at some of
> > > these.
> > >
> > > To be able to run the -repair we will want to use a different jar and
> > > follow these instructions:
> > >
> > > 1. Move the new uber jar on to the system.
> > >    hbase-0.90.4-cdh3u3-patch30+3.jar
> > > 2. Copy the hbase dir (/usr/lib/hbase) into /tmp/hbase dir.
> > > 3. In the tmp copy, rename the hbase jar (hbase-0.90.4-cdh3u2.jar) to
> > >    .old and replace it with the uber-hbck jar
> > >    (hbase-0.90.4-cdh3u3-patch30+3.jar).
> > > 4. Break the symlinks in that directory.
> > > 5. Add the value of fs.default.name from the core-site.xml to the
> > >    hbase-site.xml.
> > > 6. export HBASE_HOME=/tmp/hbase/ and run
> > >    ./bin/hbase hbck -details 2>&1 | tee details.out
> > > 7. Check the details.out and make sure you are still seeing
> > >    inconsistencies.
> > > 8. ./bin/hbase hbck -repair 2>&1 | tee repair.out
> > > 9. Run -details again and make sure we have 0 inconsistencies.
> > >
> > > https://www.dropbox.com/s/fxotosglrrl1tq2/hbase-0.90.4-cdh3u3-patch30%2B3.jar
> > > <--- new jar
> > >
> > > On Thu, Jan 31, 2013 at 6:48 AM, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
> > >>
> > >> Hi Brandon,
> > >>
> > >> I faced the same issue for "HRegionInfo was null or empty" on January
> > >> 24th and Ted replied:
> > >>
> > >> "Encountered problems when prefetch META table:
> > >>
> > >> You can ignore the warning."
> > >>
> > >> So I think you should focus on the last one "not listed in META or
> > >> deployed on any region server".
> > >>
> > >> Have you tried hbck to see if it can fix it?
> > >>
> > >> JM
> > >>
> > >> 2013/1/31, Brandon Peskin <bpes...@adobe.com>:
> > >>> hadoop 0.20.2-cdh3u2
> > >>> hbase 0.90.4-cdh3u2
> > >>>
> > >>> On January 8th I had a network event where I lost three region servers.
> > >>>
> > >>> When they came back I had unassigned regions/regions not being served
> > >>> errors which I fixed with the hbck -fix
> > >>>
> > >>> Since then, however I have been getting an increasing number of these
> > >>> when I have clients trying to write to specific tables:
> > >>>
> > >>> java.io.IOException: HRegionInfo was null or empty in Meta for
> > >>> algol_profile_training_record,
> > >>> row=algol_profile_training_record,clientcode:49128:abce6d9f-1ee2-434a-8a82-a151b7dc183f,99999999999999
> > >>> at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:142) ~[hbase-0.90.4-cdh3u2.jar:na]
> > >>> at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95) ~[hbase-0.90.4-cdh3u2.jar:na]
> > >>> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:649) [hbase-0.90.4-cdh3u2.jar:na]
> > >>>
> > >>> ....to the point now where I seemingly can't even write to that table.
> > >>>
> > >>> This also coincides with the following, seen in the hbase master log:
> > >>>
> > >>> 13/01/31 02:49:55 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
> > >>> keyvalues={algol_tmp_client_39691_20121101053400_6,,1351783324210.b703b203d6189cad224764853cbd88f7./info:server/1351785275123/Put/vlen=32,
> > >>> algol_tmp_clientcode_39691_20121101053400_6,,1351783324210.b703b203d6189cad224764853cbd88f7./info:serverstartcode/1351785275123/Put/vlen=8}
> > >>> 13/01/31 02:49:55 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
> > >>> keyvalues={algol_tmp_client_39691_20121101053400_6,\x7F\xD6\xA5Pm\xF2E\x9D\x81\xB1b\xD1'\xD3\xE5\xA4,1351783324210.507bdf342aaae45644d597c272c52e02./info:server/1351785275128/Put/vlen=32,
> > >>> algol_tmp_clientcode_39691_20121101053400_6,\x7F\xD6\xA5Pm\xF2E\x9D\x81\xB1b\xD1'\xD3\xE5\xA4,1351783324210.507bdf342aaae45644d597c272c52e02./info:serverstartcode/1351785275128/Put/vlen=8}
> > >>> 13/01/31 02:49:55 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
> > >>> keyvalues={algol_tmp_client_34785_20121004102759_0,,1349383219155.a45774fa4c8af765df87d3373d30cfdc./info:server/1349461385375/Put/vlen=32,
> > >>> algol_tmp_overstock_34785_20121004102759_0,,1349383219155.a45774fa4c8af765df87d3373d30cfdc./info:serverstartcode/1349461385375/Put/vlen=8}
> > >>>
> > >>> ...and there's 58 of these I cannot fix (presumably I could with the
> > >>> hbck provided with CDH4):
> > >>>
> > >>> ERROR: Region
> > >>> hdfs://namenode:9000/hbase/algol_profile_training_record/fdaa1024d1b725b4997d2283640f0fa4
> > >>> on HDFS, but not listed in META or deployed on any region server
> > >>>
> > >>> I'm absolutely stumped.
> > >>> I've done some poking around and I can't find any sort of data
> > >>> surrounding this issue with the exception of similar symptoms in an
> > >>> exchange on this list of March this year (though the inquirer had
> > >>> different questions). Any help would be appreciated, though I suspect
> > >>> I will be told 'Upgrade to CDH4' or 'Drop and re-create the table'.
> > >>>
> > >>> Thanks in advance.
> > >>>
> > >>> --
> > >>> Brandon Peskin
> > >>> Senior Systems Administrator
> > >>> Adobe Systems
> > >>> bpes...@adobe.com
> > >
> > > --
> > > Kevin O'Dell
> > > Customer Operations Engineer, Cloudera

--
Kevin O'Dell
Customer Operations Engineer, Cloudera
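
A note for anyone who finds this thread with the same trace: the 44 in "No 44 in <...>" is the ASCII code for a comma, and the .META. row keys shown in this thread have the shape <table>,<start key>,<region id>. The key in the quoted error contains only one comma, which is consistent with the parser failing to find a second delimiter. The sketch below is a hypothetical Python helper, not part of HBase or its shell, for sanity-checking a row key before pasting it into deleteall. The function name and the two-comma rule are assumptions drawn from the keys quoted above, not from the HBase source.

```python
# Hypothetical helper (not an HBase API): check that a .META. row key has the
# two comma delimiters seen in the keys quoted in this thread.
DELIMITER = b","  # ASCII 44, the value named in "No 44 in <...>"


def split_meta_row_key(row_key: bytes):
    """Split b'<table>,<start key>,<region id>' into its three parts.

    The start key may be empty or contain arbitrary bytes (including commas),
    so the table name ends at the first comma and the region id begins after
    the last one. Returns None when fewer than two delimiters are present.
    """
    first = row_key.find(DELIMITER)
    last = row_key.rfind(DELIMITER)
    if first == -1 or first == last:
        return None
    return row_key[:first], row_key[first + 1:last], row_key[last + 1:]


if __name__ == "__main__":
    # Well-formed key taken from the CatalogJanitor warnings in this thread.
    good = b"algol_tmp_client_39691_20121101053400_6,,1351783324210.b703b203d6189cad224764853cbd88f7."
    # Key from the IllegalArgumentException in this thread (one comma only).
    bad = b"2e40c841-af5b-4a5e-be0f-e06a953f05cc,1359958540596"
    print(split_meta_row_key(good))
    print(split_meta_row_key(bad))
```

A None result means the key does not look like a region row and is worth double-checking before it goes anywhere near .META.; a non-None result only says the shape is plausible, not that the row exists.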