Mike, I'm glad it worked out for you! I'm curious too, since this shouldn't be happening. I'd love to take a look at your master's log from the day of the failure. You could put it on a web server or try attaching it to a reply (though attachments usually get filtered).
J-D

On Thu, Dec 3, 2009 at 1:23 PM, mike anderson <[email protected]> wrote:
> wow! Thanks for all your help. I just took the add_table.rb script for a run
> and it worked flawlessly. Kudos to the community!
>
> I'm still curious as to what might have happened? Was the .META. table just
> slightly out of whack?
>
> -mike
>
> On Thu, Dec 3, 2009 at 3:36 PM, mike anderson <[email protected]> wrote:
>
>> This was a table that had been around for almost two months now and had
>> many regions. The web UI reports 231 regions, and I am certain that the
>> tables being reported don't have nearly that many regions, so perhaps this
>> count includes those from the missing table.
>>
>> In the folder: /hbase/cached_web_pages/1102708773/http is a single 130MB
>> file full of rows/columns. We are caching the full html of websites into the
>> columns so copying and pasting some of the rows won't be very useful, but
>> the chunk starts with this:
>>
>> "DATABLK*f
>> #ŸRhttp%3A%2F%2Fwww.informaworld.com%2Fsmpp%2Ftitle%7Edb%3Dall%7Econtent%3Dg903750466
>> httpdata $í ó "
>>
>> I tried to enable a region, but get:
>>
>> from (hbase):3
>> hbase(main):003:0> enable_region 'cached_web_pages,metapress_ris_120417,1257429337740'
>> NativeException: java.lang.NullPointerException: null
>> from org/apache/hadoop/hbase/util/Writables.java:74:in `getWritable'
>> from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
>> from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
>> from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
>> from java/lang/reflect/Method.java:597:in `invoke'
>> from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
>> from org/jruby/javasupport/JavaMethod.java:278:in `invoke_static'
>> from org/jruby/java/invokers/StaticMethodInvoker.java:57:in `call'
>> from org/jruby/runtime/callsite/CachingCallSite.java:150:in `call'
>> from org/jruby/ast/CallTwoArgNode.java:59:in `interpret'
>> from org/jruby/ast/LocalAsgnNode.java:123:in `interpret'
>> from org/jruby/ast/NewlineNode.java:104:in `interpret'
>> from org/jruby/ast/BlockNode.java:71:in `interpret'
>> from org/jruby/internal/runtime/methods/InterpretedMethod.java:201:in `call'
>> from org/jruby/internal/runtime/methods/DefaultMethod.java:162:in `call'
>> from org/jruby/runtime/callsite/CachingCallSite.java:150:in `call'
>> ... 112 levels...
>> from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
>> from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
>> from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
>> from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>> from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
>> from usr/local/hbase/bin/$_dot_dot_/bin/hirb.rb:487:in `__file__'
>> from usr/local/hbase/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
>> from org/jruby/Ruby.java:577:in `runScript'
>> from org/jruby/Ruby.java:480:in `runNormally'
>> from org/jruby/Ruby.java:354:in `runFromMain'
>> from org/jruby/Main.java:229:in `run'
>> from org/jruby/Main.java:110:in `run'
>> from org/jruby/Main.java:94:in `main'
>> from /usr/local/hbase/bin/../bin/HBase.rb:138:in `enable_region'
>> from /usr/local/hbase/bin/../bin/hirb.rb:350:in `enable_region'
>> from (hbase):4
>> hbase(main):004:0>
>>
>> Thanks again.
>>
>> -mike
>>
>> On Thu, Dec 3, 2009 at 3:21 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>
>>> What's in the HDFS folder of that table? Here I see that you should
>>> have something like:
>>>
>>> /hbase/cached_web_pages/1325672518/http/ stuff...
>>>
>>> Was there only this one region?
>>>
>>> Also are you able to enable a region in the shell? Take one of the row
>>> key from .META. and do
>>>
>>> > enable_region 'region name'
>>>
>>> J-D
>>>
>>> On Thu, Dec 3, 2009 at 12:11 PM, mike anderson <[email protected]> wrote:
>>> > Here's a snippit from the meta table (I can send you the whole thing, but
>>> > it's quite large),
>>> >
>>> > cached_web_pages,http%3A%2F column=info:serverstartcode,
>>> > timestamp=1259853027975, value=1259852967063
>>> > %2Fdx.doi.org%2F10.1002%252
>>> >
>>> > Fajpa.21214,1259739437144
>>> >
>>> > cached_web_pages,http%3A%2F column=historian:assignment,
>>> > timestamp=1259807436758, value=Region assigned to se
>>> > %2Fdx.doi.org%2F10.1002%252 rver
>>> > ghetto169.projectlounge.com,60020,1256139356112
>>> >
>>> > Fejoc.200900768,12555040994
>>> >
>>> > 35
>>> >
>>> > cached_web_pages,http%3A%2F column=historian:open, timestamp=1259807436723,
>>> > value=Region opened on server : g
>>> > %2Fdx.doi.org%2F10.1002%252 hetto169.projectlounge.com
>>> >
>>> > Fejoc.200900768,12555040994
>>> >
>>> > 35
>>> >
>>> > cached_web_pages,http%3A%2F column=historian:assignment,
>>> > timestamp=1259853024917, value=Region assigned to se
>>> > %2Fdx.doi.org%2F10.1002%252 rver
>>> > ghetto167.projectlounge.com,60020,1259852967063
>>> >
>>> > Fsmi.1285,1258589376676
>>> >
>>> > cached_web_pages,http%3A%2F column=historian:open, timestamp=1259853027984,
>>> > value=Region opened on server : g
>>> > %2Fdx.doi.org%2F10.1002%252 hetto167.projectlounge.com
>>> >
>>> > Fsmi.1285,1258589376676
>>> >
>>> > cached_web_pages,http%3A%2F column=info:regioninfo,
>>> > timestamp=1258589203875, value=REGION => {NAME => 'cached
>>> > %2Fdx.doi.org%2F10.1002%252 _web_pages,http\\x253A\\x252F\\ x252Fdx.doi.org
>>> > \\x252F10.1002\\x25252Fsmi.1285,125
>>> > Fsmi.1285,1258589376676 8589376676', STARTKEY => 'http\\x253A\\x252F\\
>>> > x252Fdx.doi.org\\x252F10.1002\\x252
>>> > 52Fsmi.1285', ENDKEY => 'http\\x253A\\x252F\\
>>> > x252Fdx.doi.org\\x252F10.1016\\x252F
>>> > j.apergo.2009.09.005', ENCODED => 1325672518,
>>> > TABLE => {{NAME => 'cached_web_page
>>> > s', FAMILIES => [{NAME => 'http', VERSIONS =>
>>> > '1', COMPRESSION => 'NONE', TTL =>
>>> > '2147483647', BLOCKSIZE => '65536', IN_MEMORY
>>> > => 'false', BLOCKCACHE => 'true'}]}
>>> > }
>>> >
>>> >
>>> > and you can see the table which has gone missing 'cached_web_pages' in the
>>> > key spot. The crash over the weekend was pretty traumatic. Complete power
>>> > outage to the entire cluster except(!) for the master. The data is
>>> > definitely still on HDFS, I will take a look at the add_table script and
>>> > upgrade to 0.20.2.
>>> >
>>> >
>>> > Cheers and thanks a lot.
>>> >
>>> > mike
>>> >
>>> >
>>> > On Thu, Dec 3, 2009 at 2:51 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>> >
>>> >> This is weird if the table is in .META. and still not showing up...
>>> >> could you pastebin the .META. rows?
>>> >>
>>> >> Also was it a new table that was just created or has it been there for
>>> >> some time?
>>> >>
>>> >> What kind of crash did you get this weekend?
>>> >>
>>> >> The best way to recover your data, if it's still on HDFS, will be to
>>> >> upgrade to 0.20.2 and use the script bin/add_table.rb to rebuild
>>> >> .META.
>>> >>
>>> >> J-D
>>> >>
>>> >> On Thu, Dec 3, 2009 at 11:29 AM, mike anderson <[email protected]> wrote:
>>> >> > From the web UI and from calling 'list' in the shell I can't see the table
>>> >> > name.
>>> >> >
>>> >> > Hadoop/Hbase 0.20/0.20.1, distributed setup, 10 nodes.
>>> >> >
>>> >> > -mike
>>> >> >
>>> >> > On Thu, Dec 3, 2009 at 1:54 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>> >> >
>>> >> >> Mike,
>>> >> >>
>>> >> >> So if you looked in .META. and the rows are there, how did you figure
>>> >> >> that the table is missing?
>>> >> >>
>>> >> >> Also the usuals: which version of Hadoop/HBase, what kind of setup, etc
>>> >> >>
>>> >> >> J-D
>>> >> >>
>>> >> >> On Thu, Dec 3, 2009 at 7:29 AM, mike anderson <[email protected]> wrote:
>>> >> >> > Hbase crashed on me this weekend, and upon restarting one of the tables is
>>> >> >> > just completely gone. All of the table data is still in HDFS and my missing
>>> >> >> > table is still mentioned in .META.. I tried restarting hbase a few times,
>>> >> >> > but the table didn't show up. What else can I do to debug this? I looked
>>> >> >> > through the logs, but nothing really jumped out at me. Is there something I
>>> >> >> > should look for?
>>> >> >> >
>>> >> >> > I took a look at this ticket,
>>> >> >> > http://issues.apache.org/jira/browse/HBASE-1342, but don't know enough about
>>> >> >> > the inner workings of hbase to make sense of it.
>>> >> >> >
>>> >> >> >
>>> >> >> > thanks in advance.
