It could be, tho it means it's not covered by the unit tests and all the testing done on that branch since late February didn't catch it.

J-D

On Fri, May 14, 2010 at 4:07 PM, Todd Lipcon <[email protected]> wrote:
> It appears like we might be stuck in an infinite loop here:
>
> "IPC Server handler 9 on 60020" daemon prio=10 tid=0x00002aaeb42f7800 nid=0x6508 runnable [0x00000000445bb000]
>    java.lang.Thread.State: RUNNABLE
>        at org.apache.hadoop.hbase.regionserver.ExplicitColumnTracker.checkColumn(ExplicitColumnTracker.java:128)
>        at org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:165)
>        at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:176)
>
> It's holding a lock that some other threads are blocked on. In both of your pastes, there are some threads stuck here.
>
> JD, any thoughts? Looks like you made some changes to this code for 0.20.4.
>
> -Todd
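For context on why a permanently RUNNABLE thread in checkColumn() can mean a spin rather than real work: a matcher that walks a sorted list of requested columns has to advance either its column index or the scanner position on every pass. The sketch below is a simplified, hypothetical tracker, not the actual ExplicitColumnTracker source; it only illustrates the failure mode, where one branch returns without moving anything forward and the caller keeps re-submitting the same cell.

    import java.util.List;

    // Hypothetical, simplified column tracker used only to illustrate the failure mode.
    class ToyColumnTracker {
        private final List<byte[]> wantedColumns; // sorted, as in an explicit-columns scan
        private int index = 0;

        ToyColumnTracker(List<byte[]> wantedColumns) {
            this.wantedColumns = wantedColumns;
        }

        /** Returns true if the current cell's column should be included. */
        boolean checkColumn(byte[] column) {
            while (index < wantedColumns.size()) {
                int cmp = compare(column, wantedColumns.get(index));
                if (cmp == 0) {
                    return true;   // match: include this cell
                }
                if (cmp > 0) {
                    index++;       // requested column already passed: move on to the next one
                    continue;
                }
                // cmp < 0: the cell sorts before the next wanted column. If this branch
                // gives the caller no reason to seek forward and the caller re-invokes
                // with the same cell, neither 'index' nor the scanner position changes,
                // so the scan loops forever at 100% CPU -- which is exactly what a
                // handler thread that is always RUNNABLE at the same line looks like.
                return false;
            }
            return false;
        }

        private static int compare(byte[] a, byte[] b) {
            // lexicographic byte comparison, a stand-in for HBase's Bytes.compareTo
            for (int i = 0; i < Math.min(a.length, b.length); i++) {
                int d = (a[i] & 0xff) - (b[i] & 0xff);
                if (d != 0) return d;
            }
            return a.length - b.length;
        }
    }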
> On Fri, May 14, 2010 at 6:56 AM, Friso van Vollenhoven <[email protected]> wrote:
>> Hi Todd,
>>
>> The row counting works fine. It is quite slow, I have to say. But I have never used row counting from the shell before, so I don't know what performance to expect from it or how it is implemented. It's just that a regular scan from software is way faster.
>>
>> Also, from our application we do a full table scan to populate an in-memory index of row keys, because we need to be able to quickly determine if a certain key exists or not. I triggered this scan from our UI while there were hanging reducers. This also works fine. There are close to 5 million records in the table and I checked in the web interface that the table is divided across all 4 region servers, so this process should hit them all.
>>
>> The earlier jstacks of the region servers were taken when the reducers (clients) were hanging, before the shutdown was requested. Some 3 to 5 reduce tasks hang, not all of them, but surely more than just one.
>>
>> Because of your question about what is locked up (client or region server), I SSH'ed into each of the worker machines after giving HBase the shutdown signal (so the shutdown sequence started) and tried to see if the region server was still running and if so, shutdown each individual region server manually (doing 'hbase-daemon.sh stop regionserver' on each, I'm glad there are only 4 nodes). I found that:
>> - one of the region servers actually shut down normally (worker3)
>> - two region servers shut down normally after the hbase-daemon.sh command (worker4 and worker1)
>> - one region server does not shut down (worker2)
>>
>> I put some additional info on pastebin.
>>
>> Here is the jstack of worker2 (the hanging one): http://pastebin.com/5V0UZi7N
>> There are two jstack outputs, one from before the shutdown command was given and one (starting at line 946) from after the shutdown command was given.
>>
>> Here are the logs of that region server: http://pastebin.com/qCXSKR2A
>> I set the log level for org.apache.hadoop.hbase to DEBUG before doing all this, so it's more verbose (I don't know if this helps).
>>
>> So, it appears that it is one of the region servers that is locked up, but only for some connections while it can still serve other connections normally. From the locked up region server logs, it looks like the shutdown sequence runs completely, but the server just won't die afterwards (because of running non-daemon threads; maybe it should just do a System.exit() if all cleanup is successful). At least nothing gets corrupted, which is nice.
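The "won't die because of non-daemon threads" observation above is a plain JVM rule: the process only exits once every non-daemon thread has returned, no matter how far the shutdown sequence got. A minimal, self-contained illustration (not HBase code) of both the symptom and the System.exit() escape hatch suggested above:

    public class LingeringProcess {
        public static void main(String[] args) {
            Thread worker = new Thread(() -> {
                // Simulates a leftover chore/handler thread that never returns.
                while (true) {
                    try {
                        Thread.sleep(10_000);
                    } catch (InterruptedException ie) {
                        return; // would exit cleanly if something interrupted it
                    }
                }
            }, "Thread-10-like");
            // worker.setDaemon(true);  // with this line the JVM would exit normally
            worker.start();

            System.out.println("cleanup finished");  // "shutdown sequence runs completely"
            // ...yet the JVM never exits, because a non-daemon thread is still alive.
            // Forcing the issue once cleanup has succeeded:
            // System.exit(0);
        }
    }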
>>
>> Of course I am still trying to find out why things get locked up in the first place.
>>
>> I did this test twice today. During the first run it was a different region server that was hanging, so I think it has nothing to do with a problem related to that specific machine.
>>
>> My next step is to go through code (including HBase's, so it will take me some time...) and see what exactly happens in our scenario, because from my current knowledge the jstack outputs don't mean enough to me.
>>
>> Friso
>>
>> On May 13, 2010, at 7:09 PM, Todd Lipcon wrote:
>>
>>> Hi Friso,
>>>
>>> When did you take the jstack dumps of the region servers? Was it when the reduce tasks were still hanging?
>>>
>>> Do all of the reduce tasks hang or is it just one that gets stuck?
>>>
>>> If, once the reduce tasks are hung, you open the hbase shell and run "count 'mytable', 100000" does it successfully count the rows?
>>>
>>> (I'm trying to determine if the client is locked up, or one of the RSes is locked up)
>>>
>>> Enjoy your holiday!
>>>
>>> -Todd
>>>
>>> On Thu, May 13, 2010 at 12:38 AM, Friso van Vollenhoven <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I was kind of hoping that this was a known thing and I was just overlooking something. Apparently it requires more investigation.
>>>>
>>>> I reproduced the situation and gathered some more info. I still have to look through everything myself a bit more (today is a national holiday in The Netherlands, so not much working), but I posted some info to pastebin.
>>>>
>>>> I did the following sequence (with HBase 0.20.4):
>>>> - startup HBase (waited for all the regions to come online and let it settle)
>>>> - startup our application
>>>> - wait for the importer job to hang (it only happened on the second run, which started 15 reducers; the first run was really small and only one key was generated, so just one reducer)
>>>> - kill the hanging importer job (hadoop job -kill)
>>>> - try to shutdown HBase (as I type this it is still producing dots on my console)
>>>>
>>>> The HBase master logs are here (includes shutdown attempt): http://pastebin.com/PYpPVcyK
>>>> The jstacks are here:
>>>> - HMaster: http://pastebin.com/Da6jCAuA (this includes two thread dumps, one during operation with the hanging clients and one during hanging shutdown)
>>>> - RegionServer 1: http://pastebin.com/5dQXfxCn
>>>> - RegionServer 2: http://pastebin.com/XWwBGXYC
>>>> - RegionServer 3: http://pastebin.com/mDgWbYGV
>>>> - RegionServer 4: http://pastebin.com/XDR14bth
>>>>
>>>> As you can see in the master logs, the shutdown cannot get a thread called Thread-10 to stop running. The trace for that thread looks like this:
>>>>
>>>> "Thread-10" prio=10 tid=0x000000004d218800 nid=0x1e73 in Object.wait() [0x00000000427a7000]
>>>>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>>>        at java.lang.Object.wait(Native Method)
>>>>        - waiting on <0x00002aaab364c9d0> (a java.lang.Object)
>>>>        at org.apache.hadoop.hbase.util.Sleeper.sleep(Sleeper.java:89)
>>>>        - locked <0x00002aaab364c9d0> (a java.lang.Object)
>>>>        at org.apache.hadoop.hbase.Chore.run(Chore.java:76)
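For readers not familiar with that trace: a chore is essentially a periodic loop that sleeps between runs and keeps going until a stop flag is flipped. A rough sketch of the shape, simplified and hypothetical rather than the actual Chore/Sleeper source: the thread spends most of its life parked in a timed wait, and if nothing ever sets the stop flag (or wakes it) during shutdown, this one non-daemon thread is enough to keep the master process alive.

    // Simplified sketch of a periodic "chore" thread, only to show why Thread-10 sits
    // in TIMED_WAITING inside a sleep call and why a missed stop signal leaves it
    // running after everything else has shut down.
    class ToyChore extends Thread {
        private final Object sleepLock = new Object();
        private final long periodMillis;
        private volatile boolean stopRequested = false;

        ToyChore(String name, long periodMillis) {
            super(name);
            this.periodMillis = periodMillis;
        }

        @Override
        public void run() {
            while (!stopRequested) {
                doChore();
                synchronized (sleepLock) {            // shows up as "locked <...>" in jstack
                    try {
                        sleepLock.wait(periodMillis); // TIMED_WAITING, like Sleeper.sleep()
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        }

        void requestStop() {
            stopRequested = true;
            synchronized (sleepLock) {
                sleepLock.notifyAll();                // wake the thread so it sees the flag
            }
        }

        protected void doChore() { /* periodic work would go here */ }
    }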
>>>>
>>>> I still have no clue what happened, but I will investigate a bit more tomorrow.
>>>>
>>>> Thanks for the responses.
>>>>
>>>> Friso
>>>>
>>>> On May 12, 2010, at 9:02 PM, Todd Lipcon wrote:
>>>>
>>>>> Hi Friso,
>>>>>
>>>>> Also, if you can capture a jstack of the regionservers at the time that would be great.
>>>>>
>>>>> -Todd
>>>>>
>>>>> On Wed, May 12, 2010 at 9:26 AM, Jean-Daniel Cryans <[email protected]> wrote:
>>>>>> Friso,
>>>>>>
>>>>>> Unfortunately it's hard to determine the cause with the provided information; the client call you pasted is pretty much normal, i.e. the client is waiting to receive a result from a region server.
>>>>>>
>>>>>> The fact that you can't shut down the master when this happens is very concerning. Do you still have those logs around? Same for the region servers? Can you post this in pastebin or on a web server?
>>>>>>
>>>>>> Also, feel free to come chat with us on IRC, it's always easier to debug when live. #hbase on freenode
>>>>>>
>>>>>> J-D
>>>>>>
>>>>>> On Wed, May 12, 2010 at 8:31 AM, Friso van Vollenhoven <[email protected]> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I am using Hadoop (0.20.2) and HBase to periodically import data (every 15 minutes). There are a number of import processes, but generally they all create a sequence file on HDFS, which is then run through a MapReduce job. The MapReduce uses the identity mapper (the input file is a Hadoop sequence file) and a specialized reducer that does the following:
>>>>>>> - Combine the values for a key into one value
>>>>>>> - Do a Get from HBase to retrieve existing values for the same key
>>>>>>> - Combine the existing value from HBase and the new one into one value again
>>>>>>> - Put the final value into HBase under the same key (thus 'overwrite' the existing row; I keep only one version)
>>>>>>>
>>>>>>> After I upgraded HBase to the 0.20.4 release, the reducers sometimes start hanging on a Get. When the jobs start, some reducers run to completion fine, but after a while the last reducers will start to hang. Eventually the reducers are killed off by Hadoop (after 600 secs).
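The read-merge-write pattern described above comes down to one Get and one Put against the same row per key. A minimal sketch of that pattern using the HBase client classes involved (HTable, Get, Put, Result); this is not the actual StoreInsertReducer, the family and qualifier are placeholders, and the merge step is plain byte concatenation standing in for the real combine logic:

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;

    // Sketch of the per-key read-merge-write step described in the email above.
    class ReadMergeWrite {

        /** Merge the already-stored value (if any) with the new one and write it back. */
        static void upsert(HTable table, byte[] row, byte[] family, byte[] qualifier,
                           byte[] newValue) throws IOException {
            // 1. Get the existing value for this key, if there is one.
            Result existing = table.get(new Get(row));
            byte[] oldValue = existing.getValue(family, qualifier);

            // 2. Combine old and new (placeholder merge: concatenation).
            byte[] merged;
            if (oldValue == null) {
                merged = newValue;
            } else {
                merged = new byte[oldValue.length + newValue.length];
                System.arraycopy(oldValue, 0, merged, 0, oldValue.length);
                System.arraycopy(newValue, 0, merged, oldValue.length, newValue.length);
            }

            // 3. Put it back under the same row key, overwriting the single kept version.
            Put put = new Put(row);
            put.add(family, qualifier, merged);
            table.put(put);
        }
    }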
>>>>>>>
>>>>>>> I did a thread dump for one of the hanging reducers. It looks like this:
>>>>>>>
>>>>>>> "main" prio=10 tid=0x0000000048083800 nid=0x4c93 in Object.wait() [0x00000000420ca000]
>>>>>>>    java.lang.Thread.State: WAITING (on object monitor)
>>>>>>>        at java.lang.Object.wait(Native Method)
>>>>>>>        - waiting on <0x00002aaaaeb50d70> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>>>>>>>        at java.lang.Object.wait(Object.java:485)
>>>>>>>        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:721)
>>>>>>>        - locked <0x00002aaaaeb50d70> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>>>>>>>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:333)
>>>>>>>        at $Proxy2.get(Unknown Source)
>>>>>>>        at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:450)
>>>>>>>        at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:448)
>>>>>>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1050)
>>>>>>>        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:447)
>>>>>>>        at net.ripe.inrdb.hbase.accessor.real.HBaseTableAccessor.get(HBaseTableAccessor.java:36)
>>>>>>>        at net.ripe.inrdb.hbase.store.HBaseStoreUpdater.getExistingRecords(HBaseStoreUpdater.java:101)
>>>>>>>        at net.ripe.inrdb.hbase.store.HBaseStoreUpdater.mergeTimelinesWithExistingRecords(HBaseStoreUpdater.java:60)
>>>>>>>        at net.ripe.inrdb.hbase.store.HBaseStoreUpdater.doInsert(HBaseStoreUpdater.java:40)
>>>>>>>        at net.ripe.inrdb.core.store.SinglePartitionStore$Updater.insert(SinglePartitionStore.java:92)
>>>>>>>        at net.ripe.inrdb.core.store.CompositeStore$CompositeStoreUpdater.insert(CompositeStore.java:142)
>>>>>>>        at net.ripe.inrdb.importer.StoreInsertReducer.reduce(StoreInsertReducer.java:70)
>>>>>>>        at net.ripe.inrdb.importer.StoreInsertReducer.reduce(StoreInsertReducer.java:17)
>>>>>>>        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
>>>>>>>        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
>>>>>>>        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>>>>>>        at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>>>>>
>>>>>>> So the client hangs in a wait() call, waiting on a HBaseClient$Call object. I looked at the code. The wait is in a while() loop and has no timeout, so it figures that it never gets out of there if no notify() gets called on the object. I am not sure for exactly what condition it is waiting, however.
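The reading above is the classic unbounded-wait pattern: the caller parks on the call object until a connection thread marks it done and notifies, and if that notification never comes (for example because the server never answers and nothing fails the call), the waiter stays there forever. Below is a stripped-down sketch of the pattern with a hypothetical Call class, not HBaseClient's actual inner class, together with the kind of deadline that would turn a silent hang into an error:

    // Stripped-down version of the waiter pattern seen in the reducer's stack trace.
    class Call {
        private Object value;
        private boolean done;

        /** Called by the connection thread when a response (or error) arrives. */
        synchronized void complete(Object result) {
            this.value = result;
            this.done = true;
            notifyAll();
        }

        /** Unbounded wait: blocks forever if complete() is never called. */
        synchronized Object awaitForever() throws InterruptedException {
            while (!done) {
                wait();              // no timeout; only notify()/notifyAll() can release it
            }
            return value;
        }

        /** Bounded alternative: same loop, but gives up after a deadline. */
        synchronized Object await(long timeoutMillis) throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (!done) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    throw new IllegalStateException("no response within " + timeoutMillis + " ms");
                }
                wait(remaining);     // timed wait, so a lost notification cannot hang forever
            }
            return value;
        }
    }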
>>>>>>>
>>>>>>> Meanwhile, after this has happened, I cannot shutdown the master server normally. I have to kill -9 it, to make it shut down. Normally and before this problem occurs, the master server shuts down just fine. (Sorry, didn't do a thread dump of the master and now I downgraded to 0.20.3 again.)
>>>>>>>
>>>>>>> I cannot reproduce this error on my local setup (developer machine). It only occurs on our (currently modest) cluster of one machine running master+NN+Zookeeper and four datanodes which are all task trackers and region servers as well. The inputs to the periodic MapReduce jobs are very small (ranging from some Kb to several Mb) and thus contain not so many records. I know this is not very efficient to do in MapReduce and will be faster when inserted in process by the importer process because of startup overhead, but we are setting up this architecture of importers and insertion for anticipated larger loads (up to 80 million records per day).
>>>>>>>
>>>>>>> Does anyone have a clue about what happens? Or where to look for further investigation?
>>>>>>>
>>>>>>> Thanks a lot!
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Friso
>>>>>>>
>>>>>
>>>>> --
>>>>> Todd Lipcon
>>>>> Software Engineer, Cloudera
>>>
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
