Re: HBase client hangs after upgrade to 0.20.4 when used from reducer
Hi,

I was kind of hoping that this was a known thing and I was just overlooking something. Apparently it requires more investigation. I reproduced the situation and gathered some more info. I still have to look through everything myself a bit more (today is a national holiday in The Netherlands, so not much working), but I posted some info to pastebin.

I did the following sequence (with HBase 0.20.4):
- startup HBase (waited for all the regions to come online and let it settle)
- startup our application
- wait for the importer job to hang (it only happened on the second run, which started 15 reducers; the first run was really small and only one key was generated, so just one reducer)
- kill the hanging importer job (hadoop job -kill)
- try to shutdown HBase (as I type this it is still producing dots on my console)

The HBase master logs are here (includes shutdown attempt): http://pastebin.com/PYpPVcyK

The jstacks are here:
- HMaster: http://pastebin.com/Da6jCAuA (this includes two thread dumps, one during operation with the hanging clients and one during hanging shutdown)
- RegionServer 1: http://pastebin.com/5dQXfxCn
- RegionServer 2: http://pastebin.com/XWwBGXYC
- RegionServer 3: http://pastebin.com/mDgWbYGV
- RegionServer 4: http://pastebin.com/XDR14bth

As you can see in the master logs, the shutdown cannot get a thread called Thread-10 to stop running. The trace for that thread looks like this:

"Thread-10" prio=10 tid=0x4d218800 nid=0x1e73 in Object.wait() [0x427a7000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x2aaab364c9d0> (a java.lang.Object)
        at org.apache.hadoop.hbase.util.Sleeper.sleep(Sleeper.java:89)
        - locked <0x2aaab364c9d0> (a java.lang.Object)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:76)

I still have no clue what happened, but I will investigate a bit more tomorrow. Thanks for the responses.
Friso

On May 12, 2010, at 9:02 PM, Todd Lipcon wrote:

Hi Friso,
Also, if you can capture a jstack of the regionservers at the time that would be great.
-Todd

On Wed, May 12, 2010 at 9:26 AM, Jean-Daniel Cryans jdcry...@apache.org wrote:

Friso,
Unfortunately it's hard to determine the cause with the provided information; the client call you pasted is pretty much normal, i.e. the client is waiting to receive a result from a region server. The fact that you can't shut down the master when this happens is very concerning. Do you still have those logs around? Same for the region servers? Can you post this in pastebin or on a web server?
Also, feel free to come chat with us on IRC, it's always easier to debug when live. #hbase on freenode
J-D

On Wed, May 12, 2010 at 8:31 AM, Friso van Vollenhoven fvanvollenho...@xebia.com wrote:

Hi all,
I am using Hadoop (0.20.2) and HBase to periodically import data (every 15 minutes). There are a number of import processes, but generally they all create a sequence file on HDFS, which is then run through a MapReduce job. The MapReduce job uses the identity mapper (the input file is a Hadoop sequence file) and a specialized reducer that does the following:
- Combine the values for a key into one value
- Do a Get from HBase to retrieve existing values for the same key
- Combine the existing value from HBase and the new one into one value again
- Put the final value into HBase under the same key (thus 'overwrite' the existing row; I keep only one version)

After I upgraded HBase to the 0.20.4 release, the reducers sometimes start hanging on a Get. When the jobs start, some reducers run to completion fine, but after a while the last reducers will start to hang. Eventually the reducers are killed off by Hadoop (after 600 secs). I did a thread dump for one of the hanging reducers.
It looks like this:

"main" prio=10 tid=0x48083800 nid=0x4c93 in Object.wait() [0x420ca000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x2eb50d70> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:721)
        - locked <0x2eb50d70> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:333)
        at $Proxy2.get(Unknown Source)
        at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:450)
        at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:448)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1050)
        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:447)
        at net.ripe.inrdb.hbase.accessor.real.HBaseTableAccessor.get(HBaseTableAccessor.java:36)
        at
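For reference, the read-modify-write cycle the reducer performs (combine, Get, merge, Put) can be sketched as below. This is a plain-Python model with a dict standing in for the HBase table; the real job would issue an HTable Get and Put, and the combine function here is an illustrative placeholder, not Friso's actual merge logic.

```python
# Model of the reducer's read-modify-write cycle: combine the new values
# for a key, merge with the existing stored value, and overwrite the row.
# The dict 'table' stands in for the HBase table (single version kept).

def combine(values):
    # Placeholder merge: union of value sets; the real merge is app-specific.
    merged = set()
    for v in values:
        merged |= v
    return merged

def reduce_key(table, key, new_values):
    combined = combine(new_values)                # combine the values for a key
    existing = table.get(key)                     # Get existing value for the key
    if existing is not None:
        combined = combine([existing, combined])  # merge old and new
    table[key] = combined                         # Put back under the same key
    return combined

table = {"k1": {"a"}}
reduce_key(table, "k1", [{"b"}, {"c"}])
print(sorted(table["k1"]))  # ['a', 'b', 'c']
```

The hang reported above occurs inside the Get step of this cycle, while the client blocks waiting for the region server's reply.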
HBase has entered Debian (unstable)
Hi,

HBase 0.20.4 has entered Debian unstable, should slide into testing after the usual 14 day period, and will therefore most likely be included in the upcoming Debian Squeeze. http://packages.debian.org/source/sid/hbase

Please note that this packaging effort is still very much work-in-progress and not yet suitable for production use. However, the aim is to have a rock-solid stable HBase in squeeze+1, respectively in Debian testing, in the next months. Meanwhile the HBase package in Debian can raise HBase's visibility and lower the entrance barrier. So if somebody wants to try out HBase (on Debian), it is as easy as:

aptitude install zookeeperd hbase-masterd

In other news: zookeeper is in Debian testing as of today.

Best regards,
Thomas Koch, http://www.koch.ro
GSoC 2010: ZooKeeper Monitoring Recipes and Web-based Administrative Interface
Hi all,

My name is Andrei Savu and I am one of the GSoC 2010 accepted students. My mentor is Patrick Hunt. My objective in the next 4 months is to write tools and recipes for monitoring ZooKeeper and to implement a web-based administrative interface. I have created a wiki page for this project:
- http://wiki.apache.org/hadoop/ZooKeeper/GSoCMonitoringAndWebInterface

Are there any HBase / Hadoop specific ZooKeeper monitoring requirements?

Regards
--
Savu Andrei
Website: http://www.andreisavu.ro/
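One monitoring hook ZooKeeper already exposes is its four-letter-word admin commands (ruok, stat, srvr, ...) served on the client port, which recipes like these typically build on. A minimal poller might look like the sketch below; the helper names and the abridged stat sample are my own illustrations, not part of Andrei's proposal.

```python
import socket

def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a ZooKeeper four-letter command (e.g. 'ruok', 'stat')
    and return the raw response text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", "replace")

def parse_stat(text):
    """Pull 'key: value' lines out of a 'stat'/'srvr' response into a dict."""
    info = {}
    for line in text.splitlines():
        if ": " in line:
            key, _, value = line.partition(": ")
            info[key.strip()] = value.strip()
    return info

# Abridged example of a 'stat' response, for illustration only:
sample = "Latency min/avg/max: 0/1/23\nReceived: 104\nSent: 103\nMode: standalone\nNode count: 4\n"
print(parse_stat(sample)["Mode"])  # standalone
```

A 'ruok' probe (expecting 'imok' back) plus periodic 'stat' scraping like this is roughly what zktop-style tools do under the hood.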
Re: Enabling IHbase
Hi Renato,

IHBASE is currently broken. I expect to have it fixed tomorrow or the day after. When it's fixed, I'll publish a release under http://github.com/ykulbak/ihbase and add a wiki page explaining how to get started. I'll also send a note to the mailing list. Please feel free to contact me regarding issues with IHBASE.

Yoram

On Thu, May 13, 2010 at 2:25 AM, Stack st...@duboce.net wrote:

You saw this package doc over in ihbase's new home on github?
http://github.com/ykulbak/ihbase/blob/master/src/main/java/org/apache/hadoop/hbase/client/idx/package.html
It'll read better if you build the javadoc. There is also this:
http://github.com/ykulbak/ihbase/blob/master/README
St.Ack

On Wed, May 12, 2010 at 8:27 AM, Renato Marroquín Mogrovejo renatoj.marroq...@gmail.com wrote:

Hi Alex,
Thanks for your help, but I meant something more like a how-to on setting it up, or like a tutorial (= I also read these ones, if anyone else is interested:
http://blog.sematext.com/2010/03/31/hbase-digest-march-2010/
http://search-hadoop.com/m/5MBst1uL87b1
Renato M.

2010/5/12 alex kamil alex.ka...@gmail.com:

regarding usage this may be helpful https://issues.apache.org/jira/browse/HBASE-2167

On Wed, May 12, 2010 at 10:48 AM, alex kamil alex.ka...@gmail.com wrote:

Renato, just noticed you are looking for *Indexed* HBase. I found this http://blog.reactive.org/2010/03/indexed-hbase-it-might-not-be-what-you.html
Alex

On Wed, May 12, 2010 at 10:42 AM, alex kamil alex.ka...@gmail.com wrote:

http://www.google.com/search?hl=en&source=hp&q=hbase+tutorial

On Wed, May 12, 2010 at 10:25 AM, Renato Marroquín Mogrovejo renatoj.marroq...@gmail.com wrote:

Hi everyone,
I just read about IHbase and it seems like something I could give a try, but I haven't been able to find information (besides descriptions and advantages) regarding how to install it or use it. Thanks in advance.
Renato M.
Re: Using HBase on other file systems
On Thu, May 13, 2010 at 12:26 AM, Jeff Hammerbacher ham...@cloudera.com wrote:

Some projects sacrifice stability and manageability for performance (see, e.g., http://gluster.org/pipermail/gluster-users/2009-October/003193.html).

On Wed, May 12, 2010 at 11:15 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

On Wed, May 12, 2010 at 1:30 PM, Andrew Purtell apurt...@apache.org wrote:

Before recommending Gluster I suggest you set up a test cluster and then randomly kill bricks. Also, as pointed out in another mail, you'll want to colocate TaskTrackers on Gluster bricks to get I/O locality, yet there is no way for Gluster to export stripe locations back to Hadoop. It seems a poor choice.
- Andy

From: Edward Capriolo
Subject: Re: Using HBase on other file systems
To: hbase-user@hadoop.apache.org
Date: Wednesday, May 12, 2010, 6:38 AM

On Tuesday, May 11, 2010, Jeff Hammerbacher ham...@cloudera.com wrote:

Hey Edward,

I do think that if you compare GoogleFS to HDFS, GFS looks more full featured.

What features are you missing? Multi-writer append was explicitly called out by Sean Quinlan as a bad idea, and rolled back. From internal conversations with Google engineers, erasure coding of blocks suffered a similar fate. Native client access would certainly be nice, but FUSE gets you most of the way there. Scalability/availability of the NN, RPC QoS, and alternative block placement strategies are second-order features which didn't exist in GFS until later in its lifecycle of development as well. HDFS is following a similar path and has JIRA tickets with active discussions. I'd love to hear your feature requests, and I'll be sure to translate them into JIRA tickets.

I do believe my logic is reasonable. HBase has a lot of code designed around HDFS. We know these tickets that get cited all the time, for better random reads, or for sync() support. HBase gets the benefits of HDFS and has to deal with its drawbacks.
Other key value stores handle storage directly.

Sync() works and will be in the next release; its absence was simply a result of the youth of the system. Now that that limitation has been removed, please point to another place in the code where using HDFS rather than the local file system is forcing HBase to make compromises. Your initial attempts on this front (caching, HFile, compactions) were, I hope, debunked by my previous email. It's also worth noting that Cassandra does all three, despite managing its own storage.

I'm trying to learn from this exchange and always enjoy understanding new systems. Here's what I have so far from your arguments:

1) HBase inherits both the advantages and disadvantages of HDFS. I clearly agree on the general point; I'm pressing you to name some specific disadvantages, in hopes of helping prioritize our development of HDFS. So far, you've named things which are either a) not actually disadvantages or b) no longer true. If you can come up with the disadvantages, we'll certainly take them into account. I've certainly got a number of them on our roadmap.

2) If you don't want to use HDFS, you won't want to use HBase. Also certainly true, but I'm not sure there's much to learn from this assertion. I'd once again ask: why would you not want to use HDFS, and what is your choice in its stead?

Thanks,
Jeff

Jeff,

Let me first mention that you have described some things as fixed that are only fixed in trunk. I consider trunk futureware and I do not like to have temporal conversations. Even when trunk becomes current, there is no guarantee that the entire problem is solved. After all, appends were fixed in .19, or not, or again?

I rescanned the GFS white paper to support my argument that HDFS is stripped down. Found:
- Writes at offset ARE supported
- Checkpoints
- Application level checkpoints
- Snapshot
- Shadow read-only master

HDFS chose the features it wanted and ignored others; that is why I called it a pure map-reduce implementation.
My main point is that HBase by nature needs high speed random read and random write, and HDFS by nature is bad at these things. If you cannot keep a high cache hit rate via a large RAM-backed block cache, HBase is going to slam HDFS doing large block reads for small parts of files. So you ask me what I would use instead. I do not think there is a viable alternative in the 100 TB and up range, but I do think for people in the 20 TB range something like gluster that is
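Edward's point about cache misses can be put in rough numbers. The sketch below models the read amplification of a cache miss, assuming a 64 KB HFile block and 100-byte cells; both figures are illustrative assumptions on my part, not measurements from this thread.

```python
# Rough model: bytes read from the filesystem per cell read, as a function
# of block-cache hit rate. Assumptions (illustrative): 64 KB HFile blocks,
# 100-byte cells, one whole block fetched per cache miss.

BLOCK_SIZE = 64 * 1024   # assumed HFile block size
CELL_SIZE = 100          # assumed small cell

def fs_bytes_per_cell(hit_rate):
    """Expected bytes fetched from the filesystem per cell read."""
    miss_rate = 1.0 - hit_rate
    return miss_rate * BLOCK_SIZE

for hit_rate in (0.99, 0.90, 0.50):
    amp = fs_bytes_per_cell(hit_rate) / CELL_SIZE
    print(f"hit rate {hit_rate:.0%}: ~{amp:.0f}x read amplification")
```

Under these assumptions, even a 90% hit rate still means fetching roughly 65x more bytes than are returned, which is the "slam HDFS" effect described above.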
Re: GSoC 2010: ZooKeeper Monitoring Recipes and Web-based Administrative Interface
On Thu, May 13, 2010 at 2:30 AM, Andrei Savu savu.and...@gmail.com wrote:

Hi all,
My name is Andrei Savu and I am one of the GSoC 2010 accepted students. My mentor is Patrick Hunt.

Good to meet you Andrei.

Are there any HBase / Hadoop specific ZooKeeper monitoring requirements?

In the hbase shell, you can poke at your zk ensemble currently. Here is what it looks like:

hbase(main):001:0> zk
ZooKeeper -server host:port cmd args
        connect host:port
        get path [watch]
        ls path [watch]
        set path data [version]
        delquota [-n|-b] path
        quit
        printwatches on|off
        create [-s] [-e] path data acl
        stat path [watch]
        close
        ls2 path [watch]
        history
        listquota path
        setAcl path acl
        getAcl path
        sync path
        redo cmdno
        addauth scheme auth
        delete path [version]
        setquota -n|-b val path

That's pretty great. What'd be sweeter would be the addition of a zktop command. I know it's a python script at the moment. Maybe there is a pure java implementation? Also in our UI, you can browse to a page of basic ensemble stats. Would be excellent if instead that were the fancy-pants zktop output. Or, if you are doing a zk UI anyway, just make sure it is packaged in a way that makes it easy for us to launch as part of our UI. I'd imagine that if it is packaged as a WAR file that should be fine, but we'd need some way of passing in where the zk ensemble is, perhaps as arguments on the url?

Thanks for writing the list Andrei,
St.Ack
Re: HBase has entered Debian (unstable)
You are a good man Thomas. Thanks for pushing this through.
St.Ack

On Thu, May 13, 2010 at 1:59 AM, Thomas Koch tho...@koch.ro wrote:
> [...]
Re: HBase client hangs after upgrade to 0.20.4 when used from reducer
Hi Friso,

When did you take the jstack dumps of the region servers? Was it when the reduce tasks were still hanging? Do all of the reduce tasks hang, or is it just one that gets stuck? If, once the reduce tasks are hung, you open the hbase shell and run count 'mytable', 10, does it successfully count the rows? (I'm trying to determine if the client is locked up, or one of the RSes is locked up.)

Enjoy your holiday!
-Todd

On Thu, May 13, 2010 at 12:38 AM, Friso van Vollenhoven fvanvollenho...@xebia.com wrote:
> [...]
Re: Regionservers crash with an OutOfMemoryException after a data-intensive map reduce job..
Hello Vidhyashankar:

How many regionservers? What version of hbase and hadoop? How much RAM on these machines in total? Can you give HBase more RAM? Also check that you don't have an exceptional cell in your input -- one that is very much larger than the 14KB you note below.

12 column families is at the extreme of what we've played with, just FYI. You might try a schema that has less: e.g. one CF for the big cell value and all the others in a second CF.

There may also be corruption in one of the storefiles, given that the OOME below seems to happen when we try to open a region (but the fact of opening may have no relation to why the OOME).

St.Ack

On Thu, May 13, 2010 at 10:35 AM, Vidhyashankar Venkataraman vidhy...@yahoo-inc.com wrote:

This is similar to a mail sent by another user to the group a couple of months back. I am quite new to Hbase and I've been trying to conduct a basic experiment with it. I am trying to load 200 million records, each record around 15 KB: with one column value around 14KB and the rest of the 100 column values 8 bytes each. The 120 columns are grouped as 10 qualifiers x 12 families (hope I got my jargon right). Note that only one value is quite large for each doc (when compared to other values). The data is uncompressed, and each value is uniformly randomly selected. I used a map-reduce job to load a data file on hdfs into the database. Soon after the job finished, the region servers crash with an OOM Exception. Below is part of the trace from the logs in one of the RS's. I have attached the conf along with the email: can you guys point out any anomaly in my settings? I have set a heap size of 3 gigs; anything significantly more, 32-bit java doesn't run.
2010-05-12 19:22:45,068 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=8.43782MB (8847696), Free=1791.2247MB (1878235312), Max=1799.6626MB (1887083008), Counts: Blocks=1, Access=16947, Hit=52, Miss=16895, Evictions=0, Evicted=0, Ratios: Hit Ratio=0.3068389603868127%, Miss Ratio=99.69316124916077%, Evicted/Run=NaN
2010-05-12 19:22:45,069 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col5/7617863559659933969, isReference=false, sequence id=2470632548, length=8456716, majorCompaction=false
2010-05-12 19:22:45,075 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col6/1328113038200437659, isReference=false, sequence id=2960732840, length=19861, majorCompaction=false
2010-05-12 19:22:45,078 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col6/6484804359703635950, isReference=false, sequence id=2470632548, length=8456716, majorCompaction=false
2010-05-12 19:22:45,082 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col7/1673569837212457160, isReference=false, sequence id=2960732840, length=19861, majorCompaction=false
2010-05-12 19:22:45,085 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col7/4737399093829085995, isReference=false, sequence id=2470632548, length=8456716, majorCompaction=false
2010-05-12 19:22:47,238 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col8/8446828932792437464, isReference=false, sequence id=2960732840, length=19861, majorCompaction=false
2010-05-12 19:22:47,241 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col8/974386128174268353, isReference=false, sequence id=2470632548, length=8456716, majorCompaction=false
2010-05-12 19:22:48,804 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col9/2096232603557969237, isReference=false, sequence id=2470632548, length=8456716, majorCompaction=false
2010-05-12 19:22:48,807 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/1651418343/col9/7088206045660348092, isReference=false, sequence id=2960732840, length=19861, majorCompaction=false
2010-05-12 19:22:48,808 INFO org.apache.hadoop.hbase.regionserver.HRegion: region DocData,4824176,1273625075099/1651418343 available; sequence id is 2960732841
2010-05-12 19:22:48,808 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: DocData,40682172,1273607630618
2010-05-12 19:22:48,809 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region DocData,40682172,1273607630618, encoded=271889952
2010-05-12 19:22:50,924 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/271889952/CONTENT/4859380626868896307, isReference=false, sequence id=2959849236, length=337563, majorCompaction=false
2010-05-12 19:22:53,037 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded /hbase/DocData/271889952/CONTENT/952776139755887312, isReference=false, sequence id=2082553088, length=110460013, majorCompaction=false
2010-05-12 19:22:57,404 DEBUG
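The LruBlockCache line in the log above can be sanity-checked from its own counters; the hit ratio is just Hit/Access:

```python
# Recompute the cache ratios reported in the RS log line:
# Access=16947, Hit=52, Miss=16895
access, hit, miss = 16947, 52, 16895
assert hit + miss == access

hit_ratio = 100.0 * hit / access
miss_ratio = 100.0 * miss / access
print(f"Hit Ratio={hit_ratio:.4f}%  Miss Ratio={miss_ratio:.4f}%")
# Matches the logged 0.3068...% / 99.6931...% (to float rounding):
# almost every read misses the cache, consistent with the uniformly
# random access pattern described in this thread.
```

A near-zero hit ratio like this means almost every Get goes to disk, which matters when sizing heap for the block cache.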
Re: Regionservers crash with an OutOfMemoryException after a data-intensive map reduce job..
Thanks for the prompt response. Oops, forgot the specifics: I ran the whole thing on five region servers that also run hadoop's data node and task trackers. Each machine has 6 TB disk space (5 TB available for the data node and 1 TB for MR and hbase temps), 24 gigs RAM, 3 gigs HBase heap size. How do I give HBase more RAM (are you talking about a config variable)? 3-4 gigs heap size is the max that 32-bit Java can take (or am I wrong?). AFAIK, I had synthetically generated the workload and I am pretty sure the column sizes are what I had mentioned.

> 12 column families is at the extreme of what we've played with, just FYI.

Ah, ok. Will alter the schema then.

> There may also be corruption in one of the storefiles, given that the OOME below seems to happen when we try to open a region (but the fact of opening may have no relation to why the OOME).

True, but then all the region servers crashed at roughly the same time and for the exact same reason (OOME when a region was opened). Was there a spike in update traffic after the MR job finished? Or was there a compaction happening by any chance? (Although I don't see an explicit debug message here; not sure if I had the correct debug log level.)

Vidhya

On 5/13/10 11:05 AM, Stack st...@duboce.net wrote:
> [...]
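For scale, the load Vidhya describes can be sized with back-of-the-envelope math. The 3x replication factor and the absence of compression are my assumptions (the thread says the data is uncompressed but never states the replication factor):

```python
# Back-of-the-envelope sizing for the described load:
# 200 million records at ~15 KB each, spread over 5 datanodes.
records = 200_000_000
record_size = 15 * 1024        # ~15 KB per record
replication = 3                # assumed HDFS replication factor
nodes = 5

raw_tib = records * record_size / 1024**4   # logical data, TiB
on_disk_tib = raw_tib * replication         # after replication
per_node_tib = on_disk_tib / nodes

print(f"logical: ~{raw_tib:.1f} TiB, replicated: ~{on_disk_tib:.1f} TiB, "
      f"per node: ~{per_node_tib:.1f} TiB")
```

Under these assumptions each node holds well under its 5 TB of datanode space, which suggests disk isn't the tight resource here; the 3 GB region server heap is.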
Re: Regionservers crash with an OutOfMemoryException after a data-intensive map reduce job..
Looks like my conf file wasn't attached, so here goes some of (what I thought were relevant) config values. Obviously, I am not asking anyone to go through every one of them, but can someone cursorily eyeball them to see if something seems off? As of now, it looks like I had too many column families in each region.. Thanks, Vidhya

<property>
  <name>hbase.regionserver.handler.count</name>
  <value>100</value>
  <description>Count of RPC Server instances spun up on RegionServers. The same property is used by the HMaster for count of master handlers. Default is 25.</description>
</property>
<property>
  <name>hbase.regionserver.flushlogentries</name>
  <value>100</value>
  <description>Sync the HLog to the HDFS when it has accumulated this many entries. Default 100. Value is checked on every HLog.sync.</description>
</property>
<property>
  <name>hbase.regionserver.global.memstore.upperLimit</name>
  <value>0.4</value>
  <description>Maximum size of all memstores in a region server before new updates are blocked and flushes are forced. Defaults to 40% of heap.</description>
</property>
<property>
  <name>hbase.regionserver.global.memstore.lowerLimit</name>
  <value>0.35</value>
  <description>When memstores are being forced to flush to make room in memory, keep flushing until we hit this mark. Defaults to 30% of heap. Setting this equal to hbase.regionserver.global.memstore.upperLimit causes the minimum possible flushing to occur when updates are blocked due to memstore limiting.</description>
</property>
<property>
  <name>hbase.regionserver.optionallogflushinterval</name>
  <value>1</value>
  <description>Sync the HLog to the HDFS after this interval if it has not accumulated enough entries to trigger a sync. Default 10 seconds. Units: milliseconds.</description>
</property>
<property>
  <name>hbase.regionserver.logroll.period</name>
  <value>360</value>
  <description>Period at which we will roll the commit log.</description>
</property>
<property>
  <name>hbase.regionserver.thread.splitcompactcheckfrequency</name>
  <value>2</value>
  <description>How often a region server runs the split/compaction check.</description>
</property>
<property>
  <name>hbase.regionserver.nbreservationblocks</name>
  <value>4</value>
  <description>The number of reservation blocks which are used to prevent unstable region servers caused by an OOME.</description>
</property>
<property>
  <name>hbase.regions.percheckin</name>
  <value>10</value>
  <description>Maximum number of regions that can be assigned in a single go to a region server.</description>
</property>
<property>
  <name>hbase.server.thread.wakefrequency</name>
  <value>1</value>
  <description>Time to sleep in between searches for work (in milliseconds). Used as sleep interval by service threads such as META scanner and log roller.</description>
</property>
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>67108864</value>
  <description>Memstore will be flushed to disk if size of the memstore exceeds this number of bytes. Value is checked by a thread that runs every hbase.server.thread.wakefrequency.</description>
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>4</value>
  <description>MODIFIED. Block updates if memstore has hbase.hregion.block.memstore times hbase.hregion.flush.size bytes. Useful for preventing runaway memstore during spikes in update traffic. Without an upper bound, memstore fills such that when it flushes, the resultant flush files take a long time to compact or split, or worse, we OOME.</description>
</property>
<property>
  <name>hbase.regionserver.maxlogs</name>
  <value>128</value>
  <description>Max hlogs you can accumulate before they start rolling (default was 32). Hidden parameter!</description>
</property>
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>268435456</value>
  <description>Maximum HStoreFile size. If any one of a column family's HStoreFiles has grown to exceed this value, the hosting HRegion is split in two. Default: 256M.</description>
</property>
<property>
  <name>hbase.hstore.compactionThreshold</name>
  <value>3</value>
  <description>If more than this number of HStoreFiles exist in any one HStore (one HStoreFile is written per flush of memstore), then a compaction is run to rewrite all HStoreFiles as one. Larger numbers put off compaction, but when it runs, it takes longer to complete. During a compaction, updates cannot be flushed to disk. Long compactions require memory sufficient to carry the logging of all updates across the duration of the compaction. If too large, clients time out during compaction.</description>
</property>
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>16</value>
  <description>MODIFIED FROM 4. If more than this number of StoreFiles in any one</description>
</property>
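The interaction between the memstore settings in that config can be sketched numerically. A rough back-of-the-envelope check using the pasted values; the 8 GB region server heap is an assumed figure for illustration, not something stated in the thread:

```python
# Rough sketch of how the memstore settings above interact.
# flush size, multiplier, and global limits come from the pasted config;
# the 8 GB region server heap is an assumption.

flush_size = 67108864        # hbase.hregion.memstore.flush.size (64 MB)
block_multiplier = 4         # hbase.hregion.memstore.block.multiplier
upper_limit = 0.4            # hbase.regionserver.global.memstore.upperLimit
lower_limit = 0.35           # hbase.regionserver.global.memstore.lowerLimit
heap_bytes = 8 * 1024**3     # assumed 8 GB region server heap

# Per-region blocking point: updates to a single region are blocked once
# its memstore reaches block_multiplier * flush_size bytes.
per_region_block = block_multiplier * flush_size

# Global limits: all memstores together may occupy at most upper_limit of
# the heap before updates block; forced flushes then run until usage
# drops back under lower_limit.
global_block = upper_limit * heap_bytes
flush_target = lower_limit * heap_bytes

print(per_region_block // 2**20, "MB per-region block point")   # 256 MB
print(int(global_block) // 2**20, "MB global block point")
print(int(flush_target) // 2**20, "MB flush-until target")
```

With a 64 MB flush size and a multiplier of 4, each region blocks updates at 256 MB, while globally memstores block at 40% of heap and forced flushes drain usage back under 35%.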
Re: Using HBase on other file systems
Hey, I think one of the key features of HDFS is its ability to run on standard hardware and integrate nicely into a standardized datacenter environment. I never would have got my project off the ground if I had to convince my company to invest in infiniband switches. So in the situation you described, you are getting only 50TB of storage on 20 nodes, and the parts list would be something like: - 20 storage bricks w/infiniband and gigE ports - infiniband switch, min 20 ports - probably better to get more - 20 more HBase nodes; I'd like to have machines with 16+ GB RAM, ideally 24GB and above. At this point we could compare to my cluster setup, which has 67TB of raw space reported by HDFS: - 20 HBase+HDFS nodes, 4TB/node, 16 cores w/24GB RAM. In my case I am paying about $3-4k/node (depending on when you bought them and from whom), and I can leverage the gigE switching fabric (lower cost per port). So Gluster sounds interesting, but it sounds like at least 2x as expensive for less space. Presumably the performance benefits would make up for it, but if the clients aren't connected by infiniband, would you really see them? At at least $1000/port, I'm not sure it's really worth it. On Thu, May 13, 2010 at 8:09 AM, Edward Capriolo edlinuxg...@gmail.com wrote: On Thu, May 13, 2010 at 12:26 AM, Jeff Hammerbacher ham...@cloudera.com wrote: Some projects sacrifice stability and manageability for performance (see, e.g., http://gluster.org/pipermail/gluster-users/2009-October/003193.html ). On Wed, May 12, 2010 at 11:15 AM, Edward Capriolo edlinuxg...@gmail.com wrote: On Wed, May 12, 2010 at 1:30 PM, Andrew Purtell apurt...@apache.org wrote: Before recommending Gluster I suggest you set up a test cluster and then randomly kill bricks. Also, as pointed out in another mail, you'll want to colocate TaskTrackers on Gluster bricks to get I/O locality, yet there is no way for Gluster to export stripe locations back to Hadoop. It seems a poor choice. 
- Andy From: Edward Capriolo Subject: Re: Using HBase on other file systems To: hbase-user@hadoop.apache.org hbase-user@hadoop.apache.org Date: Wednesday, May 12, 2010, 6:38 AM On Tuesday, May 11, 2010, Jeff Hammerbacher ham...@cloudera.com wrote: Hey Edward, I do think that if you compare GoogleFS to HDFS, GFS looks more full featured. What features are you missing? Multi-writer append was explicitly called out by Sean Quinlan as a bad idea, and rolled back. From internal conversations with Google engineers, erasure coding of blocks suffered a similar fate. Native client access would certainly be nice, but FUSE gets you most of the way there. Scalability/availability of the NN, RPC QoS, alternative block placement strategies are second-order features which didn't exist in GFS until later in its lifecycle of development as well. HDFS is following a similar path and has JIRA tickets with active discussions. I'd love to hear your feature requests, and I'll be sure to translate them into JIRA tickets. I do believe my logic is reasonable. HBase has a lot of code designed around HDFS. We know these tickets that get cited all the time, for better random reads, or for sync() support. HBase gets the benefits of HDFS and has to deal with its drawbacks. Other key value stores handle storage directly. Sync() works and will be in the next release, and its absence was simply a result of the youth of the system. Now that that limitation has been removed, please point to another place in the code where using HDFS rather than the local file system is forcing HBase to make compromises. Your initial attempts on this front (caching, HFile, compactions) were, I hope, debunked by my previous email. It's also worth noting that Cassandra does all three, despite managing its own storage. I'm trying to learn from this exchange and always enjoy understanding new systems. Here's what I have so far from your arguments: 1) HBase inherits both the advantages and disadvantages of HDFS. 
I clearly agree on the general point; I'm pressing you to name some specific disadvantages, in hopes of helping prioritize our development of HDFS. So far, you've named things which are either a) not actually disadvantages or b) no longer true. If you can come up with the disadvantages, we'll certainly take them into account; I've certainly got a number of them on our roadmap. 2) If you don't want to use HDFS, you won't want to use HBase. Also certainly true, but I'm not sure there's much to learn from this assertion. I'd once again ask: why would you not want to use HDFS, and what is your choice in its stead?
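Ryan's cost comparison earlier in the thread can be made concrete with some back-of-the-envelope arithmetic. A sketch using the figures quoted in his parts list; the $3.5k node price is the midpoint of his "$3-4k/node", and pricing the Gluster bricks the same as the HDFS nodes is my assumption:

```python
# Back-of-the-envelope cost-per-TB comparison using figures from the
# thread. $3500/node is the midpoint of the quoted "$3-4k/node";
# assuming Gluster bricks cost the same per node is an assumption.

# HDFS setup: 20 combined HBase+HDFS nodes, 67 TB raw reported by HDFS.
hdfs_nodes, hdfs_tb = 20, 67
cost_per_node = 3500
hdfs_total = hdfs_nodes * cost_per_node

# Gluster setup: 20 storage bricks plus 20 separate HBase nodes, plus a
# 20-port infiniband switch at roughly $1000/port, for 50 TB of storage.
gluster_nodes, gluster_tb = 40, 50
switch_ports, cost_per_port = 20, 1000
gluster_total = gluster_nodes * cost_per_node + switch_ports * cost_per_port

print(round(hdfs_total / hdfs_tb))        # ~1045 dollars per TB
print(round(gluster_total / gluster_tb))  # 3200 dollars per TB
```

On these numbers the Gluster layout comes out at roughly three times the cost per terabyte, consistent with the "at least 2x as expensive for less space" estimate in the message.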
RE: Using HBase on other file systems
Yo, I feel the need to speak up. GlusterFS is pretty configurable. It doesn't rely on HBAs, but it does support them. Gig or 10G Ethernet are also supported options. I would love to see HBase become GlusterFS-aware, because the architecture is, frankly, more flexible than HDFS with fewer SPoF concerns. GlusterFS is node-aware with the Disco MapReduce framework - why not HBase? NB. I checked out running HBase over Walrus (an AWS S3 clone): bork - you want me to file a Jira on that? -Original Message- From: Ryan Rawson [mailto:ryano...@gmail.com] Sent: Thu 5/13/2010 9:46 PM To: hbase-user@hadoop.apache.org Subject: Re: Using HBase on other file systems [...]
RE: Using HBase on other file systems
You really want to run HBase backed by Eucalyptus' Walrus? What do you have behind that? From: Gibbon, Robert, VF-Group Subject: RE: Using HBase on other file systems [...] NB. I checked out running HBase over Walrus (an AWS S3 clone): bork - you want me to file a Jira on that?
tables disappearing after upgrading 0.20.3 => 0.20.4
Hi, after upgrading from 0.20.3 to 0.20.4, the list of tables almost immediately becomes inconsistent - master.jsp shows no tables even after creating a test table in the hbase shell, tables which were available before start randomly appearing and disappearing, etc. The upgrade was done by stopping, upgrading the code, and then starting (no dump/restore was done). I didn't investigate yet, just checking if somebody had the same problem or if I did the upgrade right (I had exactly the same issue in the past when trying to apply HBASE-2174 manually). Environment: Small tables, 100k rows. Amazon EC2, c1.xlarge instance type with Ubuntu 9.10 and EBS root, HBase installed manually. 1 master (namenode + jobtracker + master), 3 slaves (tasktracker + datanode + regionserver + zookeeper). Hadoop 0.20.1+169.68~1.karmic-cdh2 from Cloudera distribution. Flaky DNS issue present, happens about once per day even with dnsmasq installed (heartbeat every 1s, dnsmasq forwards requests once per minute), DDNS set for internal hostnames. This is a testing cluster, nothing important on it. Cheers, -- Viktors
Re: tables disappearing after upgrading 0.20.3 => 0.20.4
What's the shell say? Does it see the tables consistently? Can you count your content consistently? St.Ack On Thu, May 13, 2010 at 4:53 PM, Viktors Rotanovs viktors.rotan...@gmail.com wrote: [...]