Re: data loss with hbase 0.19.3
HDFS doesn't allow you to read partially written files; it reports the size as 0 until the file is properly closed, so under a crash scenario you are in trouble. The best options right now are to:
- don't let hbase crash (not as crazy as this sounds)
- consider experimenting with some newer hdfs stuff
- wait for hadoop 0.21
In the meantime, you will suffer loss if hbase regionservers crash. That is a crash as in hard crash; controlled shutdowns flush, and you don't lose data then. Sorry for the confusion!
-ryan

On Thu, Aug 13, 2009 at 10:56 PM, Chen Xinli chen.d...@gmail.com wrote:

For the HLog, I find an interesting problem. I set the optionallogflushinterval to 1, that's 10 seconds, but it flushes at an interval of 1 hour. After the hlog file is generated, I stop hdfs and then kill the hmaster and regionservers; when I start everything again, the hmaster doesn't restore records from the hlog, so the records are lost again. Is there something wrong?

2009/8/14 Chen Xinli chen.d...@gmail.com

Thanks Daniel. As you said the latest version has done much to avoid data loss; would you please give some examples? I read the conf file and API, and found some related functions:
1. In hbase-default.xml, hbase.regionserver.optionallogflushinterval is described as "Sync the HLog to the HDFS after this interval if it has not accumulated enough entries to trigger a sync." I issued one update to my table, but there were no hlog files after the specified interval. Does this setting not work, or am I misunderstanding it?
2. HBaseAdmin.flush(tableOrRegionName). It seems that this function flushes the memcache to an HStoreFile. Should I call this function to avoid data loss after several thousand updates?
3. In HTable, there is also a function flushCommits. Where does it flush to? Memcache or hdfs?
Actually we have a crawler and want to store webpages (about 1 billion) in hbase. What shall we do to avoid data loss? Any suggestion is appreciated.
By the way, we use hadoop 0.19.1 + hbase 0.19.3. Thanks.

2009/8/6 Jean-Daniel Cryans jdcry...@apache.org

Chen, the main problem is that appends are not supported in HDFS, so HBase simply cannot sync its logs to it. But we did some work to make that story better. The latest revision in the 0.19 branch and 0.20 RC1 both solve much of the data loss problem, but it won't be near perfect until we have appends (supposed to be available in 0.21).
J-D

On Thu, Aug 6, 2009 at 12:45 AM, Chen Xinli chen.d...@gmail.com wrote:

Hi, I'm using hbase 0.19.3 on a cluster with 30 machines to store web data. We had a power outage a few days ago, and I found much web data lost. I have searched Google and found it is a meta flush problem. I know there is much performance improvement in 0.20.0; is the data loss problem handled in the new version?
--
Best Regards,
Chen Xinli
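For reference, the setting discussed in this thread lives in hbase-site.xml. A minimal fragment is sketched below; the property name is from the thread, but the assumption that the value is specified in milliseconds (so 10000 rather than 1 would give the intended 10 seconds) should be checked against the hbase-default.xml shipped with your release:

```xml
<!-- Sketch of an hbase-site.xml override; value assumed to be in
     milliseconds (10000 = 10 seconds). -->
<property>
  <name>hbase.regionserver.optionallogflushinterval</name>
  <value>10000</value>
</property>
```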
Re: HBase commit autoflush
oh! yes, it is sorted. Thank you very much.

On Thu, Aug 13, 2009 at 12:47 PM, Erik Holstad erikhols...@gmail.com wrote:

Hey Schubert! The writeBuffer is sorted in processBatchOfRows, just like you suggested.
Regards Erik
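For illustration, the effect being discussed can be sketched without the HBase classes. The unsigned lexicographic comparison below mirrors HBase's row ordering, so sorting the client-side write buffer this way puts rows for the same region next to each other; the class and method names here are made up for this example, not the actual HBase internals:

```java
import java.util.ArrayList;
import java.util.List;

public class WriteBufferSort {
    // Compare two row keys byte-by-byte, treating bytes as unsigned,
    // which matches the ordering HBase uses for rows.
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int ai = a[i] & 0xff, bi = b[i] & 0xff;
            if (ai != bi) return ai - bi;
        }
        return a.length - b.length;
    }

    // Sort pending puts so rows belonging to the same region are
    // contiguous before being grouped into per-server batches.
    static List<byte[]> sortBuffer(List<byte[]> rows) {
        List<byte[]> sorted = new ArrayList<>(rows);
        sorted.sort(WriteBufferSort::compareRows);
        return sorted;
    }
}
```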
Re: data loss with hbase 0.19.3
Thanks for your suggestion. As our insertion is daily, that is, we insert lots of records at a fixed time, can we just call HBaseAdmin.flush to avoid loss? I have done some experiments and found it works. I wonder if it will cause some other problem?

2009/8/14 Ryan Rawson ryano...@gmail.com: HDFS doesn't allow you to read partially written files ...
--
Best Regards,
Chen Xinli
Re: HBase in a real world application
Hey Ryan, do you mean you run multiple VMs on the 1950s using Xen or something? Isn't 16gb a lot for a single box?

Ryan Rawson wrote:

We are using Dell 1950s: 8 cpu, 16gb ram, dual 1tb disk. You can get machines in this range for around $2k. I run hbase on 1tb of data on 20 of these. You can probably look at doing 15+ machines.

The master machine doesn't do much work, but it has to be reliable: RAID, dual power supply, etc. If it goes down, the namenode takes your entire system down. I run them on a standard node, but with some of the dual power features enabled. The regionservers do way more, so in theory you could have a smaller master, but not too small. Probably best to stick to one node type, keep it cheap.

You can run ZK on those nodes, but if you run into IO wait issues, you might see stalls that could hurt bad. I'd avoid doing massive map-reduces with a large intermediate output on these machines.
-ryan

On Tue, Aug 11, 2009 at 4:14 PM, llpind sonny_h...@hotmail.com wrote:

Thanks for the link. I will keep that in mind. Yeah, 256MB isn't much. Moving up to 3-4G for 10-15 boxes gets expensive.

Alejandro Pérez-Linaza wrote:

You might want to check out www.rackspacecloud.com where you can get boxes and pay by the hour (as cheap as $0.015/hour for a 256Mb box). We used it a couple of weeks ago to set up a MySQL Cluster test and ended up having around 18 boxes. Memory can be changed from 256Mb to 16Gb in a couple of minutes. They also have various flavors to choose from. The bottom line is that we love it, and it solves the problem of the test boxes that you would need right away. Have fun,
Alex

Alejandro Pérez-Linaza
CEO, Vertical Technologies, LLC
ape...@vertical-tech.com
www.vertical-tech.com
9600 NW 25th Street, Suite 4A, Miami, FL 33172
Office: (786) 206-0554 x 108, Toll Free: (866) 382-8918, Fax: (305) 328-5063

-----Original Message-----
From: llpind [mailto:sonny_h...@hotmail.com]
Sent: Tuesday, August 11, 2009 12:21 PM
To: hbase-user@hadoop.apache.org
Subject: HBase in a real world application

As some of you know, I've been playing with HBase on/off for the past few months. I'd like your take on some cluster setup/configuration settings that you've found successful. Also, any other thoughts on how I can persuade usage of HBase.

Assume: working with ~2 TB of data, a few very tall tables, Hadoop/HBase 0.20.0.
1. What specs should a master box have (speed, HD, RAM)? Should slave boxes be different?
2. Recommended size of cluster? I realize this depends on what load/performance requirements we have, but I'd like to know your thoughts based on the #1 specs.
3. Should zookeeper quorums run on different boxes than regionservers?

Basically, if you could give some example cluster configurations with the amount of data you're working with, that would be a lot of help (or point me to a place where this has been discussed for .20). Currently I don't have the funds to play around with a lot of boxes, but I hope to soon. :) Thanks.
--
View this message in context: http://www.nabble.com/HBase-in-a-real-world-application-tp24920888p24920888.html
Sent from the HBase User mailing list archive at Nabble.com.
Feed Aggregator Schema
Hello, I am working on a project involving monitoring a large number of rss/atom feeds. I want to use hbase for data storage, and I have some problems designing the schema.

For the first iteration I want to be able to generate an aggregated feed (the last 100 posts from all feeds, in reverse chronological order). Currently I am using two tables:
Feeds: column families Content and Meta; the raw feed is stored in Content:raw.
Urls: column families Content and Meta; the raw post version is stored in Content:raw, and the rest of the data found in the RSS is stored in Meta.

I need some sort of index table for the aggregated feed. How should I build that? Is hbase a good choice for this kind of application? In other words: is it possible (in hbase) to design a schema that could efficiently answer queries like the one listed below?

SELECT data FROM Urls ORDER BY date DESC LIMIT 100

Thanks.
--
Savu Andrei
Website: http://www.andreisavu.ro/
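One common way to make a "newest first" scan cheap in a store that orders rows lexicographically is to build the index-table row key from a reversed timestamp, so a plain scan of the first 100 rows answers the query above. This is offered as a sketch, not from the thread itself; the table concept and names below (indexKey, postId) are made up for illustration:

```java
public class IndexKeys {
    // Build a row key for a hypothetical "posts by date" index table.
    // Subtracting the post timestamp from Long.MAX_VALUE makes the
    // lexicographic row order equal reverse chronological order, so
    // scanning the first 100 rows yields the 100 newest posts.
    static String indexKey(long postTimestampMillis, String postId) {
        long reversed = Long.MAX_VALUE - postTimestampMillis;
        // Zero-pad to 19 digits (the width of Long.MAX_VALUE) so that
        // string comparison matches numeric comparison; append the post
        // id to keep keys unique when timestamps collide.
        return String.format("%019d-%s", reversed, postId);
    }
}
```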
Informal Meetup - San Fran area 6,7,8 September?
Hi All, I'm going to be in the San Fran area the 6, 7, 8 of September and would love the chance to meet with some of the HBase users and developers, if anyone is interested?

I work with a Global Biodiversity Information network (GBIF) that has several thousand databases publishing data using well-defined XML standards. We crawl and build an index of this information that currently resides in MySQL and has 180 million records in each of the 2 largest tables; we are outgrowing MySQL. We already use Hadoop for various processes, but are about to try HBase as the backend store after doing various tests recently.

We are not a huge cluster (16 nodes), but I think we are a nice case study, and because we are able to document openly and freely, it could be something to reference from the HBase wikis. We are building search indexes, running statistical reports, annotating records (geocoding, quality control), creating maps (e.g. tile layers) etc., so the output is technically quite interesting and the data is used for all kinds of scientific analysis. All our code is open source.

We are a small team (3-4 developers), so we would very much like the opportunity to pick the brains of others. We are keen to help improve HBase as well; probably more in a testing capacity than committing, due to our workloads, but we will do whatever we can. Cheers, Tim
Re: HBase in a real world application
Is that region offline? Do a:

get '.META.', 'TestTable,0001749889,1250092414985', {COLUMNS => 'info'}

If so, can you get its history so we can figure out how it went offline? (See region history in the UI, or grep for it in the master logs.)
St.Ack

On Fri, Aug 14, 2009 at 9:55 AM, llpind sonny_h...@hotmail.com wrote:

Hey Stack, I tried the following command:

hadoop-0.20.0/bin/hadoop jar hbase-0.20.0/hbase-0.20.0-test.jar randomWrite 10

Running a map/reduce job, it failed with the following exceptions on each node:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server for region , row '0001753186', but failed after 11 attempts.
Exceptions:
org.apache.hadoop.hbase.client.RegionOfflineException: region offline: TestTable,0001749889,1250092414985
org.apache.hadoop.hbase.client.RegionOfflineException: region offline: TestTable,0001749889,1250092414985
org.apache.hadoop.hbase.client.RegionOfflineException: region offline: TestTable,0001749889,1250092414985
org.apache.hadoop.hbase.client.RegionOfflineException: region offline: TestTable,0001749889,1250092414985
org.apache.hadoop.hbase.client.RegionOfflineException: region offline: TestTable,0001749889,1250092414985
org.apache.hadoop.hbase.client.RegionOfflineException: region offline: TestTable,0001749889,1250092414985
org.apache.hadoop.hbase.client.RegionOfflineException: region offline: TestTable,0001749889,1250092414985
org.apache.hadoop.hbase.client.RegionOfflineException: region offline: TestTable,0001749889,1250092414985
org.apache.hadoop.hbase.client.RegionOfflineException: region offline: TestTable,0001749889,1250092414985
org.apache.hadoop.hbase.client.RegionOfflineException: region offline: TestTable,0001749889,1250092414985
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocationForRowWithRetries(HConnectionManager.java:995)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1025)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:584)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:450)
    at org.apache.hadoop.hbase.PerformanceEvaluation$RandomWriteTest.testRow(PerformanceEvaluation.java:497)
    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:406)
    at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:627)
    at org.apache.hadoop.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:194)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

This appears to be this issue: http://issues.apache.org/jira/browse/HBASE-1603. Has this been fixed in .20? Thanks.

stack-3 wrote:

On Wed, Aug 12, 2009 at 8:58 AM, llpind sonny_h...@hotmail.com wrote:

Playing with the HBase PerformanceEvaluation class, but it seems to take a long time to run "sequentialWrite 2" (~20 minutes). If I simply emulate 1 client in a simple program, I can do 1 million Puts in about 3 minutes (non mapred). The sequential write is writing 2 million with 2 clients. Please help in understanding how to use the PerformanceEvaluation class.

If the number of clients is > 1, unless you add the '--nomapred' (sp?) argument, PE launches a mapreduce program of N tasks. Each task puts up a client writing 1M rows (IIRC). Try N where N == number_of_map_slots and see what that does? N == 2 probably won't tell you much. You could also set N > 1 and use '--nomapred'; this will run each PE client in a distinct thread. For small numbers of N, this can put up heavier loading than MR with its setup and teardown cost.
St.Ack
--
View this message in context: http://www.nabble.com/HBase-in-a-real-world-application-tp24920888p24975031.html
Sent from the HBase User mailing list archive at Nabble.com.
Re: data loss with hbase 0.19.3
Yes, you can definitely do that. We have tables that we put constraints on in that way. Flushing the table ensures all data is written to HDFS, and then you will not have any data loss under HBase fault scenarios.

Chen Xinli wrote:
Thanks for your suggestion. As our insertion is daily, that is, we insert lots of records at a fixed time, can we just call HBaseAdmin.flush to avoid loss? I have done some experiments and found it works. I wonder if it will cause some other problem? ...
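A sketch of the pattern the thread converges on: commit in batches during the daily load, then flush the table afterwards. The actual HBase calls (HTable.flushCommits, HBaseAdmin.flush) need a running cluster, so they appear here only in comments; the batch-boundary bookkeeping below is self-contained, and the class and method names are made up for this example:

```java
public class DailyLoadFlusher {
    private final int batchSize;
    private long pending = 0;
    private long flushes = 0;

    DailyLoadFlusher(int batchSize) {
        this.batchSize = batchSize;
    }

    // Record one insertion. Returns true when the caller should push the
    // batch out, i.e. call HTable.flushCommits() to send the buffered
    // puts to the regionservers.
    boolean put() {
        pending++;
        if (pending >= batchSize) {
            pending = 0;
            flushes++;
            return true;
        }
        return false;
    }

    // After the whole daily load, the thread suggests also calling
    // HBaseAdmin.flush(tableName) so memcache contents are written to
    // HDFS as HStoreFiles and survive a regionserver crash.
    long flushCount() {
        return flushes;
    }
}
```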
Re: HBase in a real world application
hbase(main):003:0> get '.META.', 'TestTable,0001749889,1250092414985', {COLUMNS => 'info'}
09/08/14 12:28:10 DEBUG client.HConnectionManager$TableServers: Cache hit for row in tableName .META.: location server 192.168.0.196:60020, location region name .META.,,1
NativeException: java.lang.NullPointerException: null
    from org/apache/hadoop/hbase/client/HTable.java:789:in `get'
    from org/apache/hadoop/hbase/client/HTable.java:769:in `get'
    from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
    from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
    from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
    from java/lang/reflect/Method.java:597:in `invoke'
    from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
    from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
    from org/jruby/java/invokers/InstanceMethodInvoker.java:30:in `call'
    from org/jruby/runtime/callsite/CachingCallSite.java:30:in `call'
    from org/jruby/ast/CallManyArgsNode.java:59:in `interpret'
    from org/jruby/ast/LocalAsgnNode.java:123:in `interpret'
    from org/jruby/ast/NewlineNode.java:104:in `interpret'
    from org/jruby/ast/IfNode.java:112:in `interpret'
    from org/jruby/ast/NewlineNode.java:104:in `interpret'
    from org/jruby/ast/IfNode.java:114:in `interpret'
    ... 115 levels...
    from home/hadoop/hbase_minus_0_dot_20_dot_0/bin/$_dot_dot_/bin/hirb#start:-1:in `call'
    from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
    from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
    from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
    from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
    from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
    from home/hadoop/hbase_minus_0_dot_20_dot_0/bin/$_dot_dot_/bin/hirb.rb:487:in `__file__'
    from home/hadoop/hbase_minus_0_dot_20_dot_0/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
    from org/jruby/Ruby.java:577:in `runScript'
    from org/jruby/Ruby.java:480:in `runNormally'
    from org/jruby/Ruby.java:354:in `runFromMain'
    from org/jruby/Main.java:229:in `run'
    from org/jruby/Main.java:110:in `run'
    from org/jruby/Main.java:94:in `main'
    from /home/hadoop/hbase-0.20.0/bin/../bin/hirb.rb:384:in `get'
    from (hbase):4

hbase(main):004:0> get '.META.', 'TestTable,0001749889,1250092414985'
09/08/14 12:28:13 DEBUG client.HConnectionManager$TableServers: Cache hit for row in tableName .META.: location server 192.168.0.196:60020, location region name .META.,,1
COLUMN                CELL
historian:assignment  timestamp=1250108456441, value=Region assigned to server server195,60020,1250108376779
historian:compaction  timestamp=1250109313965, value=Region compaction completed in 35sec
historian:open        timestamp=1250108459484, value=Region opened on server : server195
historian:split       timestamp=1250092447915, value=Region split from: TestTable,0001634945,1250035163027
info:regioninfo       timestamp=1250109315260, value=REGION => {NAME => 'TestTable,0001749889,1250092414985', STARTKEY => '0001749889', ENDKEY => '0001866010', ENCODED => 1707908074, OFFLINE => true, TABLE => {{NAME => 'TestTable', FAMILIES => [{NAME => 'info', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}

stack-3 wrote:
Is that region offline? Do a: get '.META.', 'TestTable,0001749889,1250092414985', {COLUMNS => 'info'}. If so, can you get its history so we can figure how it went offline? ...
Re: HBase in a real world application
Absolutely not. VM = low performance, no good.

While it seems that 16 GB ram is a lot, it really isn't. I'd rather have twice that, since Java sucks up ram like there's no tomorrow, and we also want a really, really effective OS buffer cache; this improves random reads quite a bit. In fact, my newer machines are Intel i7s: 8 core, HTT (16 cpus in /proc/cpuinfo) and 24 gb ram. I'd prefer to keep 2gb per core, but due to the architecture needs, we'd have to go with slower ram. It's all about the RAM.

On Fri, Aug 14, 2009 at 8:23 AM, llpind sonny_h...@hotmail.com wrote:
Hey Ryan, do you mean you run multiple VMs on the 1950s using Xen or something? Isn't 16gb a lot for a single box? ...
Re: HBase in a real world application
Thanks for trying. Looks like that region is now gone (split is my guess). Check the master log for mentions of this region to see its history. Can you correlate the client failure with an event on this region in the master log? It looks like the client was pig-headedly fixated on the parent of a split. Could you check that your table is healthy? Run the rowcounter program to make sure there are no holes in the table.

St.Ack

On Fri, Aug 14, 2009 at 12:41 PM, llpind sonny_h...@hotmail.com wrote:

hbase(main):003:0> get '.META.', 'TestTable,0001749889,1250092414985', {COLUMNS => 'info'}
09/08/14 12:28:10 DEBUG client.HConnectionManager$TableServers: Cache hit for row in tableName .META.: location server 192.168.0.196:60020, location region name .META.,,1
NativeException: java.lang.NullPointerException: null
    from org/apache/hadoop/hbase/client/HTable.java:789:in `get'
    from org/apache/hadoop/hbase/client/HTable.java:769:in `get'
    from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
    from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
    from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
    from java/lang/reflect/Method.java:597:in `invoke'
    from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
    from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
    from org/jruby/java/invokers/InstanceMethodInvoker.java:30:in `call'
    from org/jruby/runtime/callsite/CachingCallSite.java:30:in `call'
    from org/jruby/ast/CallManyArgsNode.java:59:in `interpret'
    from org/jruby/ast/LocalAsgnNode.java:123:in `interpret'
    from org/jruby/ast/NewlineNode.java:104:in `interpret'
    from org/jruby/ast/IfNode.java:112:in `interpret'
    from org/jruby/ast/NewlineNode.java:104:in `interpret'
    from org/jruby/ast/IfNode.java:114:in `interpret'
    ... 115 levels...
    from home/hadoop/hbase_minus_0_dot_20_dot_0/bin/$_dot_dot_/bin/hirb#start:-1:in `call'
    from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
    from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
    from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
    from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
    from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
    from home/hadoop/hbase_minus_0_dot_20_dot_0/bin/$_dot_dot_/bin/hirb.rb:487:in `__file__'
    from home/hadoop/hbase_minus_0_dot_20_dot_0/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
    from org/jruby/Ruby.java:577:in `runScript'
    from org/jruby/Ruby.java:480:in `runNormally'
    from org/jruby/Ruby.java:354:in `runFromMain'
    from org/jruby/Main.java:229:in `run'
    from org/jruby/Main.java:110:in `run'
    from org/jruby/Main.java:94:in `main'
    from /home/hadoop/hbase-0.20.0/bin/../bin/hirb.rb:384:in `get'
    from (hbase):4

hbase(main):004:0> get '.META.', 'TestTable,0001749889,1250092414985'
09/08/14 12:28:13 DEBUG client.HConnectionManager$TableServers: Cache hit for row in tableName .META.: location server 192.168.0.196:60020, location region name .META.,,1
COLUMN                 CELL
 historian:assignment  timestamp=1250108456441, value=Region assigned to server server195,60020,1250108376779
 historian:compaction  timestamp=1250109313965, value=Region compaction completed in 35sec
 historian:open        timestamp=1250108459484, value=Region opened on server: server195
 historian:split       timestamp=1250092447915, value=Region split from: TestTable,0001634945,1250035163027
 info:regioninfo       timestamp=1250109315260, value=REGION => {NAME => 'TestTable,0001749889,1250092414985', STARTKEY => '0001749889', ENDKEY => '0001866010', ENCODED => 1707908074, OFFLINE => true, TABLE => {{NAME => 'TestTable', FAMILIES => [{NAME => 'info', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}

stack-3 wrote:

Is that region offline? Do a: hbase> get '.META.', 'TestTable,0001749889,1250092414985', {COLUMNS => 'info'}. If so, can you get its history so we can figure out how it went offline? (See region history in the UI, or grep for it in the master logs.)

St.Ack

On Fri, Aug 14, 2009 at 9:55 AM, llpind sonny_h...@hotmail.com wrote:

Hey Stack, I tried the following command: hadoop-0.20.0/bin/hadoop jar hbase-0.20.0/hbase-0.20.0-test.jar randomWrite 10. Running the map/reduce job, it failed with the following exception on each node: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server
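For reference, the rowcounter check suggested above is a MapReduce job bundled with the HBase jar. An invocation might look like the sketch below; the jar path follows the 0.20 layout used elsewhere in this thread, the output directory and column name are placeholders, and the exact argument order varies by version, so check the job's usage message first:

```
# Hypothetical invocation (needs a running Hadoop/HBase cluster):
# RowCounter takes an output dir, a table name, and one or more columns.
hadoop-0.20.0/bin/hadoop jar hbase-0.20.0/hbase-0.20.0.jar rowcounter \
    rowcounter-output TestTable info:
```

If the final counter matches the number of rows you expect, the table has no holes across region boundaries.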
Re: Informal Meetup - San Fran area 6,7,8 September?
I'm game, only you are arriving at a bad time, Tim. That's Labor Day w/e. Fellas won't be around (generally); they'll be out in the desert covered in dust, dancing around giant dolls. So the 8th is probably the day you'll catch most of the SF crew. Just FYI. Can help you w/ places to go, np.

St.Ack

On Fri, Aug 14, 2009 at 9:46 AM, tim robertson timrobertson...@gmail.com wrote:

Hi All, I'm going to be in the San Fran area the 6,7,8 September and would love the chance to meet with some of the HBase users and developers, if anyone is interested? I work with the Global Biodiversity Information Facility (GBIF), a network of several thousand databases publishing data using well-defined XML standards. We crawl and build an index of this information that currently resides in MySQL, with 180 million records in each of the 2 largest tables, and we are outgrowing MySQL. We already use Hadoop for various processes, but are about to try HBase as the backend store after doing various tests recently. We are not a huge cluster (16 nodes), but I think we are a nice case study, and because we are able to document openly and freely, it could be something to reference from the HBase wikis. We are building search indexes, running statistical reports, annotating records (geocoding, quality control), creating maps (e.g. tile layers), etc., so the output is technically quite interesting and the data is used for all kinds of scientific analysis. All our code is open source. We are a small team (3-4 developers), so we would very much like the opportunity to pick the brains of others. We are keen to help improve HBase as well; probably more in a testing capacity than committing, due to our workloads, but we will do whatever we can. Cheers, Tim
Re: data loss with hbase 0.19.3
Or just add the below to cron:

echo "flush 'TABLENAME'" | ./bin/hbase shell

Or adjust the configuration in hbase so it flushes once a day (see hbase-default.xml for all options).

St.Ack

On Fri, Aug 14, 2009 at 2:13 AM, Chen Xinli chen.d...@gmail.com wrote:

Thanks for your suggestion. As our insertion is daily (we insert lots of records at a fixed time), can we just call HBaseAdmin.flush to avoid loss? I have done some experiments and found it works. I wonder if it will cause some other problem?

2009/8/14 Ryan Rawson ryano...@gmail.com

HDFS doesn't allow you to read partially written files; it reports the size as 0 until the file is properly closed, so under a crash scenario you are in trouble. The best options right now are to:
- don't let hbase crash (not as crazy as this sounds)
- consider experimenting with some newer hdfs stuff
- wait for hadoop 0.21

In the meantime, you will suffer loss if hbase regionservers crash. That is a crash as in hard crash; controlled shutdowns flush, and you don't lose data then. Sorry for the confusion!

-ryan

On Thu, Aug 13, 2009 at 10:56 PM, Chen Xinli chen.d...@gmail.com wrote:

For the HLog, I find an interesting problem. I set the optionallogflushinterval to 1, that's 10 seconds; but it flushes with an interval of 1 hour. After the hlog file is generated, I stop hdfs and then kill the hmaster and regionservers; then I start them all again, but the hmaster doesn't restore records from the hlog, so the records are lost again. Is there something wrong?

2009/8/14 Chen Xinli chen.d...@gmail.com

Thanks Daniel. As you said the latest version has done much to avoid data loss, would you please give some examples? I read the conf file and api, and found some related functions:

1. In hbase-default.xml, hbase.regionserver.optionallogflushinterval is described as "Sync the HLog to the HDFS after this interval if it has not accumulated enough entries to trigger a sync." I issued one update to my table, but there were no hlog files after the specified interval.
Does this setting not work, or am I misunderstanding it?

2. HBaseAdmin.flush(tableOrRegionName). It seems that this function flushes the memcache to HStoreFiles. Should I call this function to avoid data loss after every few thousand updates?

3. In HTable, there is also a function flushCommits. Where does it flush to, memcache or hdfs?

Actually we have a crawler, and we want to store webpages (about 1 billion) in hbase. What shall we do to avoid data loss? Any suggestion is appreciated. By the way, we use hadoop 0.19.1 + hbase 0.19.3. Thanks

2009/8/6 Jean-Daniel Cryans jdcry...@apache.org

Chen, The main problem is that appends are not supported in HDFS, so HBase simply cannot sync its logs to it. But we did some work to make that story better. The latest revision in the 0.19 branch and 0.20 RC1 both solve much of the data loss problem, but it won't be near perfect until we have appends (supposed to be available in 0.21). J-D

On Thu, Aug 6, 2009 at 12:45 AM, Chen Xinli chen.d...@gmail.com wrote:

Hi, I'm using hbase 0.19.3 on a cluster with 30 machines to store web data. We had a power outage a few days ago, and I found that much of the web data was lost. I have searched google and found it's a meta flush problem. I know there is much performance improvement in 0.20.0; is the data loss problem handled in the new version?

-- Best Regards, Chen Xinli
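St.Ack's cron suggestion at the top of this thread could be written out as a crontab fragment like the one below. The schedule, the table name 'webpages' (picked to match the crawler use case), and the install path are all placeholders:

```
# Placeholder crontab entry: flush the table once a day at 03:00,
# shortly after the daily bulk insert finishes. Adjust path and table name.
0 3 * * * echo "flush 'webpages'" | /opt/hbase/bin/hbase shell
```

This only bounds the loss window to one day; anything inserted after the last flush is still exposed until appends arrive in hadoop 0.21.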
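To make the flushCommits vs. HBaseAdmin.flush distinction in questions 2 and 3 concrete: flushCommits only pushes the client-side write buffer over to the regionserver's memcache (which is still volatile memory), while an admin flush is what writes the memcache out as HStoreFiles. Here is a toy model in plain Java sketching those two levels; it uses no HBase APIs, and every name in it is invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the two flush levels discussed in the thread. NOT HBase code:
// all class and method names here are invented for illustration only.
class ToyTable {
    private final List<String> clientBuffer = new ArrayList<>(); // like HTable's write buffer
    private final List<String> memcache = new ArrayList<>();     // regionserver memory (volatile)
    private final List<String> durableStore = new ArrayList<>(); // like HStoreFiles on HDFS

    void put(String row) {
        clientBuffer.add(row);
    }

    // Analogous to HTable.flushCommits(): moves edits from the client
    // buffer to the server's memcache. Still lost on a hard crash.
    void flushCommits() {
        memcache.addAll(clientBuffer);
        clientBuffer.clear();
    }

    // Analogous to HBaseAdmin.flush(table): persists the memcache
    // to durable store files.
    void adminFlush() {
        durableStore.addAll(memcache);
        memcache.clear();
    }

    int inMemoryCount() { return memcache.size(); }
    int durableCount()  { return durableStore.size(); }
}
```

In this model, rows that have only been flushCommits'd would still disappear in a hard regionserver crash; only rows that reached durableStore survive, which is why calling HBaseAdmin.flush after a daily bulk insert helps on 0.19.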