Re: data loss with hbase 0.19.3

2009-08-19 Thread Schubert Zhang
On Fri, Aug 14, 2009 at 10:03 AM, Chen Xinli wrote:
> Thanks Daniel. As you said the latest version has done much to avoid data loss, would you please give some examples?
>
> I read the conf file and API, and found some related functions:
> 1. in hbase-default.xml, "hbase.regionserver.optionallogflushinterval" …

Re: data loss with hbase 0.19.3

2009-08-15 Thread Andrew Purtell
Yes, if you are using Put.setWriteToWAL(false), then the data won't have any persistence until after a memstore flush.

- Andy

--- On Sat, 8/15/09, Amandeep Khurana wrote: …
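(For context: a minimal sketch of the client-side switch being discussed, written against the 0.20 Java API; the table, family, and qualifier names are placeholders.)

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class WalToggleSketch {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), "TABLENAME");
    Put put = new Put(Bytes.toBytes("row1"));
    put.add(Bytes.toBytes("family"), Bytes.toBytes("qualifier"), Bytes.toBytes("value"));
    // false skips the write-ahead log: faster bulk loads, but the edit
    // only survives a region server crash once the memstore has flushed.
    put.setWriteToWAL(false);
    table.put(put);
  }
}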

Re: data loss with hbase 0.19.3

2009-08-15 Thread Amandeep Khurana
Is this needed with 0.20 too? I am skipping the WALs during imports so that makes it even less fault tolerant...

On 8/14/09, stack wrote:
> Or just add below to cron:
>
> echo "flush TABLENAME" | ./bin/hbase shell
>
> Or adjust the configuration in hbase so it flushes once a day (see hbase-default.xml for all options). …

Re: data loss with hbase 0.19.3

2009-08-14 Thread Chen Xinli
Yeah, thanks. I think it works for me. Thanks to all of you for the responses.

2009/8/15 stack
> Or just add below to cron:
>
> echo "flush TABLENAME" | ./bin/hbase shell
>
> Or adjust the configuration in hbase so it flushes once a day (see hbase-default.xml for all options).
>
> St.Ack …

Re: data loss with hbase 0.19.3

2009-08-14 Thread stack
Or just add below to cron:

echo "flush TABLENAME" | ./bin/hbase shell

Or adjust the configuration in hbase so it flushes once a day (see hbase-default.xml for all options).

St.Ack

On Fri, Aug 14, 2009 at 2:13 AM, Chen Xinli wrote:
> Thanks for your suggestion.
> As our insertion is daily, that's …

Re: data loss with hbase 0.19.3

2009-08-14 Thread Jonathan Gray
Yes, you can definitely do that. We have tables that we put constraints on in that way. Flushing the table ensures all data is written to HDFS, and then you will not have any data loss under HBase fault scenarios.

Chen Xinli wrote:
> Thanks for your suggestion. As our insertion is daily, that's …

Re: data loss with hbase 0.19.3

2009-08-14 Thread Chen Xinli
Thanks for your suggestion. As our insertion is daily, that is, we insert lots of records at a fixed time, can we just call HBaseAdmin.flush to avoid loss? I have done some experiments and found that it works. I wonder if it will cause some other problem?

2009/8/14 Ryan Rawson
> HDFS doesn't allow you to …
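(A minimal sketch of that approach against the 0.19/0.20 Java client API; "TABLENAME" is a placeholder.)

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class FlushAfterImportSketch {
  public static void main(String[] args) throws Exception {
    // Ask the region servers to flush the table's memstores to HDFS.
    // Until a flush (or a working log sync), recent edits live only in memory.
    HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
    admin.flush("TABLENAME");
  }
}

This is the programmatic equivalent of the shell's "flush TABLENAME" command mentioned elsewhere in the thread.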

Re: data loss with hbase 0.19.3

2009-08-13 Thread Ryan Rawson
HDFS doesn't allow you to read partially written files; it reports the size as 0 until the file is properly closed, so under a crash scenario you are in trouble. The best options right now are to:
- don't let hbase crash (not as crazy as this sounds)
- consider experimenting with some newer hdfs stuff …

Re: data loss with hbase 0.19.3

2009-08-13 Thread Chen Xinli
For the HLog, I found an interesting problem. I set the optionallogflushinterval to 10000, that's 10 seconds, but it flushes at an interval of 1 hour. After the hlog file is generated, I stop hdfs and then kill the hmaster and regionservers; when I start everything again, the hmaster doesn't restore records from …

Re: data loss with hbase 0.19.3

2009-08-13 Thread Chen Xinli
Thanks Daniel. As you said the latest version has done much to avoid data loss, would you please give some examples?

I read the conf file and API, and found some related functions:
1. in hbase-default.xml, "hbase.regionserver.optionallogflushinterval", described as "Sync the HLog to the HDFS after this interval …"
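(For reference, an override of that setting would go in hbase-site.xml; the 10000 here is illustrative, and the value is in milliseconds.)

<!-- hbase-site.xml override; 10000 ms = 10 seconds (illustrative value) -->
<property>
  <name>hbase.regionserver.optionallogflushinterval</name>
  <value>10000</value>
</property>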

Re: data loss with hbase 0.19.3

2009-08-06 Thread Jean-Daniel Cryans
Chen,

The main problem is that appends are not supported in HDFS, so HBase simply cannot sync its logs to it. But we did some work to make that story better. The latest revision in the 0.19 branch and 0.20 RC1 both solve much of the data loss problem, but it won't be near perfect until we have append …

data loss with hbase 0.19.3

2009-08-05 Thread Chen Xinli
Hi, I'm using hbase 0.19.3 on a cluster with 30 machines to store web data. We had a power-off some days ago and I found that much of the web data was lost. I have searched Google and found it's a meta flush problem. I know there is much performance improvement in 0.20.0; is the data loss problem handled in the new …