HBase backup and outage scenarios in practice?

2011-07-14 Thread Steinmaurer Thomas
Hello, we are currently evaluating HBase for a project. In respect to available backup options, we found the following blog post here: http://blog.sematext.com/2011/03/11/hbase-backup-options/ Probably well known for you guys here. So, am I right when I say that there is no online

RE: WrongRegionException and inconsistent table found

2011-07-14 Thread Robert Gonzalez
Yeah, I was the one that ran into the same problem. Took the check meta script and converted it based on some of Stack's suggestions and our own experiments. Also fixed boundary conditions for regions at the beginning and at the end of the chain. In addition, we added the assign in here as

Re: HBase backup and outage scenarios in practice?

2011-07-14 Thread Michael Segel
Not sure what you read in Otis' blog but pretty sure it's out of date. Check out MapR stuff. Sent from my Palm Pre on AT&T On Jul 14, 2011 6:57 AM, Steinmaurer Thomas <thomas.steinmau...@scch.at> wrote: Hello, we are currently evaluating HBase for a project. In respect to

Re: Fetching and iterating through all column values belonging to all Timestamps of a Row

2011-07-14 Thread Narayanan K
Thanks Srikanth for your replies. I wrote Java code that picks up all the versioned cell values using the setMaxVersions function of the Get API. For the benefit of others who were looking for an implementation of setMaxVersions, please find the code below: import java.io.IOException; import
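
The code itself is cut off in the preview. As a minimal sketch of the approach described, assuming the 0.90-era Java client API, with the table name, row key, and output format as placeholder assumptions:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class AllVersionsExample {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");   // placeholder table name
        Get get = new Get(Bytes.toBytes("row1"));     // placeholder row key
        get.setMaxVersions();                         // return every stored version, not just the latest
        Result result = table.get(get);
        for (KeyValue kv : result.raw()) {            // one KeyValue per version of each cell
          System.out.println(Bytes.toString(kv.getQualifier()) + " @ " + kv.getTimestamp()
              + " = " + Bytes.toString(kv.getValue()));
        }
        table.close();
      }
    }

Note that the number of versions actually kept is capped by the column family's VERSIONS setting, so the family must be configured to retain the history you want to read back.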

Re: HBase backup and outage scenarios in practice?

2011-07-14 Thread Ted Dunning
To clarify what Mike means here, MapR supports HBase as well as atomic, transactionally correct snapshots. These snapshots allow point in time recovery of the complete state of an HBase data set. There is no performance hit when taking the snapshot and no maintenance impact relative to HBase

Re: HBase backup and outage scenarios in practice?

2011-07-14 Thread Doug Meil
Otis' blog entry is from 4 months ago. It still applies for the Apache stack. The HBase book FAQ has a link to that blog for the backup question. On 7/14/11 12:14 PM, Michael Segel michael_se...@hotmail.com wrote: Not sure what you read in Otis' blog but pretty sure it's out of date. Check

Avro connector

2011-07-14 Thread Andrew Purtell
HBASE-2400 introduced a new connector contrib architecturally equivalent to the Thrift connector, but using Avro serialization and associated transport and RPC server work. However, it remains unfinished, was developed against an old version of Avro, is currently not maintained, and is regarded

Re: Avro connector

2011-07-14 Thread Doug Meil
+1 On 7/14/11 2:16 PM, Andrew Purtell apurt...@apache.org wrote: HBASE-2400 introduced a new connector contrib architecturally equivalent to the Thrift connector, but using Avro serialization and associated transport and RPC server work. However, it remains unfinished, was developed against an

Slow insertions in our hbase setup

2011-07-14 Thread Mayuresh
Hi, We have an HBase + Hadoop setup on a 10-machine cluster. In my earlier tests we had the default region size (256M) and the insertion of 10,000,000 rows into an HBase table took around 23 minutes. For the new tests, I changed the region size to 128M. Now the insertions are going

Re: Slow insertions in our hbase setup

2011-07-14 Thread Doug Meil
Hi there- You probably want to read this... http://hbase.apache.org/book.html#performance As you've already noticed, going with regions smaller than the default isn't a good idea for large/large-ish tables. On 7/14/11 2:49 PM, Mayuresh mayuresh.kshirsa...@gmail.com wrote: Hi, We have a
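
As a rough sketch of the knob in question (assuming the 0.90-era Java admin API; the table name and size are placeholders, not values from this thread): hbase.hregion.max.filesize governs when regions split, and it can also be raised per table through the table descriptor rather than lowered below the default:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RegionSizeExample {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        byte[] tableName = Bytes.toBytes("mytable");   // placeholder table name

        // Per-table override of the split threshold (hbase.hregion.max.filesize
        // in the region servers' hbase-site.xml is the cluster-wide equivalent).
        HTableDescriptor desc = admin.getTableDescriptor(tableName);
        desc.setMaxFileSize(512L * 1024 * 1024);       // split at ~512 MB instead of shrinking regions
        admin.disableTable(tableName);
        admin.modifyTable(tableName, desc);
        admin.enableTable(tableName);
      }
    }

Smaller regions mean more splits and more region movement during a bulk load, which is consistent with the slowdown reported after halving the region size.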

How to convert an uncompressed HTable to LZO compression

2011-07-14 Thread 陈加俊
How to convert an uncompressed HTable to LZO compression? --

Re: performance problem during read

2011-07-14 Thread Stack
This is interesting. Any chance that the cells on the regions hosted on server A are 5M in size? The hfile block sizes are by default configured to be 64k, but rarely would an hfile block be exactly 64k. We do not cut the hfile block content at 64k exactly. The hfile block boundary will be

Re: data structure

2011-07-14 Thread Stack
On Thu, Jul 14, 2011 at 12:52 PM, Andre Reiter a.rei...@web.de wrote: now we are running mapreduce jobs to generate a report: for example, we want to know how many impressions were made by all users in the last x days. Therefore the scan of the MR job is running over all data in our HBase table

Re: data structure

2011-07-14 Thread Doug Meil
Hi there- A few high-level suggestions... re: to generate a report: for example we want to know how many impressions were made by all users in the last x days Can you create a summary table by day (via MR job), and then have your ad-hoc report hit the summary table? Re: and with the data

Re: How to convert an uncompressed HTable to LZO compression

2011-07-14 Thread Stack
Offline the table, alter it to add lzo compression (after verifying you have lzo installed correctly all over your cluster because it gets ugly when partially or incompletely installed -- see http://hbase.apache.org/book.html#compression), then online the table again. New compressions made post
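
A minimal sketch of that sequence through the Java admin API (0.90-era; the table and family names are placeholders, and LZO must already be installed and verified on every node as noted above):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.io.hfile.Compression;
    import org.apache.hadoop.hbase.util.Bytes;

    public class EnableLzoExample {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        byte[] tableName = Bytes.toBytes("mytable");                    // placeholder table name

        admin.disableTable(tableName);                                  // offline the table
        HTableDescriptor desc = admin.getTableDescriptor(tableName);
        HColumnDescriptor family = desc.getFamily(Bytes.toBytes("cf")); // placeholder family name
        family.setCompressionType(Compression.Algorithm.LZO);           // switch the family to LZO
        admin.modifyTable(tableName, desc);                             // push the altered schema
        admin.enableTable(tableName);                                   // online the table again

        // Existing store files stay uncompressed until they are rewritten;
        // a major compaction rewrites them, after which everything is LZO.
        admin.majorCompact(tableName);
      }
    }

The shell's disable/alter/enable sequence achieves the same thing; either way, verify LZO on every node first (see the compression section of the book linked above).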

Re: Avro connector

2011-07-14 Thread Stack
+1 St.Ack On Thu, Jul 14, 2011 at 11:16 AM, Andrew Purtell apurt...@apache.org wrote: HBASE-2400 introduced a new connector contrib architecturally equivalent to the Thrift connector, but using Avro serialization and associated transport and RPC server work. However, it remains unfinished,

Re: data structure

2011-07-14 Thread Andre Reiter
Stack wrote: On Thu, Jul 14, 2011 at 12:52 PM, Andre Reiter a.rei...@web.de wrote: Why is 70 seconds too long for a report? 70 seconds seems like a short mapreduce job (to me). You don't have that many regions. How fast would you like this operation to complete? The report you describe

Re: data structure

2011-07-14 Thread Andre Reiter
- Original Message - From: Doug Meil doug.m...@explorysmedical.com Sent: Thu Jul 14 2011 22:29:16 GMT+0200 (CET) To: CC: Subject: Re: data structure Hi there- A few high-level suggestions... re: to generate a report: for example we want to know how many impressions were done by all

Re: data structure

2011-07-14 Thread Ted Dunning
You can play tricks with the arrangement of the key. For instance, you can put date at the end of the key. That would let you pull data for a particular user for a particular date range. The date should not be a time stamp, but should be a low-res version of time (day-level resolution might be
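
A small sketch of that key layout (the separator, date format, and user id below are illustrative assumptions, not details from the thread):

    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class UserDayKeyExample {
      // Compose the row key as <userId>#<yyyyMMdd>: day-level resolution with the date
      // at the end, so all days for one user sort together and a date range for that
      // user becomes a single contiguous scan.
      static byte[] rowKey(String userId, String yyyyMMdd) {
        return Bytes.toBytes(userId + "#" + yyyyMMdd);
      }

      // Scan user "u42" from 2011-07-01 (inclusive) to 2011-07-15 (exclusive).
      static Scan scanUserRange() {
        return new Scan(rowKey("u42", "20110701"), rowKey("u42", "20110715"));
      }
    }

Scanning one user's date range is then cheap; a report across all users for the last x days still touches every user, which is where the summary-table suggestion earlier in the thread comes in.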

Re: data structure

2011-07-14 Thread Andre Reiter
- Original Message - From: Ted Dunning tdunn...@maprtech.com Sent: Thu Jul 14 2011 23:17:20 GMT+0200 (CET) To: CC: Subject: Re: data structure You can play tricks with the arrangement of the key. For instance, you can put date at the end of the key. That would let you pull data for a

Re: data structure

2011-07-14 Thread Ted Dunning
Put all of the requests for a particular date together. Today, you update the value for a particular user by appending to the current value. All you need to do is to change that slightly by appending new data to the value for the user + the current date. If you want to migrate your database

Re: Deadlocked Regionserver process

2011-07-14 Thread lohit
Is it possible to open a JIRA with the full stack trace? Or, if you point to the full stack trace, one of us can open a JIRA for you. 0.90.4 will be out soon and maybe we should see if there is a fix for the problem below? 2011/7/14 Matt Davies matt.dav...@tynt.com Hey everyone, We periodically see a

Re: Deadlocked Regionserver process

2011-07-14 Thread Matt Davies
Thanks. I've created HBASE-4101. On Thu, Jul 14, 2011 at 3:44 PM, lohit lohit.vijayar...@gmail.com wrote: Is it possible to open a JIRA with the full stack trace? Or, if you point to the full stack trace, one of us can open a JIRA for you. 0.90.4 will be out soon and maybe we should see if there is a

Re: Deadlocked Regionserver process

2011-07-14 Thread Stack
What Lohit says but also, what JVM are you running and what options are you feeding it? The stack trace is a little crazy (especially the mixing in of resource bundle loading). We saw something similar over in HBASE-3830 when someone was running a profiler. Is that what is going on here? Thanks,

Re: Deadlocked Regionserver process

2011-07-14 Thread Matt Davies
We aren't profiling right now. Here's what is in the hbase-env.sh export TZ=US/Mountain export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/home/hadoop/gc-hbase.log" export HBASE_MANAGES_ZK=false export

Re: Deadlocked Regionserver process

2011-07-14 Thread Stack
Thank you. I've added the below to the issue. Will take a looksee. If it's an issue, will include a fix in 0.90.4. St.Ack On Thu, Jul 14, 2011 at 3:07 PM, Matt Davies matt.dav...@tynt.com wrote: We aren't profiling right now. Here's what is in the hbase-env.sh export TZ=US/Mountain export

Re: HBase backup and outage scenarios in practice?

2011-07-14 Thread Stack
On Thu, Jul 14, 2011 at 4:57 AM, Steinmaurer Thomas thomas.steinmau...@scch.at wrote: We would also like to try out various outage scenarios, e.g. pulling the network cable out of one node or resetting the server etc ... while the system is in use ... Anybody tried different outage scenarios

Regions not getting reassigned if RS is brought down

2011-07-14 Thread Shrijeet Paliwal
Hi Everyone, HBase version: 0.90.3, Hadoop version: cdh3u0, 2 region servers, zookeeper quorum managed by HBase. I was doing some tests and it seemed regions are not getting reassigned by the master if an RS is brought down. Here are the steps: 0. Cluster in a steady state. Pick a random key: k1

Re: performance problem during read

2011-07-14 Thread Mingjian Deng
Hi Stack: Servers A and B are the same in the cluster. If I set hfile.block.cache.size=0.1 on another server, the problem reappears. But when I set hfile.block.cache.size=0.15 or more, it doesn't. So I think you can test on your own cluster with the following btrace code:
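
The btrace code itself is cut off in the preview. For reference, hfile.block.cache.size is the fraction of the region server heap given to the HFile block cache and is normally set in the region server's hbase-site.xml; the snippet below is only a sketch of the property name and type (not the missing btrace code, and not a client-side tuning method):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class BlockCacheSizeExample {
      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // Fraction of the region server heap reserved for the block cache.
        // In this thread 0.1 reproduced the slow reads and 0.15 or more did not.
        conf.setFloat("hfile.block.cache.size", 0.15f);
        // The 0.2f below is only the fallback for this lookup.
        System.out.println("block cache fraction = "
            + conf.getFloat("hfile.block.cache.size", 0.2f));
      }
    }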

RE: Deadlocked Regionserver process

2011-07-14 Thread Ramkrishna S Vasudevan
Hi, I think this, as Stack mentioned in HBASE-3830, could be due to a profiler. But the problem is in the use of the Data class. JD had once replied to the mailing list under the heading Re: Possible dead lock. JD's reply: I see what you are

Re: Regions not getting reassigned if RS is brought down

2011-07-14 Thread Shrijeet Paliwal
I have narrowed it down to the following: // Server to handle client requests String machineName = DNS.getDefaultHost(conf.get("hbase.regionserver.dns.interface", "default"), conf.get("hbase.regionserver.dns.nameserver", "default")); I am not using the default interface for the RS. I

RE: Deadlocked Regionserver process

2011-07-14 Thread Ramkrishna S Vasudevan
Sorry, it's not the Data class but the Date class that is the problem. JD had once replied to the mailing list under the heading Re: Possible dead lock. :) Regards Ram -Original Message- From: Ramkrishna S Vasudevan [mailto:ramakrish...@huawei.com] Sent: Friday, July 15, 2011 9:26 AM To:

RE: Hadoop/HBase Upgrade from 0.20.3 to 0.90.2

2011-07-14 Thread Zhong, Andy
St.Ack, It's weird. I upgraded from HBase 0.20.3 to HBase 0.90.2 by replacing the Hadoop JAR file from the Hadoop 0.20.2 release with the sync-supporting Hadoop JAR file found in HBase 0.90.2. It was successful before, but now we are facing an error from the HBase shell: