Re: is there any way to copy data from one table to another while updating rowKey??

2012-01-12 Thread T Vinod Gupta
Stack, here are some of the failures I'm getting now. I don't know what's wrong with my HBase right now. I literally stopped my main processes that write to the store. I wrote an app to delete a bunch of old data that we don't need anymore, so that app is doing scans and deletes (specific columns of

Re: heavy writing and compaction storms

2012-01-12 Thread Jean-Daniel Cryans
On Thu, Jan 12, 2012 at 3:47 PM, Neil Yalowitz wrote: > Thanks for the response, J-D.  Some followups: > > Would love to, but I'm dealing with a "increment a counter" issue.  I > created a separate email thread for that question since it's off-topic from > the compaction storms. And I replied :)

Re: heavy writing and compaction storms

2012-01-12 Thread Neil Yalowitz
Thanks for the response, J-D. Some followups: > First you should consider using bulk import instead of a massive MR job. Would love to, but I'm dealing with a "increment a counter" issue. I created a separate email thread for that question since it's off-topic from the compaction storms. > - m

Re: Missing region data.

2012-01-12 Thread Stack
On Thu, Jan 12, 2012 at 1:30 PM, James Estes wrote: > Thanks for the advice.  We don't have those logs anymore.  Is there > anyway for hbase to recover gracefully here?  The compactions piled up > behind it until we bounced the region server.  Is there already a > ticket filed for recovering from

Re: bulk import and counting increments

2012-01-12 Thread Jean-Daniel Cryans
You could MR the data while it's still in HDFS, a simple count, and then insert those counts separately from the data. It would also reduce the number of increment calls (unless you have a number of incremented cells that is close to the number of increments you have to do). J-D On Thu, Jan 12, 2

Re: upgrade 0.90 to 0.92 - HFile v2

2012-01-12 Thread Stack
On Thu, Jan 12, 2012 at 2:44 PM, Neil Yalowitz wrote: > Is anyone familiar with the upgrade path from HBase 0.90 to 0.92 or greater? > > Specifically, is there a way to upgrade the existing HFiles to v2, or > should this not be attempted? 0.92.0 will do it for you. It can read v1 and v2. Any v1

Re: upgrade 0.90 to 0.92 - HFile v2

2012-01-12 Thread Jean-Daniel Cryans
It's automatic when you start with the 0.92 binaries and it does a compaction. There's no rollback. Stack is supposed to be writing the upgrade documentation (no pressure dude! wink wink) J-D On Thu, Jan 12, 2012 at 2:44 PM, Neil Yalowitz wrote: > Is anyone familiar with the upgrade path from H

upgrade 0.90 to 0.92 - HFile v2

2012-01-12 Thread Neil Yalowitz
Is anyone familiar with the upgrade path from HBase 0.90 to 0.92 or greater? Specifically, is there a way to upgrade the existing HFiles to v2, or should this not be attempted?

Re: is there any way to copy data from one table to another while updating rowKey??

2012-01-12 Thread Stack
And what is happening on the server ip-10-68-145-124.ec2.internal:60020 such that 14 attempts at getting a region failed? Is that region online during this time or being moved? If not online, why not? Was the server taking too long to open the region (because of high load)? Grep around the region

Re: EC2 remote client woes

2012-01-12 Thread Jean-Daniel Cryans
Interesting, could you start the shell with "-d" and pastebin all the debug that comes out after the first command? BTW the shell does work on remote clusters, so it's some other issue. J-D On Thu, Jan 12, 2012 at 1:56 PM, Peter Wolf wrote: > Sorry, that's a typo in my email.  Here is my config

Re: EC2 remote client woes

2012-01-12 Thread Peter Wolf
Sorry, that's a typo in my email. Here is my config file again (that doesn't work) hbase.zookeeper.quorum ip-AA-BBB-C-DDD.ec2.internal Standalone Server I double checked, and I am using ip-AA-BBB-C-DDD.ec2.internal consistently in config files and code. P On 1/12/12 4:24 PM, Jean-Da

Re: Missing region data.

2012-01-12 Thread James Estes
Thanks for the advice. We don't have those logs anymore. Is there anyway for hbase to recover gracefully here? The compactions piled up behind it until we bounced the region server. Is there already a ticket filed for recovering from double assignment issues like this? Our current plan would be

Re: EC2 remote client woes

2012-01-12 Thread Jean-Daniel Cryans
Yes, it's the same thing, which is why I think the additional ec2.internal in your hbase-site is suspicious. Let me reiterate: This works: echo stat|nc ip-XX-YYY-Z-QQQ.ec2.internal 2181 But this config doesn't: ip-XX-YYY-Z-QQQ.ec2.internal.ec2.internal Now what happens if you just use the same

Re: EC2 remote client woes

2012-01-12 Thread Peter Wolf
I'm a N00B, so I'm not sure of anything... but it is working now using the Java Client API, and XXX.ec2.internal address on both server and client. The problem seems to be 'hbase shell', which is odd as I would have thought it sat on top of the Java API. P On 1/12/12 1:22 PM, Jean-Daniel C

Re: bulk loading and RegionObservers

2012-01-12 Thread Andrew Purtell
> I think that the people demanding such method of access would like to have the > ability to trigger the action on a row level (so again when a Put with new > values come). But I think that this would not scale - it would take a long > time > to scan the new region and fire prePut() call on RO fo

bulk import and counting increments

2012-01-12 Thread Neil Yalowitz
Hi all, When performing a bulk import into HBase, what methods are available to increment a counter? To describe the problem: a large dataset comes in, and the most efficient way to get that data into an HBase table is to bulk load, as described here: http://hbase.apache.org/bulk-loads.html The

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

2012-01-12 Thread Doug Meil
Thanks Ian! :-) On 1/12/12 2:03 PM, "Ian Varley" wrote: >Vinod, > >The answers to your questions (and so many more!) are easily found in the >HBase Reference Guide: > >http://hbase.apache.org/book.html#schema.versions > >"Excess versions are removed during major compactions." > > - Ian "Doug

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

2012-01-12 Thread Ian Varley
Vinod, The answers to your questions (and so many more!) are easily found in the HBase Reference Guide: http://hbase.apache.org/book.html#schema.versions "Excess versions are removed during major compactions." - Ian "Doug Meil" Varley ;) On Jan 12, 2012, at 10:59 AM, T Vinod Gupta wrote: Th

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

2012-01-12 Thread T Vinod Gupta
Thanks, I'll take a look. Meanwhile, I just decreased the versions for my column families from 3 to 1 and triggered another compaction. Does this make HBase delete the previous versions and keep only the latest one? Thanks On Thu, Jan 12, 2012 at 10:57 AM, kisalay wrote: > Vinod, > > U will have

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

2012-01-12 Thread kisalay
Vinod, You will have to merge the regions after the major compaction to decrease the number of regions that you have. You can do either an online or an offline merge. You can pick up the online merge JRuby script from the JIRA https://issues.apache.org/jira/browse/HBASE-1621 ~Kisalay On Fri, Ja

ext3 vs. ext4

2012-01-12 Thread Ted Tuttle
Hello All- I've searched this list and re-read the section in the George book on this topic. From the book I get the impression ext3 is used more widely but ext4 is gaining popularity. From this list I see quite a few ext4 recommendations but very little in the way of justification. Is there anyo

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

2012-01-12 Thread T Vinod Gupta
so there is no way to make the regions merge since they are much below region max size? On Thu, Jan 12, 2012 at 10:40 AM, Doug Meil wrote: > > Major compactions don't change the number of regions. > > > > > > On 1/12/12 1:35 PM, "T Vinod Gupta" wrote: > > >this is probably a rookie question. but

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

2012-01-12 Thread Doug Meil
Major compactions don't change the number of regions. On 1/12/12 1:35 PM, "T Vinod Gupta" wrote: >this is probably a rookie question. but my understanding is that if i >increase the region max file size and then initiate major compaction >manually, the number of regions should ideally go do

does increasing region filesize followed by major compactions supposed to reduce number of regions?

2012-01-12 Thread T Vinod Gupta
This is probably a rookie question, but my understanding is that if I increase the region max file size and then initiate a major compaction manually, the number of regions should ideally go down by the factor by which I increased the region max file size. Isn't that true? I'm not seeing that happenin

Re: EC2 remote client woes

2012-01-12 Thread Jean-Daniel Cryans
Your config file on the remote machine has: ip-XX-YYY-Z-QQQ.ec2.internal.ec2.internal You sure about the extra ec2.internal? J-D On Thu, Jan 12, 2012 at 9:26 AM, Peter Wolf wrote: > Oh yeah!  The code did it :-D > > For those that come after, I guess 'hbase shell' is broken for remote > access

Re: heavy writing and compaction storms

2012-01-12 Thread Jean-Daniel Cryans
Hi, First you should consider using bulk import instead of a massive MR job. If you decide against that, then - make sure you pre-split: http://hbase.apache.org/book/important_configurations.html#disable.splitting - regarding major compactions, usually people switch off the automatic mode and c

heavy writing and compaction storms

2012-01-12 Thread Neil Yalowitz
Hi all, What strategies do HBase 0.90 users employ to deal with or avoid the so-called "compaction storm"? I'm referring to the issue referred to in 2.8.2.7 here: http://hbase.apache.org/book.html#important_configurations The MR job I'm working with executes many PUTs during the Map phase with

Re: support for M language

2012-01-12 Thread Ted Yu
That's an old language. I think the answer is yes. Do you use HBase for versioning of the routines ? On Thu, Jan 12, 2012 at 9:52 AM, Jignesh Patel wrote: > Can we store Mumps routines into > HBase? > > -Jignesh >

support for M language

2012-01-12 Thread Jignesh Patel
Can we store MUMPS routines in HBase? -Jignesh

Re: is there any way to copy data from one table to another while updating rowKey??

2012-01-12 Thread Ted Yu
I think you need to manipulate the keyvalue to match the new row. Take a look at the check:

    //Checking that the row of the kv is the same as the put
    int res = Bytes.compareTo(this.row, 0, row.length,
        kv.getBuffer(), kv.getRowOffset(), kv.getRowLength());
    if(res != 0) {
      th

Re: HBase Export

2012-01-12 Thread Doug Meil
No problem! I'll add it to the book because this applies to the CopyTable utility as well, and you're not the first to ask this question. On 1/12/12 12:36 PM, "Jahangir Mohammed" wrote: >Thanks Doug, missed it. > >Thanks, >Jahangir. > >On Thu, Jan 12, 2012 at 10:34 AM, Doug Meil >wrote: >

Re: HBase Export

2012-01-12 Thread Jahangir Mohammed
Thanks Doug, missed it. Thanks, Jahangir. On Thu, Jan 12, 2012 at 10:34 AM, Doug Meil wrote: > > setCaching is set in TableInputFormat, and it relies on the following > property set in the jobconf... > > job.getConfiguration().setInt("hbase.client.scanner.caching", > batchSize); > > > > > > >

Re: EC2 remote client woes

2012-01-12 Thread Peter Wolf
Oh yeah! The code did it :-D For those that come after, I guess 'hbase shell' is broken for remote access. Use the raw Java API. Many thanks again, Mark! On 1/12/12 11:40 AM, Mark Kerzner wrote: 1. Look in the logs; 2. I think hbase shell works only locally; 3. The code below worked for me

Re: is there any way to copy data from one table to another while updating rowKey??

2012-01-12 Thread T Vinod Gupta
hbase version - hbase(main):001:0> version 0.90.3-cdh3u1, r, Mon Jul 18 08:23:50 PDT 2011 here are the different exceptions - when copying table to another table - 12/01/12 11:06:41 INFO mapred.JobClient: Task Id : attempt_201201120656_0012_m_01_0, Status : FAILED org.apache.hadoop.hbase.clie

Re: EC2 remote client woes

2012-01-12 Thread Peter Wolf
Hmm... Perhaps that is my problem. How do I find out? On 1/12/12 11:43 AM, Mark Kerzner wrote: Then where is the zookeeper sending your client to connect? On Thu, Jan 12, 2012 at 10:42 AM, Peter Wolf wrote: BTW, I think Zookeeper is responding. When I try this on the remote machine it wor

Re: EC2 remote client woes

2012-01-12 Thread Mark Kerzner
Then where is the zookeeper sending your client to connect? On Thu, Jan 12, 2012 at 10:42 AM, Peter Wolf wrote: > BTW, I think Zookeeper is responding. When I try this on the remote > machine it works... > > echo stat|nc ip-XX-YYY-Z-QQQ.ec2.internal 2181 > Zookeeper version: 3.3.3-cdh3u2--1, bu

Re: EC2 remote client woes

2012-01-12 Thread Peter Wolf
BTW, I think Zookeeper is responding. When I try this on the remote machine it works... echo stat|nc ip-XX-YYY-Z-QQQ.ec2.internal 2181 Zookeeper version: 3.3.3-cdh3u2--1, built on 10/14/2011 04:59 GMT Clients: /XX-YYY-Z-QQQ:59698[1](queued=0,recved=200,sent=202) /XX-YYY-Z-QQQ:48563[0](queued=

Re: EC2 remote client woes

2012-01-12 Thread Mark Kerzner
1. Look in the logs;
2. I think hbase shell works only locally;
3. The code below worked for me, and I don't use a config file, but give the params directly:

    public void connect() throws IOException {
        Configuration hConf = HBaseConfiguration.create();
        hConf.set(MyConstants.HBAS

Re: after adding table using add_table.rb, it is not visible (even after enabling)

2012-01-12 Thread Stack
On Thu, Jan 12, 2012 at 5:18 AM, Stanislav Barton wrote: >but after disable/enable the table in order to make the > regions to come up, there still two regions were are problems, the RS throws > NegativeArraySizeException while trying to open the region, the whole stack > trace is here: http:/

Re: hbase hregion was null or empty in -ROOT- problem while creating table

2012-01-12 Thread Stack
On Thu, Jan 12, 2012 at 2:17 AM, Pranay wrote: > Hi, > I am having problem with hbase. i have configured hadoop and it is working > fine. > But after configuring hbase, the UI of HMaster is not launching and also not > creating table in shell. What does it say in the hbase logs? > Following are

Re: Cannot post to hbase mailing list anymore

2012-01-12 Thread Stack
On Thu, Jan 12, 2012 at 3:13 AM, Christian Schäfer wrote: > For some weeks I can't post to the hbase user mailing list anymore. > Every message of mine is marked as spam. > > I suspected yahoo (my mail provider) at first, but with another yahoo mail > address I could post. > > Could anyone unblock

EC2 remote client woes

2012-01-12 Thread Peter Wolf
Still no love... Any suggestions? I'm on EC2, and I am trying to set up a pseudo-distributed HBase server on one machine and access it from another. Both machines are on EC2. I have already found the doc below, and I followed the instructions http://hbase.apache.org/book.html#client_depen

RE: Question about HBase for OLTP

2012-01-12 Thread Michael Segel
Mark, There are a lot of things that have gone into RDBMSs because they were the only tool available, although not the best tool. You can put unstructured data into some RDBMSs. It's a pain, but it can be done. The point I am trying to make, which has also been made by myself and others on numerou

Key Region is not moved to alternative RegionServer on RegionServer shutoff

2012-01-12 Thread Christian Schäfer
Hello, I am currently running some tests that measure puts-per-second performance. I created a pre-split table to spread the load like this: create 'myTable', 'Family1', {SPLITS => ['11.0.0', '22.0.0', '33.0.0', '44.0.0']}
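To illustrate how the four split points in the `create` command above partition the row-key space, here is a minimal Python sketch. It assumes that regions cover half-open ranges [start, end) and that row keys compare as plain bytes; the function names `region_index` and `region_bounds` are hypothetical and are not part of any HBase API.

```python
from bisect import bisect_right

# Split points taken from the shell command above, as raw bytes.
SPLITS = [b"11.0.0", b"22.0.0", b"33.0.0", b"44.0.0"]


def region_index(row_key, splits=SPLITS):
    """Return the 0-based index of the region that would hold row_key.

    Assumes each region covers [start, end), so a key equal to a
    split point belongs to the region that starts at that point.
    """
    return bisect_right(splits, row_key)


def region_bounds(index, splits=SPLITS):
    """Return (start_key, end_key) for a region; b"" means unbounded."""
    start = splits[index - 1] if index > 0 else b""
    end = splits[index] if index < len(splits) else b""
    return start, end


if __name__ == "__main__":
    for key in (b"05.1.2", b"11.0.0", b"30.9.9", b"99.0.0"):
        idx = region_index(key)
        print(key, "-> region", idx, region_bounds(idx))
```

Four split points yield five regions under these assumptions, so a put workload spreads evenly only if the row keys themselves are distributed evenly across those five ranges.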

Cannot post to hbase mailing list anymore

2012-01-12 Thread Christian Schäfer
Hi there, as I didn't know where else to put this question, I am posting it here. For some weeks I haven't been able to post to the HBase user mailing list. Every message of mine is marked as spam. I suspected Yahoo (my mail provider) at first, but with another Yahoo mail address I could post. Could anyone u

Re: HBase Export

2012-01-12 Thread Doug Meil
setCaching is set in TableInputFormat, and it relies on the following property set in the jobconf... job.getConfiguration().setInt("hbase.client.scanner.caching", batchSize); On 1/11/12 11:13 PM, "Jahangir Mohammed" wrote: >Hello, > >Have couple of questions around hbase "export" facil

Re: Question about HBase for OLTP

2012-01-12 Thread Dhruba Borthakur
Here are some of our stats of FB messages on HBase: 6B+ messages/day Traffic to HBase 75+ Billion R+W ops/day At peak: 1.5M ops/sec ~ 55% Read vs. 45% Write ops Avg write op inserts ~16 records across multiple column families and column family updates in the same record are atomic (inside the s

Re: Question about HBase for OLTP

2012-01-12 Thread fullysane
Hi Mike: The reason I am considering HBase for OLTP is that I need a column-based (key-value pair) OLTP DBMS that does not require me to predefine columns for a table and can add a new column to a table on the fly, as HBase does. Although this can be done in any RDBMS with a so-called skinny and tall table con

Re: Question about HBase for OLTP

2012-01-12 Thread fullysane
Hi Dhruba: Thank you for the information. Can you let me know the distribution of insert, update, delete, and query operations in your OLTP transactions? Are they mostly inserts and queries? How do you set up/configure the HBase table's column family (versioning, compression, ...) for your OLTP application? T

Re: Question about HBase for OLTP

2012-01-12 Thread MARK CALLAGHAN
On Mon, Jan 9, 2012 at 4:37 PM, Michael Segel wrote: > > Ok.. > > Look, here's the thing... HBase has no transactional support. > OLTP systems like PoS systems, Hotel Reservation Systems, Trading systems... > among others really need this. > > Again, I can't stress this point enough... DO NOT THI

Re: after adding table using add_table.rb, it is not visible (even after enabling)

2012-01-12 Thread Stanislav Barton
Stack writes: > > On Tue, Jan 10, 2012 at 4:38 AM, Stanislav Barton > wrote: > > I tried to import a table from hbase 0.90.3 to hbase 0.90.4 on a > > different cluster by copying the data between those two clusters. I > > uploaded the data into HDFS and called add_table.rb on that, that > > fin

Re: is there any way to copy data from one table to another while updating rowKey??

2012-01-12 Thread yuzhihong
Which version of HBase did you use? Can you post the stack trace for the exception? Thanks On Jan 12, 2012, at 3:37 AM, T Vinod Gupta wrote: > I am badly stuck and can't find a way out. i want to change my rowkey > schema while copying data from 1 table to another. but a map reduce job to >

Re: HBase in Hadoop-1.0.0

2012-01-12 Thread Harsh J
You are right in your understanding. Apache HBase is not bundled with the Apache Hadoop release, but the article rather meant that it now supports HBase better. Also, Apache Hadoop 1.0.0, being a rename of the 0.20.205 series, has append/sync APIs from branch-0.20-append, not the hflush/hsync,

is there any way to copy data from one table to another while updating rowKey??

2012-01-12 Thread T Vinod Gupta
I am badly stuck and can't find a way out. I want to change my rowkey schema while copying data from one table to another, but a MapReduce job to do this won't work because of large row sizes (responseTooLarge errors). So I am left with a two-step process of exporting to HDFS files and importing f

Re: bulk loading and RegionObservers

2012-01-12 Thread Stanislav Barton
Andrew Purtell writes: > > Yes this is correct. > > Coprocessors / RegionObservers and bulk loading have been developing separately in parallel. > > Now that bulk loading changes are settling down, I've been considering adding CP hooks into the bulk load > process, at the HRegion level, witho

hbase hregion was null or empty in -ROOT- problem while creating table

2012-01-12 Thread Pranay
Hi, I am having a problem with HBase. I have configured Hadoop and it is working fine. But after configuring HBase, the HMaster UI does not launch and I cannot create a table in the shell. Following are the details: I am using Hadoop 0.20.2 and HBase 0.90.4 on 2 nodes 1. umaster : namenode, sec namen

HBase in Hadoop-1.0.0

2012-01-12 Thread Graeme Seaton
Hi, I have seen in the media coverage about the release of Hadoop-1.0.0 that "HBase is officially part of the 1.0 release". My understanding from the release notes is that it includes changes (append/hsync/hflush, and security) to support HBase more effectively, but HBase is still separately bu