Re: snappy error during completebulkload

2012-01-10 Thread Oliver Meyn (GBIF)
Thanks Todd, that makes more sense now. I gave up on trying to build the native libraries on OS X (not officially supported, presumably because it's such a PITA) and instead ran from a CentOS machine, which worked flawlessly out of the box. Cheers, Oliver On 2012-01-09, at 7:21 PM, Todd

Hadoop corrupt blocks after killing name node - during adding Hbase data

2012-01-10 Thread V_sriram
I am using hadoop 0.20.append and hbase 0.90.0. I uploaded some data into HBase and then killed the HMaster and NameNode for evaluation purposes. After this I added some more data to HBase and could see it in the hbase shell. Now when I start the NameNode, I am facing problems. The log

size and column count recommendations for rows in hbase

2012-01-10 Thread T Vinod Gupta
i was scanning through different questions that people asked in this mailing list regarding choosing the right schema so that map reduce jobs can be run appropriately and hot regions avoided due to sequential accesses. somewhere, i got the impression that it is ok for a row to have millions of

Re: Capturing RegionServerMetrics during inserts

2012-01-10 Thread Mikael Sitruk
Hi Christian, in short, this is because you cannot assume that your loop will reach that check exactly on a whole second. Many other internal processes run in the JVM, and they will make your if condition irrelevant. Generally, a busy-wait loop is not the best way to run a periodic task like the one you want. You
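The advice above (sleep between samples instead of spinning on the clock and testing for a whole second) can be sketched as follows. This is a minimal illustration in Python rather than the Java the thread is about; run_periodically and its clock and sleep parameters are hypothetical names introduced only for this sketch and are not part of HBase or of the original poster's code.

```python
import time


def run_periodically(task, interval_s, iterations,
                     clock=time.monotonic, sleep=time.sleep):
    """Call task() every interval_s seconds without spinning on the clock.

    Hypothetical helper: each deadline is computed from the previous
    deadline, so small scheduling delays do not accumulate into drift.
    """
    next_deadline = clock()
    for _ in range(iterations):
        task()
        next_deadline += interval_s
        remaining = next_deadline - clock()
        if remaining > 0:
            # Block until the next deadline instead of polling the time.
            sleep(remaining)


# Collect three timestamps roughly 50 ms apart.
samples = []
run_periodically(lambda: samples.append(time.monotonic()),
                 interval_s=0.05, iterations=3)
```

Because the loop never compares the current time against an exact value, a sample is taken on every interval even if the thread wakes up a little late.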

Re: Hadoop corrupt blocks after killing name node - during adding Hbase data

2012-01-10 Thread Yifeng Jiang
The NameNode probably stays in safe mode because it cannot reach the reported-block ratio threshold, since some files are corrupted. You can first use hadoop dfsadmin -safemode leave to leave safe mode, and then hadoop fsck / with -move or -delete to move or delete the inconsistent files. -Yifeng On Jan 10,

HBase - is it time to add another node?

2012-01-10 Thread Ronen Itkin
Hey all, I have a small cluster of HBase, Hadoop HDFS, and ZooKeeper deployed on the same nodes (kind of a common best practice for small clusters). Everything is quite straightforward and works fine. I am wondering how I can efficiently predict the right time to add another node from

Re: HBase - is it time to add another node?

2012-01-10 Thread Doug Meil
Hi there- You probably want to see the Hbase Book/RefGuide and the Performance and Operational Mgt chapters. Capture and track metrics in tools like OpenTSDB, for example. http://hbase.apache.org/book.html On 1/10/12 8:34 AM, Ronen Itkin ro...@taykey.com wrote: Hey all, I have a small

Re: NPE while obtaining row lock

2012-01-10 Thread Yves Langisch
Still happens with HBase 0.90.5/Hadoop 1.0.0. But I think I have some more insights on this topic. Following an up to date stack trace: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:986) at

Re: NPE while obtaining row lock

2012-01-10 Thread yuzhihong
Thanks for the analysis. Do you mind opening a Jira ? On Jan 10, 2012, at 7:51 AM, Yves Langisch y...@langisch.ch wrote: Still happens with HBase 0.90.5/Hadoop 1.0.0. But I think I have some more insights on this topic. Following an up to date stack trace:

after adding table using add_table.rb, it is not visible (even after enabling)

2012-01-10 Thread Stanislav Barton
Hello, I tried to import a table from hbase 0.90.3 to hbase 0.90.4 on a different cluster by copying the data between those two clusters. I uploaded the data into HDFS and called add_table.rb on that, that finished ok. After that I went to the hbase shell and ran enable new_table, which also finished

Problem Stopping 0.90.5 Standalone

2012-01-10 Thread Peter Wolf
Hello all, I am having trouble starting and stopping Standalone HBase. I successfully start Standalone, or at least there is no error message. However, when I try to use the shell, I get this: hbase(main):001:0> list TABLE ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException:

Multiple Clients and Standalone HBase

2012-01-10 Thread Peter Wolf
Another Standalone question- Can I have multiple clients, running on multiple machines access a Standalone HBase? Is there a problem with having 10's of clients hitting a Standalone system? What is the advantage of running Pseudo-Distributed? Note that this is not a production system. I

Re: Multiple Clients and Standalone HBase

2012-01-10 Thread Jean-Daniel Cryans
On Tue, Jan 10, 2012 at 9:42 AM, Peter Wolf opus...@gmail.com wrote: Another Standalone question- Can I have multiple clients, running on multiple machines access a Standalone HBase? Yes. Is there a problem with having 10's of clients hitting a Standalone system? Now that's more than one

Re: Problem Stopping 0.90.5 Standalone

2012-01-10 Thread Stack
On Tue, Jan 10, 2012 at 9:03 AM, Peter Wolf opus...@gmail.com wrote: Hello all, I am having trouble starting and stopping Standalone HBase. I successfully start Standalone, or at least there is no error message Anything in the hbase logs -- see under the logs/ dir? Have you changed config?

Re: Multiple Clients and Standalone HBase

2012-01-10 Thread Mark Kerzner
I just did it on EC2 and will publish the notes soon. Would you need them? Mark On Tue, Jan 10, 2012 at 11:42 AM, Peter Wolf opus...@gmail.com wrote: Another Standalone question- Can I have multiple clients, running on multiple machines access a Standalone HBase? Is there a problem with

Re: after adding table using add_table.rb, it is not visible (even after enabling)

2012-01-10 Thread Stack
On Tue, Jan 10, 2012 at 4:38 AM, Stanislav Barton stanislav.bar...@internetmemory.net wrote: I tried to import a table from hbase 0.90.3 to hbase 0.90.4 on a different cluster by copying the data between those two clusters. I uploaded the data into HDFS and called add_table.rb on that, that

Re: NPE while obtaining row lock

2012-01-10 Thread Stack
On Tue, Jan 10, 2012 at 7:51 AM, Yves Langisch y...@langisch.ch wrote: IMHO this case should be handled somehow and must not lead to a NPE. Agree. Thanks for filing issue. St.Ack

Re: size and column count recommendations for rows in hbase

2012-01-10 Thread Stack
On Tue, Jan 10, 2012 at 3:17 AM, T Vinod Gupta tvi...@readypulse.com wrote: i was scanning through different questions that people asked in this mailing list regarding choosing the right schema so that map reduce jobs can be run appropriately and hot regions avoided due to sequential accesses.

Re: Missing region data.

2012-01-10 Thread Stack
On Thu, Dec 22, 2011 at 1:34 PM, James Estes james.es...@gmail.com wrote: We have a 6 node 0.90.3-cdh3u1 cluster.  We have 8092 regions.  I realize we have too many regions and too few nodes…we're addressing that. Good. We currently have an issue where we seem to have lost region data.  

Re: Missing region data.

2012-01-10 Thread Stack
On Mon, Jan 9, 2012 at 1:57 PM, James Estes james.es...@gmail.com wrote: Should we file a ticket for this issue?  FWIW we got this fixed (not sure if we actually lost any data though). We had to bounce the region server (non-gracefully). The region server seemed to have some stale file handles

Re: Multiple Clients and Standalone HBase

2012-01-10 Thread Peter Wolf
Awesome. Many thanks, JD. On 1/10/12 12:57 PM, Jean-Daniel Cryans wrote: On Tue, Jan 10, 2012 at 9:42 AM, Peter Wolf opus...@gmail.com wrote: Another Standalone question- Can I have multiple clients, running on multiple machines access a Standalone HBase? Yes. Is there a problem with having

Re: Multiple Clients and Standalone HBase

2012-01-10 Thread Peter Wolf
Yes Please! :-D On 1/10/12 1:02 PM, Mark Kerzner wrote: I just did it on EC2 and will publish the notes soon. Would you need them? Mark On Tue, Jan 10, 2012 at 11:42 AM, Peter Wolf opus...@gmail.com wrote: Another Standalone question- Can I have multiple clients, running on multiple

Re: Multiple Clients and Standalone HBase

2012-01-10 Thread Peter Wolf
Excellent. Thanks P On 1/10/12 1:35 PM, Michael Segel wrote: We've done this and have a couple of projects in production. Advantage of running Pseudo-Distributed? None unless you don't care about performance or have a lot of data. (Meaning you're trying to learn only.) It's a bit easier to

Re: Problem Stopping 0.90.5 Standalone

2012-01-10 Thread Tom
Hi Peter, I just tried 0.90.5 standalone on linux; in my case there was a problem with my hostname resolving to 127.0.1.1 (not 127.0.0.1, which is what hbase expects in standalone). If you use linux, have a look into your /etc/hosts file to see if this is the issue and fix it as needed.
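The /etc/hosts check described above can be sketched as a small script. This is only an illustration in Python: loopback_aliases is a hypothetical helper, and "devbox" is a made-up hostname standing in for the machine's real one. On a real system the text would come from reading /etc/hosts.

```python
def loopback_aliases(hosts_text, hostname):
    """Return the 127.x.x.x addresses that hosts-file text maps to hostname."""
    found = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and skip blank or comment-only lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if hostname in names and address.startswith("127."):
            found.append(address)
    return found


# Hypothetical hosts file in which the machine name points at 127.0.1.1.
sample = """\
127.0.0.1   localhost
127.0.1.1   devbox   # made-up hostname for illustration
"""

# Any loopback mapping other than 127.0.0.1 is worth a closer look.
suspicious = [a for a in loopback_aliases(sample, "devbox") if a != "127.0.0.1"]
print(suspicious)
```

A non-empty result points at the kind of entry reported in this message; how to correct it depends on the distribution and is not covered by this sketch.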

Re: Problem Stopping 0.90.5 Standalone

2012-01-10 Thread Stack
On Tue, Jan 10, 2012 at 10:55 AM, Tom fivemile...@gmail.com wrote: BTW, a little comment in the HBase book would have saved me quite a bit of time... We are on it(Doug?) St.Ack

Re: Problem Stopping 0.90.5 Standalone (2)

2012-01-10 Thread Peter Wolf
Perhaps not related, but I followed the Run the '${HBASE_HOME}/bin/hbase migrate' script suggestion and... Hemiola:logs peter$ ../bin/hbase migrate Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/util/Migrate Caused by: java.lang.ClassNotFoundException:

Re: size and column count recommendations for rows in hbase

2012-01-10 Thread kisalay
Would it make sense to convert your fat table into a tall table by keeping the source of the metric as part of the row key (maybe as the suffix?). For accessing all the metrics associated with a particular user, metric and time, you will be resorting to a prefix match on your key. Also all the keys
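The tall-table suggestion above can be illustrated with a small in-memory model. This sketch does not use any HBase client API: it only imitates lexicographically sorted row keys with a Python list. The key layout (user, metric, fixed-width timestamp, then source as suffix), the separator byte, and all function names are assumptions made for illustration.

```python
import bisect

SEP = b"\x00"  # assumed separator; must not occur in the user or metric name


def prefix_for(user, metric, timestamp):
    """Key prefix shared by every source of one (user, metric, time)."""
    # A fixed-width big-endian timestamp keeps byte order equal to time order.
    return (user.encode() + SEP + metric.encode() + SEP
            + timestamp.to_bytes(8, "big"))


def make_row_key(user, metric, timestamp, source):
    """Tall-table row key: the metric's source is appended as the suffix."""
    return prefix_for(user, metric, timestamp) + source.encode()


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix from a sorted key list."""
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # sorted order: no later key can match
        matches.append(key)
    return matches


rows = sorted([
    make_row_key("alice", "clicks", 1000, "bob"),
    make_row_key("alice", "clicks", 1000, "carol"),
    make_row_key("alice", "clicks", 2000, "bob"),
    make_row_key("dave", "clicks", 1000, "bob"),
])
matches = prefix_scan(rows, prefix_for("alice", "clicks", 1000))
```

Because keys sharing a prefix sort next to each other, one row per source can still be read back together with a single contiguous scan.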

Re: size and column count recommendations for rows in hbase

2012-01-10 Thread T Vinod Gupta
Thanks St.Ack and Kisalay. In my case, I have primary users and people who interact with my primary users. Let's call them secondary users. Kisalay, you are right and I already have the primary user, metric name and timestamp in my row key. Did you mean having the secondary user also part of the

Re: keyvalue size is too large, even though i am inserting 2.5MB

2012-01-10 Thread Evan
Did you ever figure out how to insert an image into Hbase? Vamshi Krishna vamshi2105@... writes:

Re: Problem Stopping 0.90.5 Standalone

2012-01-10 Thread Doug Meil
Yep. I got it. On 1/10/12 2:07 PM, Stack st...@duboce.net wrote: On Tue, Jan 10, 2012 at 10:55 AM, Tom fivemile...@gmail.com wrote: BTW, a little comment in the HBase book would have saved me quite a bit of time... We are on it(Doug?) St.Ack

Re: Problem Stopping 0.90.5 Standalone

2012-01-10 Thread Peter Wolf
Thanks Tom, I am running on a Mac (Lion). My /etc/hosts is below. Would the last 2 lines confuse Standalone HBase? P ## # Host Database # # localhost is used to configure the loopback interface # when the system is booting. Do not change this entry. ## 127.0.0.1 localhost

Re: Problem Stopping 0.90.5 Standalone -Ah ha!

2012-01-10 Thread Peter Wolf
Fixed it. I manually killed the process, removed the HBase files *AND* the directory, and restarted. I am guessing that the HBase directory in Standalone contains some sort of hidden file or lock, and it got stuck. P On 1/10/12 2:43 PM, Doug Meil wrote: Yep. I got it. On 1/10/12

Re: Capturing RegionServerMetrics during inserts

2012-01-10 Thread Christian Schäfer
Hi Mikael, shame on me for doing that. You're totally right that it is only by chance if the loop hits the round second. Now I use Thread.sleep ... the benchmark has nearly doubled. Thanks for the simple hint. Christian hi In short this is because you cannot assume that your loop will get exactly there

Re: size and column count recommendations for rows in hbase

2012-01-10 Thread kisalay
Yes, Vinod, you got it right. I was suggesting to have the secondary users also part of the row key as the suffix. On Wed, Jan 11, 2012 at 1:02 AM, T Vinod Gupta tvi...@readypulse.com wrote: Thanks St.Ack and Kisalay. In my case, I have primary users and people who interact with my primary