Re: HBaseAdmin needs a close method

2012-04-18 Thread Eason Lee
I don't think this issue can resolve the problem. The ZKWatcher is removed, but the configuration and HConnectionImplementation objects are still in HConnectionManager, so this may still cause a memory leak. Calling HConnectionManager.deleteConnection may resolve the HBASE-5073 problem, though. I can see i

Re: some region Could not seek StoreFileScanner[HFileScanner for reader

2012-04-18 Thread 永江梁
Hi: recently I used OffMetaRepair to rebuild the meta table, but some regions report "Could not seek StoreFileScanner[HFileScanner for reader". When I then prefix-scan rows of the table, the scan hangs on some rows until the scanner timeout. I tried to delete the bad row with Delete(), but it did not work. How can I r

Re: HBaseAdmin needs a close method

2012-04-18 Thread Harsh J
Nope, AFAIK it's not in CDH3u3, but CDH3u4 will have it. On Thu, Apr 19, 2012 at 10:57 AM, Ramkrishna.S.Vasudevan wrote: > Hi Lee > > Is HBASE-5073 resolved in that release? > > Regards > Ram > >> -Original Message- >> From: Eason Lee [mailto:softse@gmail.com] >> Sent: Thursday, April

RE: HBaseAdmin needs a close method

2012-04-18 Thread Ramkrishna.S.Vasudevan
Hi Lee Is HBASE-5073 resolved in that release? Regards Ram > -Original Message- > From: Eason Lee [mailto:softse@gmail.com] > Sent: Thursday, April 19, 2012 10:40 AM > To: user@hbase.apache.org > Subject: Re: HBaseAdmin needs a close method > > I am using Cloudera's cdh3u3 > > Hi L

Re: HBaseAdmin needs a close method

2012-04-18 Thread Eason Lee
I am using Cloudera's cdh3u3 Hi Lee Which version of HBase are you using? Regards Ram -Original Message- From: Eason Lee [mailto:softse@gmail.com] Sent: Thursday, April 19, 2012 9:36 AM To: user@hbase.apache.org Subject: HBaseAdmin needs a close method Recently, my app meets a p

RE: HBaseAdmin needs a close method

2012-04-18 Thread Ramkrishna.S.Vasudevan
Hi Lee Which version of HBase are you using? Regards Ram > -Original Message- > From: Eason Lee [mailto:softse@gmail.com] > Sent: Thursday, April 19, 2012 9:36 AM > To: user@hbase.apache.org > Subject: HBaseAdmin needs a close method > > Recently, my app meets a problem list as fol

HBaseAdmin needs a close method

2012-04-18 Thread Eason Lee
Recently, my app ran into the following problem: Can't construct instance of class org/apache/hadoop/hbase/client/HBaseAdmin Exception in thread "Thread-2" java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java

More tables, or add a prefix to each row key?

2012-04-18 Thread Tom Brown
All, I'm writing an OLAP cube database and I can implement the storage in one of two schemas, and I don't know if there are any unexpected performance trade-offs I'm not aware of. Each row represents a unique cell in the cube, with about 5 columns for each row. The row key format is a set of attrib

Re: Basic -Hbase table question

2012-04-18 Thread Doug Meil
Hi there- Because your topic is webcrawling, you might want to read the BigTable paper because the example in that paper is about webcrawling. You can find that, and other info, in the RefGuide... http://hbase.apache.org/book.html#other.info.papers On 4/18/12 2:08 PM, "petri koski" wrote

Basic -Hbase table question

2012-04-18 Thread petri koski
Hello, I am quite new to HBase, and here comes my question: I have a table. What I do with Hadoop is download webpages in the map phase, extract the URLs found, and save them in the reduce phase. I read from one table, and I save them (put) to the same table to avoid duplicates etc. I will get millions of

Re: regions stuck in transition

2012-04-18 Thread Stack
On Wed, Apr 18, 2012 at 7:38 AM, Bryan Beaudreault wrote: > Yes, I can get data from the region through the shell.  The problem is the > balancer cannot run when a region is in transition, so it is never running. >  Our region servers are becoming increasingly unbalanced. > Yes. This is by desig

Re: regions stuck in transition

2012-04-18 Thread Stack
On Mon, Apr 16, 2012 at 8:21 AM, Bryan Beaudreault wrote: > We've recently had a problem where regions will get stuck in transition for > a long period of time.  In fact, they don't ever appear to get > out-of-transition unless we take manual action.  Last time this happened I > restarted the mast

[Job] Data Engineer at Sematext

2012-04-18 Thread Otis Gospodnetic
Hello, If you’ve always wanted to work with Hadoop, HBase, Flume, and friends and build massively scalable, high-throughput distributed systems, we have a Data Engineer position that is all about that!  If you are interested, please send your resume to j...@sematext.com .  Details below. Otis

Re: I cannot find the hbase configuration templates

2012-04-18 Thread Jean-Daniel Cryans
Yeah, I'm not sure why that page is not in the reference guide, but it should be moved and fixed since that file was removed a long time ago. I created https://issues.apache.org/jira/browse/HBASE-5822 J-D On Tue, Apr 17, 2012 at 11:09 PM, wrote: > I asked the same question on Stackoverflow:

Re: Applying filters to ResultScanner

2012-04-18 Thread Alok Kumar
Hi, I think you need to recreate a Filter, attach it to a Scan, and make a call to HBase again in order to get a new set of results or a new ResultScanner. You are right, the ResultScanner object needs to be released quickly when you are done with it at the middle tier. Below is the text from the HBase book... *10.8.4

Re: Storing extremely large size file

2012-04-18 Thread Doug Meil
This would be a good entry for the new "Use Cases" chapter in the Schema Design section. On 4/18/12 1:02 AM, "lars hofhansl" wrote: >I disagree. This comes up frequently and some basic guidelines should be >documented in the Reference Guide. >If it is indeed not difficult then the section

I cannot find the hbase configuration templates

2012-04-18 Thread niko . schwarz
I asked the same question on Stackoverflow: http://stackoverflow.com/questions/10175313/where-is-the-pseudo-distributed-template-for-hbase The [HBase documentation][1] asks me to Copy the pseudo-distributed suggested configuration file cp conf/hbase-site.xml{.pseudo-distributed.template,}

Re: regions stuck in transition

2012-04-18 Thread Bryan Beaudreault
Yes, I can get data from the region through the shell. The problem is the balancer cannot run when a region is in transition, so it is never running. Our region servers are becoming increasingly unbalanced. I don't want to restart a RegionServer, because it would cause a blip in requests for any

Re: Performance issues of prepending a table

2012-04-18 Thread Ian Varley
I would guess that this approach would be susceptible to the same kind of "hot spotting" as inserting sequential keys; if you're prepending globally (i.e. there's one global "first" row), then all activity will be taking place on the same region server, so you wouldn't be taking advantage of the
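The hot-spotting concern in this reply can be illustrated with a small simulation. This is only a sketch under stated assumptions: the region split points, the helper names (`region_for`, `prepend_keys`, `writes_per_region`) and the key values are all hypothetical, and no HBase API is involved. It counts how many writes land in each region when every new key sorts before the current first row, compared with keys spread over the whole key space.

```python
import bisect
from collections import Counter

# Hypothetical split points: region i holds keys >= REGION_START_KEYS[i]
# and < REGION_START_KEYS[i + 1]; the last region is open-ended.
REGION_START_KEYS = [0, 1000, 2000, 3000]

def region_for(key):
    """Return the index of the region whose key range contains `key`."""
    return max(bisect.bisect_right(REGION_START_KEYS, key) - 1, 0)

def prepend_keys(first_key, count):
    """Generate `count` keys that each sort before the current first row."""
    return [first_key - i for i in range(1, count + 1)]

def writes_per_region(keys):
    """Count how many of the given keys fall into each region."""
    return Counter(region_for(k) for k in keys)

# 100 prepends in front of key 500: every write hits region 0.
prepended = writes_per_region(prepend_keys(first_key=500, count=100))

# 100 keys spread evenly over the key space: writes are shared by all regions.
spread = writes_per_region(range(0, 4000, 40))

print("prepended:", dict(prepended))
print("spread:   ", dict(spread))
```

In the simulation, all prepended writes go to the single region that owns the lowest keys, which is the same pattern as inserting sequential keys at the other end of the table.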

Performance issues of prepending a table

2012-04-18 Thread de Souza Medeiros Andre
Hi all, For some specific reason, I have an HBase table that needs to be prepended to frequently. The row keys in this table are long integers (converted to bytes of course). "Prepend" is an operation that does the following: 1. Scans the table just for the purpose of getting the row key X of the firs

Re: Storing extremely large size file

2012-04-18 Thread Michel Segel
Look, I don't want to be *that* guy, but just my $0.02 ... I don't disagree that this topic comes up all the time. But you have a couple of issues. What do you consider to be large? 1kb? 10? 100? >1MB? (and then there's that lie that size doesn't matter... But let's not go there...

Re: compression

2012-04-18 Thread Harsh J
Hey, Data in HBase is compressed upon compaction/flushes (i.e. upon creation of the storefiles). Hence the compression is also done over blocks of data (akin to SequenceFiles) and is efficient. The memstore isn't kept compressed nor is the WAL. RPCs in Apache HBase aren't compressed yet, but http
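To see why compressing over blocks of data is efficient compared with compressing each value on its own, here is a rough sketch. It uses Python's standard `zlib` module as a stand-in and is not one of the codecs HBase ships with; the cell layout, the block size of 100 cells and the variable names are assumptions made for illustration only.

```python
import zlib

# Hypothetical cells: many small, similar key/value records, loosely
# resembling what ends up next to each other in a store file.
cells = [f"row{i:06d}/cf:qual/value-{i % 10}".encode() for i in range(1000)]

# Option 1: compress every cell individually.
per_cell = sum(len(zlib.compress(c)) for c in cells)

# Option 2: concatenate cells into blocks and compress each block.
BLOCK_CELLS = 100  # assumed block size, counted in cells
blocks = [b"".join(cells[i:i + BLOCK_CELLS])
          for i in range(0, len(cells), BLOCK_CELLS)]
per_block = sum(len(zlib.compress(b)) for b in blocks)

raw = sum(len(c) for c in cells)
print("raw bytes:          ", raw)
print("compressed per cell:", per_cell)
print("compressed per block:", per_block)
```

Because neighbouring cells share long prefixes, the compressor finds far more repetition inside a block than inside a single small cell, and the fixed per-stream overhead is paid once per block instead of once per cell.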

Re: Storing extremely large size file

2012-04-18 Thread Andrew Purtell
I've stored up to 100MB in cells, but only at the end of a long tail. This is a typical "webtable" type application. You can get away with it, but only if they are outliers, and you need to be careful with scan load with those things in the store like icebergs.  For many (most?) use cases, I'd sa

Re: Min/Max Column Value and Row Count

2012-04-18 Thread Mikael Sitruk
You can use endpoints (see https://blogs.apache.org/hbase/entry/coprocessor_introduction). If you look at the JIRA code sample (HBASE-1512), there is a min/max implementation for a certain column, family and range. It should be a good starting point. Mikael.S On Wed, Apr 18, 2012 at 11:08 AM, Bing
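The endpoint idea suggested here boils down to computing a partial result next to each region's data and merging the partials on the client. The following is a plain-Python sketch of that pattern, not the HBase coprocessor API and not the HBASE-1512 code: the row layout, the column name `cf:price` and the functions `aggregate_region` and `merge` are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

Row = Dict[str, int]  # hypothetical row: column name -> integer value
Partial = Tuple[Optional[int], Optional[int], int]  # (min, max, row count)

def aggregate_region(rows: Iterable[Row], column: str) -> Partial:
    """One pass over a single region's rows; stands in for server-side work."""
    lo = hi = None
    count = 0
    for row in rows:
        if column not in row:
            continue  # rows without the column are not counted
        value = row[column]
        lo = value if lo is None else min(lo, value)
        hi = value if hi is None else max(hi, value)
        count += 1
    return lo, hi, count

def merge(partials: Iterable[Partial]) -> Partial:
    """Client-side step: combine the per-region partial results."""
    lo = hi = None
    total = 0
    for p_lo, p_hi, p_count in partials:
        if p_count == 0:
            continue  # empty region contributes nothing
        lo = p_lo if lo is None else min(lo, p_lo)
        hi = p_hi if hi is None else max(hi, p_hi)
        total += p_count
    return lo, hi, total

# Three hypothetical regions, one of them empty.
regions = [
    [{"cf:price": 12}, {"cf:price": 7}],
    [{"cf:price": 30}, {"cf:other": 1}],
    [],
]
result = merge(aggregate_region(r, "cf:price") for r in regions)
print(result)  # (7, 30, 3)
```

Only three small values per region cross the network in this scheme, instead of every row being shipped to the client for filtering and counting.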

Min/Max Column Value and Row Count

2012-04-18 Thread Bing Li
Dear all, I noticed that there is no way to get the min/max of a specific column value using the currently available filters. Right? Is there a more convenient approach to get the row count of a family? I plan to use FamilyFilter to do that. Thanks so much! Best regards, Bing

Re: Storing extremely large size file

2012-04-18 Thread Andrey Stepachev
I think that HBase should have something like http://www.mongodb.org/display/DOCS/GridFS+Specification. On Wed, Apr 18, 2012 at 10:55 AM, kim young ill wrote: > +1 documentation please > > On Wed, Apr 18, 2012 at 7:21 AM, anil gupta wrote: > >> +1 for documentation. It will help a lot of people