Re: Scan cause too many connection

2010-07-29 Thread baggio liu
In our cluster there are thousands of regions and each region has 5-8 store files, so the number of connections is very high. And if we close the store file after scanning, the connection number may bring down some region servers. (In my cluster, my region server went down because of a socket OOM, o

Re: Scan cause too many connection

2010-07-29 Thread Angus He
Hi Baggio, > 2. During scanning, it'll new StoreFile one by one. And in constructor of > StoreFile , HFile.Reader will be created. HFile.Reader act as DFSClient, > it'll keep a connection with DataNode when something should be read. Yes, there is an HFile.Reader opened in the StoreFile constructor.

Re: Scan cause too many connection

2010-07-29 Thread baggio liu
Hi Angus, 2. During scanning, it creates new StoreFiles one by one. In the constructor of StoreFile, an HFile.Reader will be created. HFile.Reader acts as a DFSClient; it'll keep a connection with the DataNode when something should be read. 1. After checking the code, I've seen that in the scanner close method, HFil

Re: Scan cause too many connection

2010-07-29 Thread Angus He
1. Try to close the scanner explicitly? 2. I do not think HBase will issue a new connection for each StoreFile for the scan operation. 2010/7/29 baggio liu : > Hi all, > We have 53 machines in our hbase cluster and run 6 clients to scan a > table. During scanning, we found when a region is scan

Re: stargate response in binary format

2010-07-29 Thread Todd Lipcon
Hey Avani, What happens if you also specify the "-i" flag to curl to dump the header output? -Todd On Thu, Jul 29, 2010 at 6:38 PM, Sharma, Avani wrote: > Hi, > > I want to get binary output from HBase via stargate. > > I used -H "Accept: application/octet-stream" format as described at > htt

Re: Table goes offline - temporary outage + Retries Exhausted (related?)

2010-07-29 Thread Stuart Smith
Hello Ryan, I'll get the logs together tomorrow. I had to spend the last couple hours getting a static ip block & bringing the cluster up & down. Thanks for the help & enthusiasm! Take care, -stu --- On Thu, 7/29/10, Ryan Rawson wrote: > From: Ryan Rawson > Subject: Re: Table goes off

stargate response in binary format

2010-07-29 Thread Sharma, Avani
Hi, I want to get binary output from HBase via stargate. I used -H "Accept: application/octet-stream" format as described at http://wiki.apache.org/hadoop/Hbase/Stargate However, my simple curl command comes back with 0 bytes. What does this mean -" If the encoding is binary, returns row, col

Re: Unable to contact Regionserver in spite of META entry...

2010-07-29 Thread Todd Lipcon
Hey Vidhya, What version are you on, again? If you're on 0.89, the "hbase hbck" utility might be of use here. Any logs in that server that pertain to the given region name? Any exceptions there? What if you run the shell with HBASE_ROOT_LOGGER=DEBUG,console set so that you see the debug output as

Re: GC [ParNew...] took 299 secs causing region server to die

2010-07-29 Thread Todd Lipcon
Agree with what JD said - also check for swapping on the machine. GC can take forever if any of the Java heap gets swapped out, since GC by its nature has to traverse most of the pages in the heap. -Todd On Thu, Jul 29, 2010 at 3:41 PM, Jean-Daniel Cryans wrote: > Well it says; Times: user=0.17

Re: [stargate] status

2010-07-29 Thread Andrew Purtell
I'll update the wiki to remove the bit about alpha status. Stargate surely has bugs somewhere but is known to operate stably under load in several applications. - Andy > From: sasha.maksimenko > Subject: [stargate] status > To: user@hbase.apache.org > Date: Wednesday, July 28, 2010, 2:42 P

Re: Table goes offline - temporary outage + Retries Exhausted (related?)

2010-07-29 Thread Ryan Rawson
There is some root cause behind the 'failed to flush' message... I'd like to get to the root of that. Unfortunately it means lots of log groveling. If you want to post logs, try pastebin.com instead of trying to attach files. Dig some dirt up and let's check it out :-) -ryan On Thu, Jul 29, 201

Re: Table goes offline - temporary outage + Retries Exhausted (related?)

2010-07-29 Thread Stuart Smith
Hello Ryan, Thanks! Just to verify - my xceiver count is 4K, my ulimit reports 64000, my datanode handler count is 15, my socket write timeout is zero, my swappiness is 1 on datanodes and 0 on the namenode, and my memory has been tweaked according to the machines - hadoop and hbase both get

Re: GC [ParNew...] took 299 secs causing region server to die

2010-07-29 Thread Jean-Daniel Cryans
Well, it says: Times: user=0.17 sys=0.04, real=299.23 secs. So why did it take 0.04 secs of system time but 300 secs of real time? That's insane. Either the region server process was completely starved of CPU cycles (are you on EC2 or any virtualized service like that?), or the computer was put to sleep.

Re: Table goes offline - temporary outage + Retries Exhausted (related?)

2010-07-29 Thread Ryan Rawson
Hi, There is a lot going on in this email. The logs might look promising, but they are standard split messages, not really indicative of anything going wrong. It sounds like you might be coming across some of the standard foils that are well documented in here: http://hbase.apache.org/docs/r0.20.5

Re: Table goes offline - temporary outage + Retries Exhausted (related?)

2010-07-29 Thread Stuart Smith
Hello all, It looks like I had an ensemble of unrelated errors. To follow up with the table going offline error: I noticed today that the GUI will say: "Enabled False", and the shell will say: hbase(main):004:0> describe 'filestore' DESCRIPTION

Re: Unable to contact Regionserver in spite of META entry...

2010-07-29 Thread Ted Yu
Have you verified that Some server is indeed the same as 63.250.207.87? On Thu, Jul 29, 2010 at 12:31 PM, Vidhyashankar Venkataraman < vidhy...@yahoo-inc.com> wrote: > I have an MR job that sends streams of updates (puts and deletes) to an > existing db and all the tasks are crashing complaining

GC [ParNew...] took 299 secs causing region server to die

2010-07-29 Thread Steve Kuo
I kept running into stop-the-world GC during batch import of data into hbase. The configuration of a node in the 8-node cluster is as follows. * 4-core * 64-bit JVM * 8 GB of memory * CDH2 for hadoop and 0.20.5 for hbase * TT: 128 MB * DN: 128 MB * 2 Mappers at 512 MB each * 2 Reducers at 512

Unable to contact Regionserver in spite of META entry...

2010-07-29 Thread Vidhyashankar Venkataraman
I have an MR job that sends streams of updates (puts and deletes) to an existing db, and all the tasks are crashing, complaining of exceptions similar to the following: Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server S

Re: Table goes offline - temporary outage + Retries Exhausted (related?)

2010-07-29 Thread Stuart Smith
To follow up on the retry error (still have no idea about the table going offline): It was a coding error, sorta kinda. I was doing large batches with AutoFlush disabled, and flushing at the end, figuring I could gain performance, and just reprocess bad batches. Bad call. It appears I was consi

Table goes offline - temporary outage + Retries Exhausted (related?)

2010-07-29 Thread Stuart Smith
Hello, I have two problems that may or may not be related. One is trying to figure out a self-correcting outage I had last evening. I noticed issues starting with clients reporting: RetriesExhaustedException: Trying to contact region server Some server... I didn't see much going on in the re

[ANNOUNCE] Next HUG meetup: Noida/NCR- India - 31st July 2010 : Reminder

2010-07-29 Thread Sanjay Sharma
Hi All, We are planning to hold the next Hadoop India User Group meet up on 31st July 2010 in Noida, India. The registration and event details are available at - http://hugindia-absolutezeroforum.eventbrite.com/ We currently have the following talks lined up- - HIHO

Re: sometimes more than 1 value stored, even though VERSIONS is 1

2010-07-29 Thread Jean-Daniel Cryans
Inline. J-D On Thu, Jul 29, 2010 at 8:54 AM, Ferdy Galema wrote: > Using Hbase 0.20.5 with Hadoop CDH2 0.20.1+169.89 I noticed something very > strange. > > When overwriting a certain column in a column family with 1 VERSIONS, and > removing that value later (for example after several minutes) t

sometimes more than 1 value stored, even though VERSIONS is 1

2010-07-29 Thread Ferdy Galema
Using Hbase 0.20.5 with Hadoop CDH2 0.20.1+169.89 I noticed something very strange. When overwriting a certain column in a column family with 1 VERSIONS, and removing that value later (for example after several minutes) the older value still shows when listing all the KeyValues of the row. Al