Re: RS unresponsive after series of deletes

2012-06-22 Thread lars hofhansl
Good hint, Ted. By calling Delete.deleteColumn(family, qual, ts) instead of deleteColumn without a timestamp, the time to delete row keys is reduced by 95%. I am going to experiment with limited batches of Deletes, too. Thanks, everyone, for the help on this one.
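The saving reported above is consistent with Ted Yu's observation elsewhere in this thread that prepareDeleteTimestamps() performs one get per column qualifier when no timestamp is supplied. The following is a minimal toy model of that difference, not HBase code: ToyRow, deleteLatest, and deleteAt are hypothetical names invented for illustration, and the "read" counter only stands in for the server-side get.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Toy in-memory model of one row (NOT HBase code; all names are hypothetical). */
class ToyRow {
    // qualifier -> timestamps of the stored versions
    private final Map<String, TreeSet<Long>> versions = new HashMap<>();
    // delete markers written so far, rendered as "qualifier@timestamp"
    final List<String> deleteMarkers = new ArrayList<>();
    // number of lookups needed to resolve a timestamp
    int reads = 0;

    void put(String qual, long ts) {
        versions.computeIfAbsent(qual, q -> new TreeSet<>()).add(ts);
    }

    /** No timestamp given: the latest version must be looked up first. */
    void deleteLatest(String qual) {
        reads++;
        TreeSet<Long> stamps = versions.get(qual);
        if (stamps != null && !stamps.isEmpty()) {
            deleteMarkers.add(qual + "@" + stamps.last());
        }
    }

    /** Explicit timestamp: the marker is written without any lookup. */
    void deleteAt(String qual, long ts) {
        deleteMarkers.add(qual + "@" + ts);
    }

    public static void main(String[] args) {
        ToyRow withoutTs = new ToyRow();
        ToyRow withTs = new ToyRow();
        for (int i = 0; i < 1000; i++) {
            withoutTs.put("q" + i, 1L);
            withTs.put("q" + i, 1L);
        }
        for (int i = 0; i < 1000; i++) {
            withoutTs.deleteLatest("q" + i);
            withTs.deleteAt("q" + i, 1L);
        }
        System.out.println("lookups without timestamp: " + withoutTs.reads);
        System.out.println("lookups with timestamp:    " + withTs.reads);
    }
}
```

In the model, both rows end up with the same delete markers; only the number of lookups differs. The caller has to know the cell's timestamp already for the explicit form to be usable.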

RE: RS unresponsive after series of deletes

2012-06-21 Thread Ted Tuttle
> Ted T:
> Can you log a JIRA summarizing the issue?

https://issues.apache.org/jira/browse/HBASE-6254

Re: RS unresponsive after series of deletes

2012-06-21 Thread Ted Yu
Ted Yu [mailto:yuzhih...@gmail.com] > *Sent:* Thursday, June 21, 2012 10:32 AM > *To:* user@hbase.apache.org > *Cc:* Development > *Subject:* Re: RS unresponsive after series of deletes > > Cheers > > On Thu, Jun 21, 2012 at 7:02 AM, Ted Tu

RE: RS unresponsive after series of deletes

2012-06-21 Thread Ted Tuttle
ted batches of Deletes, too. Thanks everyone for help on this one. -Original Message- From: Ted Yu [mailto:yuzhih...@gmail.com] Sent: Wednesday, June 20, 2012 10:13 PM To: user@hbase.apache.org Subject: Re: RS unresponsive after series of deletes As I mentioned earlier, prepareDe

Re: RS unresponsive after series of deletes

2012-06-21 Thread Ted Yu
--- > From: Ted Yu [mailto:yuzhih...@gmail.com] > Sent: Wednesday, June 20, 2012 10:13 PM > To: user@hbase.apache.org > Subject: Re: RS unresponsive after series of deletes > > As I mentioned earlier, prepareDeleteTimestamps() performs one get > operation per column qualifier: >

Re: RS unresponsive after series of deletes

2012-06-21 Thread Ted Yu
.@gmail.com] > Sent: Wednesday, June 20, 2012 10:13 PM > To: user@hbase.apache.org > Subject: Re: RS unresponsive after series of deletes > > As I mentioned earlier, prepareDeleteTimestamps() performs one get > operation per column qualifier: > get.addColumn(family, qua

RE: RS unresponsive after series of deletes

2012-06-21 Thread Ted Tuttle
[mailto:yuzhih...@gmail.com] Sent: Wednesday, June 20, 2012 10:13 PM To: user@hbase.apache.org Subject: Re: RS unresponsive after series of deletes As I mentioned earlier, prepareDeleteTimestamps() performs one get operation per column qualifier: get.addColumn(family, qual); List

Re: RS unresponsive after series of deletes

2012-06-20 Thread Ted Yu
As I mentioned earlier, prepareDeleteTimestamps() performs one get operation per column qualifier:

  get.addColumn(family, qual);
  List result = get(get, false);

This is too costly in your case. I think you can group some configurable number of qualifiers in each get and perform c
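The grouping idea above can be sketched in plain Java. This is only an illustration of splitting qualifiers into groups of a configurable size, so that the number of lookups drops from one per qualifier to one per group. QualifierGrouping and group are hypothetical names, and nothing here is taken from the HBase source.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: split a list of qualifiers into groups of at most groupSize (hypothetical helper). */
class QualifierGrouping {

    static <T> List<List<T>> group(List<T> items, int groupSize) {
        if (groupSize <= 0) {
            throw new IllegalArgumentException("groupSize must be positive");
        }
        List<List<T>> groups = new ArrayList<>();
        for (int i = 0; i < items.size(); i += groupSize) {
            int end = Math.min(i + groupSize, items.size());
            // Copy the slice so each group is independent of the source list.
            groups.add(new ArrayList<>(items.subList(i, end)));
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> quals = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            quals.add("q" + i);
        }
        List<List<String>> groups = group(quals, 100);
        System.out.println("one lookup per qualifier: " + quals.size());
        System.out.println("one lookup per group:     " + groups.size());
    }
}
```

With 250 qualifiers and a group size of 100, the sketch yields three groups (100, 100, and 50 qualifiers). The right group size would have to be found by measurement.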

Re: RS unresponsive after series of deletes

2012-06-20 Thread Ted Tuttle
> Do your 100s of thousands of cell deletes overlap (in terms of column family)
> across rows?

Our schema contains only one column family per table. So, each Delete contains cells from a single column family. I hope this answers your question.

Re: RS unresponsive after series of deletes

2012-06-20 Thread Ted Yu
Ted T: Do your 100s of thousands of cell deletes overlap (in terms of column family) across rows?

In HRegionServer:

  public MultiResponse multi(MultiAction multi) throws IOException {
    ...
    for (Action a : actionsForRegion) {
      action = a.getAction();
      ...
      if (action instanceof

Re: RS unresponsive after series of deletes

2012-06-20 Thread Ted Yu
Looking at the stack trace, I found the following hot spot:

1. org.apache.hadoop.hbase.regionserver.StoreFileScanner.realSeekDone(StoreFileScanner.java:340)
2. org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:331)
3. org.apache.hadoop.hbase.regions

RE: RS unresponsive after series of deletes

2012-06-20 Thread Ted Tuttle
First off, J-D, thanks for helping me work through this. You've inspired some different angles and I think I've finally made it bleed in a controlled way.

> - That data you are deleting needs to be read when you scan, like I
> said earlier a delete is in fact an insert in HBase and this isn't
> c

Re: RS unresponsive after series of deletes

2012-06-20 Thread Jean-Daniel Cryans
What you are describing here seems very different from what was shown earlier. In any case, a few remarks:

- You have major compactions running during the time of that log trace; this usually sucks up a lot of IO. See http://hbase.apache.org/book.html#managed.compactions
- That data you are deletin

RE: RS unresponsive after series of deletes

2012-06-20 Thread Ted Tuttle
> Like Stack said in his reply, have you thread dumped the slow region
> servers when this happens?

I've been having difficulty reproducing this behavior in a controlled manner. While I haven't been able to get my client to hang up while doing deletes, I have found a query that when issued after a

Re: RS unresponsive after series of deletes

2012-06-18 Thread Jean-Daniel Cryans
Mass deleting in HBase is equivalent to mass inserting; the only difference is that the former doesn't have to write values out (just keys). Almost everything that applies to tuning batch inserts applies to batch deletes. Now the error you get comes from this: https://issues.apache.org/jira/browse/HBASE-5190

RE: RS unresponsive after series of deletes

2012-06-18 Thread Ted Tuttle
We had another of these delete-related RS hang-ups. This time we are getting a different error on the client:

java.io.IOException: Call queue is full, is ipc.server.max.callqueue.size too small?

Full stack here: http://pastebin.com/uq68Mvhm

Looking at the RS log, it appears the RS was working o

RE: RS unresponsive after series of deletes

2012-06-14 Thread Ted Tuttle
> What kind of a delete are you doing?

A mixture of row and cell deletes. Interestingly, the first 19 (successful) deletes were row deletes. The client got hung up while submitting its first batch of cell deletes. However, I think the cell/row distinction is a red herring as we've experienced

Re: RS unresponsive after series of deletes

2012-06-14 Thread Stack
On Wed, Jun 13, 2012 at 12:09 PM, Ted Tuttle wrote:
> My client code has a set of deletes to carry out. After successfully issuing
> 19 such deletes the client begins logging HBase errors while trying to
> complete the deletes. It logs ERRORs every 60s for 10 times and then gives
> up.
> Wha

RS unresponsive after series of deletes

2012-06-13 Thread Ted Tuttle
Hi All,

I have a repeatable and troublesome HBase interaction that I would like some advice on. I am running a 5-node cluster on v0.94 on cdh3u3 and accessing it through the Java client API. Each RS has 32G of RAM and is running with a 16G heap, with 4G for block cache. Used heap of each RS is well below 16G