Great, thank you. Checking out now. I hope I get everything assembled. I'll
keep you posted.
On Sun, Aug 24, 2014 at 9:57 AM, Stack wrote:
I put up patches in the issue, Johannes. Hopefully the reproduced stack
overflow is the same as yours. See HBASE-11813.
St.Ack
On Sat, Aug 23, 2014 at 9:25 PM, Johannes Schaback <
johannes.schab...@visual-meta.com> wrote:
We use all plain gets and puts (sometimes batched).
We have hbase.client.keyvalue.maxsize increased to 536870912 bytes on the
client. That is the only thing I can see.
I am about to send you a zip file with the respective classes to your email
address directly. It is probably better I don't post the code here.
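A minimal sketch of that client-side override, using the standard HBase
client configuration API; the value is the one quoted above, and the class
name is only illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class ClientConfigSketch {
        public static void main(String[] args) {
            // Raise the per-KeyValue size limit on the client to 512 MB
            // (536870912 bytes), as described in the message above.
            Configuration conf = HBaseConfiguration.create();
            conf.setInt("hbase.client.keyvalue.maxsize", 536870912);
            System.out.println("maxsize = " + conf.get("hbase.client.keyvalue.maxsize"));
        }
    }

The same value can of course live in hbase-site.xml on the client classpath
instead of being set programmatically.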
Hi Qiang,
no, we don't use coprocessors.
Thanks, Johannes
On Sun, Aug 24, 2014 at 6:04 AM, Qiang Tian wrote:
Hi Johannes,
Do you use endpoint / coprocessor stuff?
thanks.
On Sun, Aug 24, 2014 at 11:51 AM, Qiang Tian wrote:
Hi Stack,
I think you are right. The multiple-queue change was introduced by
HBASE-11355 (0.98.4). If there is only 1 queue, the hang will not happen:
some handlers are gone, but there are still some left to service the
requests (all handlers gone looks like a rare case)...
so the problem might have been there for s
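To make the scenario described here concrete, a toy Java model: calls are
dispatched round-robin to a few bounded queues, each served by its own
handler threads. If every handler bound to one queue dies, calls routed to
that queue pile up until the queue is full, even though other handlers sit
idle, which is consistent with the "queue filled to the brim" observation
later in the thread. Names, numbers, and the dispatch logic are
illustrative, not HBase internals.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class LostHandlersSketch {
        public static void main(String[] args) throws InterruptedException {
            int numQueues = 3;
            List<BlockingQueue<Runnable>> queues = new ArrayList<>();
            for (int i = 0; i < numQueues; i++) {
                queues.add(new ArrayBlockingQueue<>(10));
            }

            // Start handlers for queues 0 and 2 only; queue 1's handlers are "gone".
            for (int q : new int[] {0, 2}) {
                final BlockingQueue<Runnable> queue = queues.get(q);
                Thread handler = new Thread(() -> {
                    try {
                        while (true) queue.take().run();
                    } catch (InterruptedException e) {
                        // handler exits
                    }
                }, "handler-queue-" + q);
                handler.setDaemon(true);
                handler.start();
            }

            // Round-robin dispatch: every third call lands on the dead queue.
            for (int call = 0; call < 100; call++) {
                Thread.sleep(1); // pace the client so the live handlers keep up
                int q = call % numQueues;
                boolean accepted = queues.get(q).offer(() -> { /* serve the call */ });
                if (!accepted) {
                    System.out.println("call " + call + " rejected: queue " + q + " is full");
                    return; // for these calls, the server now looks stuck to clients
                }
            }
        }
    }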
On Sat, Aug 23, 2014 at 4:06 PM, Stack wrote:
Hmm. Ignore above I'd say. I can't see how it would trigger
I am having trouble reproducing the stack overflow. Some particular
response is triggering it (the code here has been around a while). Any
particulars on how your client is accessing HBase? Anything unusual?
If you were looking for something to try, set
hbase.ipc.server.callqueue.handler.factor
to 0. Multiple queues is what is new here. It should not make a difference
but...
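For reference, a small sketch of the assumed relationship between the
handler count, hbase.ipc.server.callqueue.handler.factor, and the number of
call queues; this is a rough reading of the discussion, not the HBase
source:

    public class CallQueueFactorSketch {
        // Assumed: roughly handlerCount * factor queues, with at least one.
        static int numCallQueues(int handlerCount, float handlerFactor) {
            return Math.max(1, Math.round(handlerCount * handlerFactor));
        }

        public static void main(String[] args) {
            System.out.println(numCallQueues(30, 0.0f)); // 1  -> single shared queue
            System.out.println(numCallQueues(30, 0.1f)); // 3  -> three queues
            System.out.println(numCallQueues(30, 1.0f)); // 30 -> one queue per handler
        }
    }

With the 30 handlers (0 to 29) and 3 queues reported elsewhere in the
thread, a factor around 0.1 would fit; a factor of 0 collapses everything
back to one shared queue, which is the setting suggested above.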
Thank you.
From the proposed resolution I imagine that the RS would then die in case
of a handler error. So the question remains what error originally occurred
in the handler in the first place. The log of the entire lifecycle of the
RS (http://schabby.de/wp-content/uploads/2014/08/filtered.txt) d
On Sat, Aug 23, 2014 at 12:11 PM, Johannes Schaback <
johannes.schab...@visual-meta.com> wrote:
> Exception in thread "defaultRpcServer.handler=5,queue=2,port=60020"
> java.lang.StackOverflowError
> at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:210)
> at org.apache.ha
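For readers trying to picture the failure mode behind that trace (and
behind HBASE-11813), here is a self-contained illustration of the general
pattern: a recursively implemented advance() that skips exhausted
sub-scanners by recursing, so a request carrying many empty sub-scanners
ends in StackOverflowError. It shows the shape of the failure only and is
not the actual HBase CellUtil code.

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    public class RecursiveAdvanceSketch {
        // Minimal stand-in for a cell scanner: advance() returns false when exhausted.
        interface CellScanner {
            boolean advance();
        }

        // Concatenates sub-scanners; skipping an exhausted sub-scanner recurses
        // into advance(), so every empty sub-scanner costs one stack frame.
        static CellScanner concat(final Iterator<CellScanner> scanners) {
            return new CellScanner() {
                private CellScanner current;

                public boolean advance() {
                    if (current != null && current.advance()) {
                        return true;
                    }
                    if (!scanners.hasNext()) {
                        return false;
                    }
                    current = scanners.next();
                    return advance(); // deep recursion when many sub-scanners are empty
                }
            };
        }

        public static void main(String[] args) {
            List<CellScanner> empties = new ArrayList<CellScanner>();
            for (int i = 0; i < 1000000; i++) {
                empties.add(new CellScanner() {
                    public boolean advance() {
                        return false; // an already-exhausted scanner
                    }
                });
            }
            // With enough empty sub-scanners this throws java.lang.StackOverflowError,
            // the same shape of failure as the trace quoted above.
            concat(empties.iterator()).advance();
        }
    }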
Can you show the complete stack trace for the StackOverflowError (using
pastebin)?
Thanks
On Aug 23, 2014, at 12:11 PM, Johannes Schaback
wrote:
Hi,
we had to reduce load on the cluster yesterday night, which reduced the
frequency of the phenomenon. That is why I could not get a jstack dump yet:
it has not occurred in the last couple of hours. We will now bring the load
back up, hoping to trigger it again.
Yes, I cut out the properties from the
Anything in your .out that could help explain our losing handlers, if you
can't find anything in the logs?
You did the 'snipp' in the below, right Johannes?
RS Configuration:
===
[snipp] no fancy stuff, all default, except the absolutely necessary
Did you set hbase.ipc.server.callqueue.handler.factor?
It looks like there are 3 queues; the handlers on queue 1 are all gone, as
Stack mentioned. A jstack and a pastebin of the region server log would help.
On Sat, Aug 23, 2014 at 7:02 AM, Stack wrote:
On Fri, Aug 22, 2014 at 3:24 PM, Johannes Schaback <
johannes.schab...@visual-meta.com> wrote:
> ...
> I grep'ed "defaultRpcServer.handler=" on the log from that particular RS.
> The
> RS started at 15:35. After that, the handlers
>
> 6, 24, 0, 15, 28, 26, 7, 19, 21, 3, 5 and 23
>
> make an appearance
I haven't managed to pull a jstack of a stuck node yet (I will do that first
thing in the morning). But...
I just killed and restarted a RS and called /dump right away to see whether
the defaultRpcServer.handler instances are present. And yes, they are, from
0 to 29, even in consecutive order. I ki
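The check described here can also be approximated programmatically. A
hedged sketch that counts live handler threads by name: the
"defaultRpcServer.handler=" prefix is taken from the logs quoted in this
thread, and the code would have to run inside the region server JVM
(jstack or the /dump page gives the same view from the outside).

    import java.util.Map;

    public class HandlerThreadCount {
        public static void main(String[] args) {
            int alive = 0;
            for (Map.Entry<Thread, StackTraceElement[]> entry
                    : Thread.getAllStackTraces().entrySet()) {
                String name = entry.getKey().getName();
                if (name.startsWith("defaultRpcServer.handler=")) {
                    alive++;
                    System.out.println(name);
                }
            }
            // On a freshly restarted RS one would expect the full set, e.g. 0 to 29.
            System.out.println("live handler threads: " + alive);
        }
    }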
Are we losing handler threads, the workers that take from the pool we are
blocked on?
The attached thread dump has ten with non-sequential numbers:
Thread 97 (defaultRpcServer.handler=27,queue=0,port=60020):
Thread 94 (defaultRpcServer.handler=24,queue=0,port=60020):
Thread 91 (defaultRpcServer.h
nvm, misread. Trying to figure out why the scheduling queue is filled to the
brim such that no more calls can be added/dispatched...
St.Ack
On Fri, Aug 22, 2014 at 12:45 PM, Stack wrote:
Are you replicating?
St.Ack
On Fri, Aug 22, 2014 at 10:28 AM, Johannes Schaback <
johannes.schab...@visual-meta.com> wrote:
Do you have a few thread dumps from a 'deaf' instance?
St.Ack
On Fri, Aug 22, 2014 at 10:28 AM, Johannes Schaback <
johannes.schab...@visual-meta.com> wrote:
Dear HBase-Pros,
we have been facing a serious issue with our HBase production cluster for
two days now. Every couple of minutes, a random RegionServer gets stuck and
does not process any requests. In addition, this causes the other
RegionServers to freeze within a minute, which brings down the entire
cluster. Sto