Did Gaojinchao attach the stack dump, and did anyone receive it (Lars?)? Could someone, or Gaojinchao, attach it to the JIRA?
-Shrijeet

On Sun, Dec 4, 2011 at 12:24 PM, lars hofhansl <[email protected]> wrote:
>
> Thanks. Now the question is: How many connection threads do we have?
>
> I think there is one per regionserver, which would indeed be a problem.
> Need to look at the code again (I'm only partially familiar with the
> client code).
>
> Either the client should chunk (like the server does), or there should be
> a limited number of threads that perform IO on behalf of the client (or
> both).
>
> -- Lars
>
>
> ----- Original Message -----
> From: Gaojinchao <[email protected]>
> To: "[email protected]" <[email protected]>; lars hofhansl
> <[email protected]>
> Cc: Chenjian <[email protected]>; wenzaohua <[email protected]>
> Sent: Saturday, December 3, 2011 11:22 PM
> Subject: Re: FeedbackRe: Suspected memory leak
>
> This is the stack dump.
>
>
> -----Original Message-----
> From: lars hofhansl [mailto:[email protected]]
> Sent: December 4, 2011 14:15
> To: [email protected]
> Cc: Chenjian; wenzaohua
> Subject: Re: FeedbackRe: Suspected memory leak
>
> Dropping user list.
>
> Could you (or somebody) point me to where the client is using NIO?
> I'm looking at HBaseClient and I do not see references to NIO. Also, it
> seems that all work is handed off to separate threads
> (HBaseClient.Connection), and the JDK will not cache more than 3 direct
> buffers per thread.
>
> It's possible (likely?) that I missed something in the code.
>
> Thanks.
>
> -- Lars
>
> ________________________________
> From: Gaojinchao <[email protected]>
> To: "[email protected]" <[email protected]>; "[email protected]"
> <[email protected]>
> Cc: Chenjian <[email protected]>; wenzaohua <[email protected]>
> Sent: Saturday, December 3, 2011 7:57 PM
> Subject: FeedbackRe: Suspected memory leak
>
> Thank you for your help.
>
> This issue appears to be a configuration problem:
> 1. The HBase client uses the NIO (socket) API, which uses direct memory.
> 2. The default -XX:MaxDirectMemorySize value is equal to the -Xmx value,
> so if no full GC occurs, the direct memory cannot be reclaimed.
> Unfortunately, the GC configuration of our client does not produce any
> full GC.
>
> This is only a preliminary result. All tests are still running; if we
> have any further results, we will report back.
> Finally, I will update issue
> https://issues.apache.org/jira/browse/HBASE-4633 with our story.
>
> If our digging is correct, should we set a default value for
> -XX:MaxDirectMemorySize to prevent this situation?
>
>
> Thanks
>
> -----Original Message-----
> From: bijieshan [mailto:[email protected]]
> Sent: December 2, 2011 15:37
> To: [email protected]; [email protected]
> Cc: Chenjian; wenzaohua
> Subject: Re: Suspected memory leak
>
> Thank you all.
> I think it is the same problem as the one in the link provided by Stack,
> because the heap size has stabilized but the non-heap size keeps growing.
> So I do not think it is the CMS GC bug.
> We have also examined the content of the problematic memory section; all
> the records contain data like the following:
> "|www.hostname00000000000002087075.comlhggmdjapwpfvkqvxgnskzzydiywoacjnpljkarlehrnzzbpbxc||||||460|||||||||||Agent||||";
> "BBZHtable_UFDR_058,048342220093168-02570"
> ........
>
> Jieshan.
>
> -----Original Message-----
> From: Kihwal Lee [mailto:[email protected]]
> Sent: December 2, 2011 4:20
> To: [email protected]
> Cc: Ramakrishna s vasudevan; [email protected]
> Subject: Re: Suspected memory leak
>
> Adding to the excellent write-up by Jonathan:
> Since a finalizer is involved, it takes two GC cycles to collect the
> objects. Due to a bug (or bugs) in the CMS GC, collection may not happen
> and the heap can grow really big. See
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7112034 for details.
>
> Koji tried "-XX:-CMSConcurrentMTEnabled" and confirmed that all the
> socket-related objects were being collected properly. This option forces
> the concurrent marker to use a single thread. This was for HDFS, but I
> think the same applies here.
>
> Kihwal
>
> On 12/1/11 1:26 PM, "Stack" <[email protected]> wrote:
>
> Make sure it's not the issue that Jonathan Payne identified a while
> back:
> https://groups.google.com/group/asynchbase/browse_thread/thread/c45bc7ba788b2357#
> St.Ack
>
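
To illustrate the "client should chunk" option Lars raises in this thread, here is a minimal Java sketch. It rests on an assumption taken from the write-up Stack links to: as I understand it, writing a large heap ByteBuffer to an NIO socket channel makes the JDK stage the data in a temporary direct buffer of about the same size, so bounding each write bounds that buffer. ChunkedWriter and writeChunked are hypothetical names, not HBase APIs, and the loop assumes a blocking channel.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

/** Hypothetical helper (not part of HBase): writes a buffer in bounded slices. */
final class ChunkedWriter {
    private ChunkedWriter() {}

    /**
     * Writes all remaining bytes of src to ch, exposing at most chunkSize
     * bytes to the channel per write call. Assumes a blocking channel.
     * Returns the number of write calls issued.
     */
    static int writeChunked(WritableByteChannel ch, ByteBuffer src, int chunkSize)
            throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        final int end = src.limit();
        int calls = 0;
        try {
            while (src.position() < end) {
                // Narrow the visible window to one chunk (long math avoids overflow).
                long sliceEnd = Math.min((long) end, (long) src.position() + chunkSize);
                src.limit((int) sliceEnd);
                ch.write(src); // advances src.position() by the bytes written
                calls++;
            }
        } finally {
            src.limit(end); // always restore the caller's limit
        }
        return calls;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory sink stands in for a socket channel in this sketch.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        WritableByteChannel ch = Channels.newChannel(sink);
        ByteBuffer payload = ByteBuffer.wrap(new byte[10000]);
        int calls = writeChunked(ch, payload, 4096);
        System.out.println(calls + " writes, " + sink.size() + " bytes");
    }
}
```

Capping the direct pool explicitly with -XX:MaxDirectMemorySize, as Gaojinchao proposes, would be a complementary measure rather than an alternative.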
