Hi,

After some poking around, I think the problem here is as follows:


1. Apache Stratos has multiple direct dependencies on libthrift.

   o We can modify the build to change the libthrift version in these cases, 
     except that we need 0.9.3 and (AFAIK) that version has to be available in 
     the wso2 maven repositories.

2. Apache Stratos also has one or more indirect dependencies on libthrift 
   (e.g. via wso2’s carbon) which we have neither the source nor the ability 
   to rebuild.

3. From the stack trace, we cannot tell which of these copies of libthrift is 
   at fault.
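
One way to at least enumerate the candidate copies is to scan an unpacked distribution for jars that bundle the class named in the stack trace. The following is a minimal sketch, assuming Python 3 is available on the build host; the function name and the default class path are illustrative choices, and jars nested inside other jars are not inspected.

```python
import os
import zipfile

# Class file to look for; this is the Thrift class named in the stack trace.
TARGET = "org/apache/thrift/transport/TServerSocket.class"


def find_jars_with_class(root, class_path=TARGET):
    """Return sorted paths of .jar files under root that contain class_path."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".jar"):
                continue
            jar_path = os.path.join(dirpath, name)
            try:
                with zipfile.ZipFile(jar_path) as jar:
                    if class_path in jar.namelist():
                        hits.append(jar_path)
            except (zipfile.BadZipFile, OSError):
                # Skip unreadable or corrupt archives rather than aborting the scan.
                continue
    return sorted(hits)
```

Calling `find_jars_with_class("/path/to/unpacked/apache-stratos")` (the path is a placeholder) would list every jar that ships its own copy of the class. It does not say which copy the running server actually loaded.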

Suggestions/advice welcome!

Thanks, Shaheed

From: Shaheedur Haque (shahhaqu)
Sent: 25 November 2015 11:45
To: dev@stratos.apache.org
Cc: Martin Eppel (meppel); Ali Bidabadi (abidabad)
Subject: RE: File handle leak in Thrift

BTW, to be clear, I have no idea *which* copy of libthrift is at fault, as 
apache-stratos seems to contain more than one either directly or indirectly 
(via carbon?); the stack trace I provided is the full stack trace.

From: Shaheedur Haque (shahhaqu)
Sent: 25 November 2015 11:15
To: dev@stratos.apache.org
Cc: Martin Eppel (meppel); Ali Bidabadi (abidabad)
Subject: RE: File handle leak in Thrift

Hi Imesh,

I’m pretty sure that STRATOS-1108 is some other leak/issue, not least because 
we have that fix ☺. You will see my analysis has a rather different stack 
trace, and points to a very obvious coding bug in Thrift. As I say, I am happy 
to verify that the Thrift bug fix addresses the issue we see, but I believe I 
need the wso2 package repository to have the new Thrift version (or else some 
instructions on how to point to maven.org for this package only) to do so.

Thanks, Shaheed

From: Imesh Gunaratne [mailto:im...@apache.org]
Sent: 25 November 2015 02:26
To: dev
Cc: Martin Eppel (meppel); Ali Bidabadi (abidabad)
Subject: Re: File handle leak in Thrift

Hi Shaheed,

AFAIK we fixed this memory issue in CEP in Stratos 4.1.0 RC3:
https://issues.apache.org/jira/browse/STRATOS-1108

Thanks

On Tue, Nov 24, 2015 at 2:51 PM, Shaheedur Haque (shahhaqu) 
<shahh...@cisco.com> wrote:
Hi Isuru,

I now believe I cited the wrong dependency in my original email. In fact, I 
don’t know what the correct dependency is to update Thrift. You’ll see from my 
second email that I tried setting thrift.version here:

$ grep -r thrift.version ../stratos/
../stratos/features/manager/logging-mgt/pom.xml:                <version>${libthrift.version}</version>
../stratos/features/manager/logging-mgt/pom.xml:        <libthrift.version>0.7.wso2v1</libthrift.version>

But as I said, that just gave the build error I noted. Also confusing me is 
that the above pom.xml sets the version to 0.7.wso2v1 whereas my .zip file 
clearly contains 0.7.wso2v2. Note that I am not familiar with how Maven config 
works...so any clarification is most welcome!

Thanks, Shaheed

From: isu...@wso2.com [mailto:isu...@wso2.com] On Behalf Of Isuru Haththotuwa
Sent: 24 November 2015 02:26
To: dev
Cc: Martin Eppel (meppel); Ali Bidabadi (abidabad)
Subject: Re: File handle leak in Thrift

Shaheed,

I could not find the jar with the mentioned version hosted in the relevant 
nexus repository [1]. Can you please double check if the version is correct?

[1]. http://maven.wso2.org/nexus/content/groups/wso2-public/org/wso2/carbon/org.wso2.carbon.databridge.agent.thrift/


On Mon, Nov 23, 2015 at 11:41 PM, Shaheedur Haque (shahhaqu) 
<shahh...@cisco.com> wrote:
It seems the upstream fix is in Thrift 0.9.3. Now, I think I pasted the wrong 
dependency in the email below, but changing the variable “thrift.version” to 
0.9.3 simply resulted in a build failure:

[ERROR] Failed to execute goal on project org.apache.stratos.common: Could not 
resolve dependencies for project 
org.apache.stratos:org.apache.stratos.common:bundle:4.1.0: Could not find 
artifact org.wso2.carbon:org.wso2.carbon.databridge.agent.thrift:jar:0.9.3 in 
central (http://repo1.maven.org/maven2) -> [Help 1]

I’m not sure (a) if I got the right variable, and if I did (b) why it did not 
work. How else do I get the fix?
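
On (a): the error above shows Maven looking for org.wso2.carbon.databridge.agent.thrift at version 0.9.3, which suggests the edited property supplies the version of that Carbon artifact rather than of libthrift itself. A minimal sketch, assuming Python 3, of how to list the dependencies in a pom.xml that take their version from a given property; the function names are illustrative only.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace such as '{http://maven.apache.org/POM/4.0.0}'."""
    return tag.rsplit("}", 1)[-1]


def dependencies_using_property(pom_path, prop):
    """Return (groupId, artifactId) pairs whose <version> is exactly ${prop}."""
    placeholder = "${" + prop + "}"
    root = ET.parse(pom_path).getroot()
    users = []
    for elem in root.iter():
        if _local(elem.tag) != "dependency":
            continue
        # Collect the direct children of <dependency> by their local tag name.
        fields = {_local(child.tag): (child.text or "").strip() for child in elem}
        if fields.get("version") == placeholder:
            users.append((fields.get("groupId"), fields.get("artifactId")))
    return users
```

Running `dependencies_using_property("pom.xml", "thrift.version")` against each pom.xml in the tree would show which artifacts a change to that property actually affects. Properties inherited from a parent pom are only resolved by Maven itself, so this is a rough check rather than a substitute for the effective pom.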

From: Shaheedur Haque (shahhaqu)
Sent: 23 November 2015 13:46
To: d...@stratos.incubator.apache.org
Cc: Martin Eppel (meppel); Ali Bidabadi (abidabad)
Subject: File handle leak in Thrift

Hi all,

I believe that Stratos is missing a memory leak fix in 
libthrift_0.7.0.wso2v2.jar as follows…


1.     For unknown reasons, we sometimes get Stratos’ memory footprint growing 
from the normal “1.0something” GB of virtual memory to 10 GB and then 34 GB in 
a matter of seconds:



top - 21:21:55 up  4:39,  1 user,  load average: 0.01, 0.08, 0.18
Tasks: 135 total,   3 running, 132 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.5 us,  0.9 sy,  0.0 ni, 95.3 id,  1.1 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem:  16434456 total,  9867416 used,  6567040 free,    95772 buffers
KiB Swap:        0 total,        0 used,        0 free.  6485696 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2913 netiq     20   0 4702012 1.083g  18976 S   2.9  6.9   1:28.04 java
 2741 root      20   0 3395640 600544  14504 S   1.8  3.7   0:34.09 java
25941 root      20   0  186084  37700  26636 S   0.7  0.2   1:48.86 corosync
...



top - 21:23:55 up  4:41,  1 user,  load average: 1.08, 0.55, 0.35
Tasks: 137 total,   3 running, 134 sleeping,   0 stopped,   0 zombie
%Cpu(s): 34.1 us, 10.8 sy,  0.0 ni, 45.9 id,  0.9 wa,  0.0 hi,  8.3 si,  0.0 st
KiB Mem:  16434456 total, 10957936 used,  5476520 free,    96024 buffers
KiB Swap:        0 total,        0 used,        0 free.  6599088 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2913 netiq     20   0 10.236g 1.411g  18956 S  91.0  9.0   3:17.37 java
 2741 root      20   0 3395776 621352  14520 S  12.3  3.8   0:48.84 java
25941 root      20   0  186084  37700  26636 S   0.7  0.2   1:49.68 corosync
...



2.     The logs fill very rapidly at this point, so after the fact all we can 
see is that all 10 GB of logs look like this:

TID: [0] [STRATOS] [2015-11-22 21:27:22,795]  WARN {org.apache.thrift.server.TThreadPoolServer} -  Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:118)
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
        at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
        at org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:106)
        at org.wso2.carbon.databridge.receiver.thrift.internal.ThriftDataReceiver$ServerThread.run(ThriftDataReceiver.java:199)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:113)
        ... 5 more
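
Since "Too many open files" means the process hit its descriptor limit, it may help to watch the descriptor count before the logs fill up. A minimal sketch, assuming Linux (it reads /proc) and Python 3; the function names are illustrative, and inspecting another user's process normally requires root.

```python
import os


def count_open_fds(pid="self"):
    """Count entries in /proc/<pid>/fd; each entry is one open descriptor."""
    return len(os.listdir("/proc/{}/fd".format(pid)))


def socket_fds(pid="self"):
    """Return the descriptor numbers under /proc/<pid>/fd that refer to sockets."""
    fd_dir = "/proc/{}/fd".format(pid)
    sockets = []
    for fd in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            sockets.append(int(fd))
    return sorted(sockets)
```

Polling `count_open_fds(pid)` and `len(socket_fds(pid))` for the Stratos java process every few seconds, and comparing against the "Max open files" line in /proc/<pid>/limits, would show whether sockets accumulate steadily or only in a sudden burst.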

A cursory glance at upstream shows this was probably fixed there in 2015:



https://git-wip-us.apache.org/repos/asf?p=thrift.git;a=commitdiff;h=b1a35da9168cca5a7524ab9814161f024da145df

and given that our 0.7.0 jar file has content dated 2011, it likely does not 
have the fix. I also note that upstream has evolved considerably overall. Now, 
what I am not sure of is whether we are using an old library for some specific 
reason, e.g. was it hacked/modified by wso2? Is the new code not compatible 
with the Stratos codebase? If I am looking in the right place, the 
stratos/components/org.apache.stratos.common/pom.xml seems to be picking up a 
specific version:

    <dependency>
        <groupId>org.wso2.carbon</groupId>
        <artifactId>org.wso2.carbon.databridge.agent.thrift</artifactId>
        <version>${wso2carbon.version}</version>
    </dependency>

Do we know why? How do I go about getting the fix? Please advise.

Thanks, Shaheed

--
Thanks and Regards,

Isuru H.
+94 716 358 048




--
Imesh Gunaratne

Senior Technical Lead, WSO2
Committer & PMC Member, Apache Stratos
