Hi Martin,

I believe you are using 4.1.0-RC4 with some custom changes made locally.
Would you be able to test this on the latest commit of the stratos-4.1.x
branch, without any other changes? I cannot recall a fix we made for this
after 4.1.0-RC4, but it would be better if you could verify against the
latest code in the stratos-4.1.x branch.

At the same time, would you be able to do the following:

   - Take a thread dump of the running Stratos and CEP instances once this
   happens
   - Check the file descriptor limits of the OS
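
The descriptor check above could be scripted roughly as follows. This is a
minimal sketch, assuming a Linux host with /proc mounted; the helper names
(fd_limits, process_fd_limits, open_fd_count) are hypothetical and not part
of Stratos. Pass the PID of the Stratos or CEP JVM to see how close that
process is to its "Max open files" limit. For the thread dump, the JDK's
jstack tool (jstack <pid>) is one common option.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def process_fd_limits(pid):
    """Return the (soft, hard) 'Max open files' limits of another process.

    Reads /proc/<pid>/limits (Linux only). Values are returned as strings
    because the kernel may report 'unlimited'. Returns None if the row is
    not found.
    """
    with open(f"/proc/{pid}/limits") as limits_file:
        for line in limits_file:
            if line.startswith("Max open files"):
                soft, hard = line.split()[3:5]
                return soft, hard
    return None


def open_fd_count(pid="self"):
    """Count open descriptors by listing /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"current process limits: soft={soft}, hard={hard}")
    print(f"open descriptors in this process: {open_fd_count()}")
```

Running it repeatedly while the VMs spin up shows whether the descriptor
count climbs steadily toward the soft limit, which would point to a leak
rather than a limit that is simply set too low.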

Thanks

On Sat, Sep 12, 2015 at 10:56 PM, Martin Eppel (meppel) <mep...@cisco.com>
wrote:

> Resending in case it got lost,
>
>
>
> Thanks
>
>
>
> Martin
>
>
>
> *From:* Martin Eppel (meppel)
> *Sent:* Thursday, September 10, 2015 2:39 PM
> *To:* dev@stratos.apache.org
> *Subject:* Stratos 4.1: "Too many open files" issue
>
>
>
> Hi,
>
>
>
> We are seeing an issue with Stratos running out of file handles when
> creating a number of applications and VM instances.
>
>
>
> The scenario is as follows:
>
>
>
> 13 applications are deployed, each with a single cluster and a single
> member instance.
>
>
>
> As the VMs spin up, Stratos becomes unresponsive, and checking the logs we
> find the following exceptions (see below). I remember we saw similar
> issues (same exceptions) back in Stratos 4.0 in the context of longevity
> tests.
>
>
>
> We are running Stratos 4.1 RC4 with the latest commit:
>
>
>
> commit 0fd41840fb04d92ba921bf58c59c2c3fbad0c561
>
> Author: Imesh Gunaratne <im...@apache.org>
>
> Date:   Tue Jul 7 12:54:47 2015 +0530
>
>
>
> Is this a known issue that might have been fixed in a later commit, or
> something new? Can we verify that the fixes for the previous issues are
> included in our system (jars, commits, etc.)?
>
>
>
>
>
>
>
>
>
> org.apache.thrift.transport.TTransportException: java.net.SocketException:
> Too many open files
> at
> org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:118)
> at
> org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
> at
> org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
> at
> org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:106)
> at
> org.wso2.carbon.databridge.receiver.thrift.internal.ThriftDataReceiver$ServerThread.run(ThriftDataReceiver.java:199)
> at java.lang.Thread.run(Thread.java:745)
> TID: [0] [STRATOS] [2015-08-17 17:38:17,499] WARN
> {org.apache.thrift.server.TThreadPoolServer} - Transport error occurred
> during acceptance of message.
> org.apache.thrift.transport.TTransportException: java.net.SocketException:
> Too many open files
> at
> org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:118)
> at
> org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
> at
> org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
> at
> org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:106)
> at
> org.wso2.carbon.databridge.receiver.thrift.internal.ThriftDataReceiver$ServerThread.run(ThriftDataReceiver.java:199)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.SocketException: Too many open files
>
>
>
> // listing the applications, member instances and cartridge state:
>
>
>
> [di-000-xxx] – application name,
>
>
>
> di-000-010: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Starting 1)
>
> di-000-011: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Initialized 1)
>
> cartridge-proxy: applicationInstances 1, groupInstances 0,
> clusterInstances 1, members 1 (Active 1)
>
> di-000-001: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Active 1)
>
> di-000-002: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Active 1)
>
> di-000-012: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Created 1)
>
> di-000-003: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Starting 1)
>
> di-000-004: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Starting 1)
>
> di-000-006: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Starting 1)
>
> di-000-005: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Starting 1)
>
> di-000-008: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Starting 1)
>
> di-000-007: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Starting 1)
>
> di-000-009: applicationInstances 1, groupInstances 0, clusterInstances 1,
> members 1 (Starting 1)
>
>
>



-- 
Imesh Gunaratne

Senior Technical Lead, WSO2
Committer & PMC Member, Apache Stratos
