Touched, not typed. Erroneous words are a feature, not a typo.
On Jul 24, 2014 8:55 AM, "Jeffrey Nguyen (jeffrngu)" <jeffr...@cisco.com>
wrote:
>
> Hi Devs,
>
> We have a setup where Stratos stopped functioning after running for about
24 hours.  wso2carbon.log shows a lot of exceptions like the one listed below.
  We were advised to increase "ulimit" from default 1024 to around 65k.
  Just wanted to post this issue here to see if anybody has come across
this issue and has successfully fixed the problem.
>
> Naturally one would ask if increasing the ulimit would just defer the
problem a little longer.   I mean 1024 seems like a lot to me if resources
are only acquired for a short duration and recycled properly.   Which
components/subcomponents of Stratos require such a large amount of resources?
>
> I took a memory dump of the hung Stratos process and ran memory leak
analysis and found some leak suspects captured in the attached screenshots.
  These look very suspicious to me.   Can someone take a look and see if
they are legit or just false alarms?    One of the suspect leaks points
to org.wso2.carbon.event.builder.core.internal.CarbonEventBuilderService,
which is part of the Carbon core.   I don't believe we have source code for
Carbon core in Stratos code base.
>
> If increasing ulimit ends up being the ultimate solution to this problem,
have we done any kind of analysis that shows system load (e.g. Number of
vms spawned) vs. ulimit values?   I mean how do we know which ulimit value
is good given a pre-defined system load?  Or what type of load can 65k
ulimit withstand?
>
AFAIR no such analysis has been done. Yes, it would be a good analysis to perform.
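As a starting point for that kind of analysis, here is a minimal sketch for sampling how many file descriptors the Stratos process holds against its limit. It is my own illustration, not something shipped with Stratos: it assumes a Linux host with /proc mounted, and the function names (count_open_fds, soft_nofile_limit, log_fd_usage) are hypothetical. If the count keeps climbing under a steady load, raising ulimit only defers the failure; if it plateaus, the plateau suggests a sensible limit for that load.

```python
import os
import time


def count_open_fds(pid, proc_root="/proc"):
    """Return the number of file descriptors currently open in process `pid`.

    Assumes Linux: each open descriptor appears as an entry in /proc/<pid>/fd.
    """
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def soft_nofile_limit(pid, proc_root="/proc"):
    """Return the soft "Max open files" limit of `pid`, or None if unlimited or not found."""
    with open(os.path.join(proc_root, str(pid), "limits")) as f:
        for line in f:
            if line.startswith("Max open files"):
                # Columns: "Max open files  <soft>  <hard>  files"
                soft = line.split()[3]
                return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid, proc_root="/proc"):
    """Return (open descriptors, soft limit) for `pid`."""
    return count_open_fds(pid, proc_root), soft_nofile_limit(pid, proc_root)


def log_fd_usage(pid, samples, interval_s=60):
    """Print one timestamped (used, limit) sample every `interval_s` seconds."""
    for _ in range(samples):
        used, limit = fd_usage(pid)
        print(time.strftime("%Y-%m-%d %H:%M:%S"), used, limit)
        time.sleep(interval_s)
```

For example, log_fd_usage(<stratos pid>, samples=1440) would record one sample per minute for 24 hours. Reading another user's /proc/<pid>/fd requires running as that user or as root.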
> Also, besides the default mysql DB, does Stratos use any other types of DB
like Cassandra?    I can see that we're launching Stratos with the option
"-Ddisable.cassandra.server.startup=true" so I assume there's no embedded
Cassandra.
>
Cassandra is used by BAM for log publishing. I think Dinesh can
provide a detailed answer on this.
> From the exception stack trace below, it looks like Stratos uses Apache
Thrift transport protocol.   Just for my information, where do I go to
learn about how/where this is used within Stratos?
>
The Thrift protocol is used for communication between the cartridge agents
and CEP: each cartridge agent sends its statistics to CEP over Thrift.
> TID: [0] [STRATOS] [2014-07-16 07:03:58,597]  WARN {org.apache.thrift.server.TThreadPoolServer} -  Transport error occurred during acceptance of message. {org.apache.thrift.server.TThreadPoolServer}
> org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files
> at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:118)
> at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
> at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
> at org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:106)
> at org.wso2.carbon.databridge.receiver.thrift.internal.ThriftDataReceiver$ServerThread.run(ThriftDataReceiver.java:199)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.SocketException: Too many open files
> at java.net.PlainSocketImpl.socketAccept(Native Method)
> at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
> at java.net.ServerSocket.implAccept(ServerSocket.java:530)
> at java.net.ServerSocket.accept(ServerSocket.java:498)
> at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:113)
> ... 5 more
>
> Regards,
> -Jeffrey
