[ https://issues.apache.org/jira/browse/AMQ-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440868#comment-16440868 ]
Tomas Pavelka commented on AMQ-6937:
------------------------------------

I have discovered that the CPU spin loop can happen even on Linux: whenever the process runs out of file descriptors, the accept call enters a loop that spins the CPU at 100%. I have attached a patch that slows down exception handling in that case.

> Recycling TCP/IP stack on z/OS causes an infinite error loop in transport
> server
> --------------------------------------------------------------------------------
>
>                 Key: AMQ-6937
>                 URL: https://issues.apache.org/jira/browse/AMQ-6937
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.15.3
>            Reporter: Tomas Pavelka
>            Priority: Major
>         Attachments: AMQ-6937-cpu-spin-prevention.patch
>
>
> The ActiveMQ transport servers (e.g. TcpTransportServer) run the socket
> accept (java.net.ServerSocket#accept) in an infinite loop. The accept call
> can repeatedly fail with an exception, spinning the CPU at full speed and
> filling up logs quickly.
> Here is an example of an exception that gets repeated indefinitely:
> java.net.SocketException: EDC5122I Input/output error. (Accept failed)
>     at java.net.ServerSocket.implAccept(ServerSocket.java:623)
>     at java.net.ServerSocket.accept(ServerSocket.java:582)
>     at org.apache.activemq.transport.tcp.TcpTransportServer.run(TcpTransportServer.java:351)
>     at java.lang.Thread.run(Thread.java:785)
> This is a common problem on z/OS because the pattern of running accept in a
> loop is used in many open source projects. For example, here is the same
> issue in Derby:
> https://issues.apache.org/jira/browse/DERBY-5347
> And here in Jetty:
> [https://github.com/eclipse/jetty.project/issues/283]
> Whenever the problem appears, the socket becomes unusable. Would it be
> possible for ActiveMQ to allow plugging in a custom
> org.apache.activemq.transport.TransportAcceptListener that would detect the
> problem and re-bind the socket?
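
For readers who want to see the general idea of slowing down exception handling in an accept loop, here is a minimal sketch. It is not the contents of AMQ-6937-cpu-spin-prevention.patch and not ActiveMQ code: the class name BackoffAcceptLoop, the delayMillis helper, and the 10 ms / 1000 ms delay values are all assumptions made for illustration. The loop sleeps for an exponentially growing, capped interval after each consecutive accept failure, so a persistently failing socket no longer spins the CPU or floods the logs.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical illustration only; names and delay values are assumptions. */
class BackoffAcceptLoop implements Runnable {
    static final long BASE_DELAY_MS = 10;    // assumed first retry delay
    static final long MAX_DELAY_MS = 1_000;  // assumed upper bound on the delay

    private final ServerSocket serverSocket;
    private volatile boolean stopped;

    BackoffAcceptLoop(ServerSocket serverSocket) {
        this.serverSocket = serverSocket;
    }

    /** Delay to wait after the given number of consecutive accept failures. */
    public static long delayMillis(int consecutiveFailures) {
        if (consecutiveFailures <= 0) {
            return 0;
        }
        // Cap the shift so the multiplication cannot overflow a long.
        int shift = Math.min(consecutiveFailures - 1, 20);
        return Math.min(BASE_DELAY_MS << shift, MAX_DELAY_MS);
    }

    public void stop() {
        stopped = true;
    }

    @Override
    public void run() {
        int failures = 0;
        while (!stopped && !serverSocket.isClosed()) {
            try {
                Socket socket = serverSocket.accept();
                failures = 0;            // a successful accept resets the backoff
                handle(socket);
            } catch (IOException e) {
                if (stopped || serverSocket.isClosed()) {
                    break;               // normal shutdown, not an error
                }
                failures++;
                try {
                    // Sleep instead of retrying immediately at full speed.
                    Thread.sleep(delayMillis(failures));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
    }

    /** Placeholder: a real server would hand the socket to a transport. */
    protected void handle(Socket socket) {
        try {
            socket.close();
        } catch (IOException ignored) {
            // Nothing useful to do in this sketch.
        }
    }
}
```

A backoff like this only limits the damage. It does not make the socket usable again, so the re-bind question in the description still stands.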