[ https://issues.apache.org/jira/browse/THRIFT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury closed THRIFT-1493.
---------------------------------

    Resolution: Not A Problem
    
> Possible infinite loop in TThreadPoolServer
> -------------------------------------------
>
>                 Key: THRIFT-1493
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1493
>             Project: Thrift
>          Issue Type: Bug
>          Components: Java - Library
>    Affects Versions: 0.7
>         Environment: Debian Squeeze
>            Reporter: bert Passek
>
> I just faced a major problem using Thrift in combination with Flume; the problem could actually be tracked down to the Thrift library itself.
> I'm using Thrift in a typical client/server environment for tracking large amounts of data. We ran into an exception that looks like this:
> 2012-01-11 14:57:30,487 ERROR com.cloudera.flume.core.connector.DirectDriver: Exiting driver logicalNode newsletterImpressionLog01-21 in error state ThriftEventSource | CassandraSink because sleep interrupted
> 2012-01-11 17:18:14,808 WARN org.apache.thrift.server.TSaneThreadPoolServer: Transport error occurred during acceptance of message.
> org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files
>         at org.apache.thrift.transport.TSaneServerSocket.acceptImpl(TSaneServerSocket.java:139)
>         at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
>         at org.apache.thrift.server.TSaneThreadPoolServer$1.run(TSaneThreadPoolServer.java:175)
> Caused by: java.net.SocketException: Too many open files
>         at java.net.PlainSocketImpl.socketAccept(Native Method)
>         at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
>         at java.net.ServerSocket.implAccept(ServerSocket.java:462)
>         at java.net.ServerSocket.accept(ServerSocket.java:430)
>         at org.apache.thrift.transport.TSaneServerSocket.acceptImpl(TSaneServerSocket.java:134)
>         ... 2 more
> 2012-01-11 17:18:14,809 WARN org.apache.thrift.server.TSaneThreadPoolServer: Transport error occurred during acceptance of message.
> org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files
> [identical stack trace repeated]
> Note: Flume uses its own implementation of TThreadPoolServer, copied verbatim from the original Thrift source code, and embeds this part of the Thrift library in a massively multi-threaded environment.
> I was running out of socket connections, as indicated by the "Too many open files" exception. This exception causes an infinite loop in this part of the serve() method:
> while (!stopped_) {
>   int failureCount = 0;
>   try {
>     TTransport client = serverTransport_.accept();
>     WorkerProcess wp = new WorkerProcess(client);
>     executorService_.execute(wp);
>   } catch (TTransportException ttx) {
>     if (!stopped_) {
>       ++failureCount;
>       LOGGER.warn("Transport error occurred during acceptance of message.", ttx);
>     }
>   }
> }
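One way to avoid spinning hot in a loop like the one quoted above is to track consecutive failures across iterations, sleep between retries, and give up after a limit. The following is only a sketch, not Thrift's actual code: `serve`, `backoffMillis`, the failure limit, and the delay values are all invented for illustration.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch (not the Thrift implementation): an accept loop that
// accumulates consecutive failures, backs off between retries, and gives up
// after a limit instead of logging in a tight loop forever.
public class BoundedAcceptLoop {
    static final int MAX_CONSECUTIVE_FAILURES = 5;
    static final long BASE_BACKOFF_MS = 10;   // small values so the demo runs fast
    static final long MAX_BACKOFF_MS = 1000;

    // Capped exponential backoff: 10, 20, 40, ... up to 1000 ms.
    static long backoffMillis(int consecutiveFailures) {
        long delay = BASE_BACKOFF_MS << Math.min(consecutiveFailures - 1, 10);
        return Math.min(delay, MAX_BACKOFF_MS);
    }

    // Runs `accept` (a stand-in for serverTransport_.accept()) until it fails
    // MAX_CONSECUTIVE_FAILURES times in a row; returns the number of attempts.
    static int serve(Callable<Void> accept) {
        int failureCount = 0;   // declared OUTSIDE the loop so it accumulates
        int attempts = 0;
        while (failureCount < MAX_CONSECUTIVE_FAILURES) {
            attempts++;
            try {
                accept.call();
                failureCount = 0;               // reset on a successful accept
            } catch (Exception e) {             // e.g. EMFILE: "Too many open files"
                failureCount++;
                try {
                    Thread.sleep(backoffMillis(failureCount)); // don't spin hot
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return attempts;            // honor interruption as a stop signal
                }
            }
        }
        return attempts;                        // caller can now clean up or restart
    }

    public static void main(String[] args) {
        // Simulate an accept() that always fails with "Too many open files".
        int attempts = serve(() -> { throw new RuntimeException("Too many open files"); });
        System.out.println("Gave up after " + attempts + " failed attempts");
    }
}
```

Note that in the quoted snippet `failureCount` is declared inside the loop, so it can never accumulate; a bounded loop like this one has to keep the counter outside and reset it only on success.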
> Furthermore, during an overnight run I ran out of disk space because the logged exceptions grew the log file dramatically. There was no way to recover.
> If a critical exception like this occurs, the while loop never terminates; it can only be stopped by calling the stop() method.
> The question is how to handle such exceptions in general. I can't even catch the exception, because it is only logged, never rethrown or otherwise handled, so there is no way to react, e.g. to do some cleanup or to restart the server.
> Best Regards 
> Bert Passek
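Independent of the loop itself, the descriptor exhaustion behind "Too many open files" can be confirmed from the shell. A rough Linux-specific check (procfs paths do not exist on other platforms):

```shell
# Soft limit on open file descriptors for the current shell (the EMFILE threshold)
ulimit -n
# Count the descriptors this shell currently holds open (Linux procfs)
ls /proc/$$/fd | wc -l
```

If the count approaches the limit while the server runs, connections are leaking faster than they are closed, and raising the limit only delays the failure.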

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
