Writable RPC had a lot of leftover TCP connections in CLOSE_WAIT after RPC_TIMEOUT is enabled
Hi,

I'm using hadoop-2.2.0 and rely on Hadoop's WritableRpcEngine to build my distributed application. The application has a 'heartbeat' interface that periodically checks availability. To detect potential failures, I enabled an RPC timeout when creating the proxy, as below:

    int rpcTimeout = 1000; // 1 second as RPC timeout
    RPC.waitForProxy(MyApplicationInterface.class,
        MyApplicationInterface.versionID, socAddr, conf, rpcTimeout, timeout);

Everything went fine initially, and the heartbeat detected failures as expected. But after a period of time (2 days or so), I saw a lot of TCP connections in the CLOSE_WAIT state on the server side, and the client was no longer able to connect to the server.

Any clue about this?

Thanks

--
--Anfernee
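To put a number on "a lot of" CLOSE_WAIT connections, one option is to capture `netstat -tan` output on the server at intervals and count the matching lines. The following is only a minimal sketch: the class name `CloseWaitCounter`, the port 9000, and the sample addresses are made up for illustration, and it assumes the usual Linux column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: counts CLOSE_WAIT sockets per local port.
class CloseWaitCounter {

    /**
     * Counts lines of `netstat -tan` style output whose state column is
     * CLOSE_WAIT and whose local address ends with ":" + port.
     */
    static long countCloseWait(List<String> netstatLines, int port) {
        String suffix = ":" + port;
        return netstatLines.stream()
                .map(line -> line.trim().split("\\s+"))
                // Assumed columns: Proto Recv-Q Send-Q Local Foreign State
                .filter(cols -> cols.length >= 6)
                .filter(cols -> cols[0].startsWith("tcp"))   // skips the header line
                .filter(cols -> cols[3].endsWith(suffix))    // local address column
                .filter(cols -> cols[5].equals("CLOSE_WAIT"))
                .count();
    }

    public static void main(String[] args) {
        // Sample data invented for this sketch; port 9000 is an assumption.
        List<String> sample = Arrays.asList(
            "Proto Recv-Q Send-Q Local Address           Foreign Address         State",
            "tcp        1      0 10.0.0.5:9000           10.0.0.7:51234          CLOSE_WAIT",
            "tcp        0      0 10.0.0.5:9000           10.0.0.8:40022          ESTABLISHED",
            "tcp        1      0 10.0.0.5:9000           10.0.0.7:51240          CLOSE_WAIT",
            "tcp        0      0 10.0.0.5:22             10.0.0.9:50111          CLOSE_WAIT");
        System.out.println(countCloseWait(sample, 9000));
    }
}
```

Sampling this count over time would show whether the CLOSE_WAIT sockets on the RPC port grow steadily (for example, with each timed-out heartbeat) or appear in bursts.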
Re: Writable RPC had a lot of leftover TCP connections in CLOSE_WAIT after RPC_TIMEOUT is enabled
Why don't you base your application on ProtobufRpcEngine?

Cheers

On Tue, Jun 10, 2014 at 10:42 AM, Anfernee Xu anfernee...@gmail.com wrote:
> Hi, I'm using hadoop-2.2.0 and take advantage of Hadoop WritableRpcEngine to build my distributed application [...]
Re: Writable RPC had a lot of leftover TCP connections in CLOSE_WAIT after RPC_TIMEOUT is enabled
Because it's a legacy system I built 4-5 years back on a Hadoop 0.2.x release, and we only recently moved to the 2.2.0 release. Moving to Protocol Buffers is one option, but we need to migrate our infrastructure (Hadoop and so on) first and get it working with no regressions.

Is it a known issue?

Thanks

On Tue, Jun 10, 2014 at 10:47 AM, Ted Yu yuzhih...@gmail.com wrote:
> Why don't you base your application on ProtobufRpcEngine?

--
--Anfernee
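For readers unfamiliar with the state: a socket sits in CLOSE_WAIT when the remote peer has already closed its end (sent a FIN) but the local process has not yet called close() on its own socket. The sketch below reproduces that situation with plain java.net sockets on the loopback interface. It is not Hadoop code and does not claim to show what the RPC server does internally; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class: shows when an accepted socket enters CLOSE_WAIT.
class CloseWaitDemo {

    /**
     * Connects a client to a local server socket, closes the client, and
     * returns what the server side reads afterwards (-1 means end of stream).
     */
    static int readAfterPeerClose() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {

            accepted.setSoTimeout(5000); // avoid blocking forever if something goes wrong

            // The client goes away, as a timed-out caller would after giving up.
            client.close();

            // The server side sees end of stream once the peer's FIN arrives.
            int result = accepted.getInputStream().read();

            // From here until 'accepted' is closed, the server-side socket is in
            // CLOSE_WAIT. A server that never closes it leaks one socket per
            // abandoned connection. Here try-with-resources closes it on exit.
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read() after peer close returned " + readAfterPeerClose());
    }
}
```

In other words, a growing pile of CLOSE_WAIT sockets on the server means the server process is holding connections that clients have already abandoned, so the place to look is whatever is expected to close those connections on the server side.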