Thanks Colin!

On Mon, Jun 17, 2013 at 11:39 PM, Colin McCabe <cmcc...@alumni.cmu.edu> wrote:
> Thanks for reminding me. I filed
> https://issues.apache.org/jira/browse/HDFS-4911 for this.
>
> 4307 was about making the cache robust against programs that change
> the wall-clock time.
>
> best,
> Colin
>
>
> On Sun, Jun 16, 2013 at 7:29 AM, Harsh J <ha...@cloudera.com> wrote:
>> Hi Colin,
>>
>> Do we have a JIRA already for this? Is it
>> https://issues.apache.org/jira/browse/HDFS-4307?
>>
>> On Mon, Jun 10, 2013 at 11:05 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>> +1 for dropping the client side expiry down to something like 1-2 seconds.
>>> I'd rather do that than up the server side, since the server side resource
>>> (DN threads) is likely to be more contended.
>>>
>>> -Todd
>>>
>>> On Fri, Jun 7, 2013 at 4:29 PM, Colin McCabe <cmcc...@alumni.cmu.edu> wrote:
>>>
>>>> Hi all,
>>>>
>>>> HDFS-941 added dfs.datanode.socket.reuse.keepalive. This allows
>>>> DataXceiver worker threads in the DataNode to linger for a second or
>>>> two after finishing a request, in case the client wants to send
>>>> another request. On the client side, HDFS-941 added a SocketCache, so
>>>> that subsequent client requests could reuse the same socket. Sockets
>>>> were closed purely by an LRU eviction policy.
>>>>
>>>> Later, HDFS-3373 added a minimum expiration time to the SocketCache,
>>>> and added a thread which periodically closed old sockets.
>>>>
>>>> However, the default timeout for SocketCache (which is now called
>>>> PeerCache) is much longer than the DN would possibly keep the socket
>>>> open. Specifically, dfs.client.socketcache.expiryMsec defaults to 2 *
>>>> 60 * 1000 (2 minutes), whereas dfs.datanode.socket.reuse.keepalive
>>>> defaults to 1000 (1 second).
>>>>
>>>> I'm not sure why we have such a big disparity here. It seems like
>>>> this will inevitably lead to clients trying to use sockets which have
>>>> gone stale, because the server closes them way before the client
>>>> expires them. Unless I'm missing something, we should probably either
>>>> lengthen the keepalive, or shorten the socket cache expiry, or both.
>>>>
>>>> thoughts?
>>>> Colin
>>>>
>>>
>>>
>>>
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>
>>
>>
>> --
>> Harsh J
-- Harsh J
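
To make the mismatch discussed in this thread concrete, here is a rough sketch of the timing argument. It is a toy model, not HDFS code: the function names and the exact boundary behaviour are assumptions made for illustration, and only the two default values (2 * 60 * 1000 ms on the client, 1000 ms on the DataNode) come from the thread.

```python
# Toy model of the timing argument in this thread. Not HDFS code.
# Only the two default values below are taken from the thread.
CLIENT_EXPIRY_MS = 2 * 60 * 1000   # dfs.client.socketcache.expiryMsec default
DN_KEEPALIVE_MS = 1000             # dfs.datanode.socket.reuse.keepalive default


def classify_cached_socket(idle_ms,
                           client_expiry_ms=CLIENT_EXPIRY_MS,
                           dn_keepalive_ms=DN_KEEPALIVE_MS):
    """Classify a cached socket that has been idle for idle_ms milliseconds.

    Hypothetical helper. Returns:
      "evicted" if the client cache has already expired the socket,
      "stale"   if the client would still reuse it although the DataNode
                has stopped waiting and closed its end,
      "usable"  if both sides still consider the socket open.
    """
    if idle_ms >= client_expiry_ms:
        return "evicted"
    if idle_ms >= dn_keepalive_ms:
        return "stale"
    return "usable"


def stale_window_ms(client_expiry_ms=CLIENT_EXPIRY_MS,
                    dn_keepalive_ms=DN_KEEPALIVE_MS):
    """Length of the idle-time window in which the client reuses a dead socket."""
    return max(0, client_expiry_ms - dn_keepalive_ms)


if __name__ == "__main__":
    for idle in (500, 5000, 120000):
        print(idle, classify_cached_socket(idle))
    print("stale window with defaults (ms):", stale_window_ms())
```

In this model, bringing the client expiry down to the keepalive value (roughly the 1-2 seconds suggested in the thread) shrinks the stale window to zero, while raising only the keepalive would require the DataNode to hold worker threads for up to two minutes.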