[ https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866485#action_12866485 ]

bc Wong commented on HDFS-941:
------------------------------

Thanks for looking at the patch, Sam.

bq. avoid making the cache implementation tied to the class ReaderSocketCache.

Do you mean separating out a more generic cache class? I've thought about that 
quite a bit. But in the end, the socket cache I need is too tied to the 
{{BlockReader}} (i.e. the concept of "owner"), because the {{BlockReader}} 
doesn't multiplex on the same socket. Detaching the {{BlockReader}} from the 
{{CacheEntry}} could mean having an extra interface, which seems unnecessary 
until another usage scenario arrives. We can factor it out later when needed, 
since the caching implementation shouldn't affect the API/protocol.
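
Just to illustrate the "owner" coupling (class and member names below are made up for this comment, not necessarily what the patch uses): each cached socket has to remember which reader checked it out, because two readers can't share one socket.
{code:java}
import java.net.Socket;

/**
 * Illustration only. The entry has to track a single owner because a
 * BlockReader doesn't multiplex requests on one socket; that owner concept
 * is what ties the cache to BlockReader instead of being fully generic.
 */
class CacheEntrySketch {
  private final Socket sock;
  private Object owner;            // the BlockReader using this socket; null when free

  CacheEntrySketch(Socket sock) {
    this.sock = sock;
  }

  synchronized boolean isFree() {
    return owner == null;
  }

  synchronized void checkOut(Object reader) {
    if (owner != null) {
      throw new IllegalStateException("socket already checked out");
    }
    owner = reader;
  }

  synchronized void release() {
    owner = null;
  }

  Socket getSocket() {
    return sock;
  }
}
{code}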

bq. Don't make the cache a static member of the same class. Let the cache be an 
instantiable object. Let DFSClient store the cache either as an instance or 
static var

Sounds good. I'll store it in the DFSClient.
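
In other words (hypothetical names, just to show the wiring I intend):
{code:java}
/** Stand-in for the real socket cache; only here so this sketch compiles. */
class SocketCacheSketch {
  private final int cacheLimit;
  SocketCacheSketch(int cacheLimit) {
    this.cacheLimit = cacheLimit;
  }
}

/**
 * The wiring I intend: DFSClient owns the cache as an instance field, so the
 * cache lives and dies with the client instead of being a static member on
 * the cache class itself.
 */
class DFSClientSketch {
  private final SocketCacheSketch socketCache = new SocketCacheSketch(16);
}
{code}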

bq. not clear why sockets are always in the cache even if not usable

Only usable sockets are put into the cache. However, sockets can become 
unusable while they sit in the cache: the connection could drop, or the DN side 
closes it after a short wait. So when a client wants to retrieve a socket, it 
might be bad. Here, we could:
# Handle that from within the cache and do our best to return a good socket to 
the client, or
# Let the client check the socket and retry.

I went for (1) because it's less wasteful. Sorry about the misunderstanding. I'll 
put in more comments.
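
For example, the kind of best-effort check I mean (a hypothetical helper, not the actual patch code):
{code:java}
import java.net.Socket;

/** Hypothetical helper showing the kind of best-effort test behind option (1). */
final class CachedSocketCheck {
  private CachedSocketCheck() {}

  /**
   * Best-effort only: a socket can pass this and still fail on first use
   * (e.g. the DN closed it and the FIN hasn't been noticed yet), so the
   * reader still falls back to a fresh connection on error.
   */
  static boolean looksUsable(Socket s) {
    return s != null
        && s.isConnected()
        && !s.isClosed()
        && !s.isInputShutdown()
        && !s.isOutputShutdown();
  }
}
{code}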

bq. shouldn't there be a cap on the # of sockets there can be in the cache?

There is. The code checks against {{cacheLimit}} when adding sockets to the 
free map.
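
Roughly like this (a hypothetical, self-contained sketch with a made-up {{totalFree}} counter, not the patch verbatim):
{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

/** Hypothetical insert path: the cap is enforced before a socket enters the free map. */
class CappedFreeMapSketch {
  private final int cacheLimit;
  private int totalFree = 0;
  private final Map<InetSocketAddress, LinkedList<Socket>> freeMap =
      new HashMap<InetSocketAddress, LinkedList<Socket>>();

  CappedFreeMapSketch(int cacheLimit) {
    this.cacheLimit = cacheLimit;
  }

  synchronized void putSocket(InetSocketAddress target, Socket s) {
    if (totalFree >= cacheLimit) {
      try {
        s.close();                       // over the cap: drop instead of caching
      } catch (IOException ignored) {
      }
      return;
    }
    LinkedList<Socket> list = freeMap.get(target);
    if (list == null) {
      list = new LinkedList<Socket>();
      freeMap.put(target, list);
    }
    list.add(s);
    totalFree++;
  }
}
{code}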

bq. the freemap has to be traversed every time to get a socket in a sync block.

True. I think it only sounds bad in theory, but isn't in practice. The reader has 
to read a lot of blocks simultaneously to see contention. First, the 
{{cacheLimit}} should likely be small (default to 16) for performance reasons, to 
avoid hitting too many unusable sockets. Second, it only traverses *a portion* of 
the {{freeMap}}: the list of sockets to the same target IP. Plus, I want it to 
check for unusable sockets before returning to the caller. I feel that finer 
locking granularity here could actually make it slower.
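
To make the "portion" point concrete (again a hypothetical sketch, not the patch verbatim): the map is keyed by the target address, so the synchronized lookup only walks that one datanode's short list, dropping anything stale along the way.
{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Map;

/**
 * Hypothetical lookup path: the free map is keyed by target address, so the
 * synchronized traversal only walks the short list of free sockets to that
 * one datanode, dropping any that went bad while cached.
 */
class FreeMapLookupSketch {
  private final Map<InetSocketAddress, LinkedList<Socket>> freeMap =
      new HashMap<InetSocketAddress, LinkedList<Socket>>();

  synchronized Socket getSocket(InetSocketAddress target) {
    LinkedList<Socket> candidates = freeMap.get(target);
    if (candidates == null) {
      return null;                       // no free socket to this datanode
    }
    for (Iterator<Socket> it = candidates.iterator(); it.hasNext(); ) {
      Socket s = it.next();
      it.remove();
      if (!s.isClosed() && s.isConnected()) {
        return s;                        // still looks good; caller reuses it
      }
      try {
        s.close();                       // went stale while cached; discard
      } catch (IOException ignored) {
      }
    }
    return null;                         // every cached socket to this target was stale
  }
}
{code}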

bq. do we have real performance benchmarks from actual clusters that show a 
significant benefit?

I'll ask Todd to provide more details. He was testing a backported version of 
the patch with HBase. *Preliminary* results showed a 20% gain.


> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
>                 Key: HDFS-941
>                 URL: https://issues.apache.org/jira/browse/HDFS-941
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: bc Wong
>         Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, 
> HDFS-941-3.patch, HDFS-941-4.patch
>
>
> Right now each connection into the datanode xceiver only processes one 
> operation.
> In the case that an operation leaves the stream in a well-defined state (eg a 
> client reads to the end of a block successfully) the same connection could be 
> reused for a second operation. This should improve random read performance 
> significantly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
