[ 
https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862439#action_12862439
 ] 

Eli Collins commented on HDFS-941:
----------------------------------

Hey bc,

Nice change!  

Do you have any results from a non-random workload? Please collect:
# before/after TestDFSIO runs so we can see if sequential throughput is affected
# hadoop fs -put of a 1g file from n clients in parallel. I suspect this will 
improve, since socket reuse should limit slow start, but it's good to check.

How did you choose DEFAULT_CACHE_SIZE?

In the exception handler in sendReadResult, can we be more specific about when 
it's OK not to be able to send the result, and throw an exception in the cases 
when it's not OK, rather than swallowing all IOExceptions?
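Roughly what I have in mind (a sketch only — the method signature, the 
clientFinished flag, and the logging are assumed, not the actual patch): only 
swallow the IOException in the one case where losing the status is harmless, 
and rethrow otherwise.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch of a narrower sendReadResult exception handler.
class ReadResultSender {
    void sendReadResult(OutputStream out, byte[] result, boolean clientFinished)
            throws IOException {
        try {
            out.write(result);
            out.flush();
        } catch (IOException e) {
            if (clientFinished) {
                // The client already read everything it wanted and may have
                // closed its end; losing the read status here is harmless.
                System.err.println("Could not send read result: " + e);
            } else {
                // A mid-read failure is unexpected; propagate it instead of
                // swallowing it with every other IOException.
                throw e;
            }
        }
    }
}
```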

In DataXceiver#opReadBlock you throw an IOException in a try block that catches 
IOException. I think that path should LOG.error and close the output stream 
instead. You can also chain the following if statements that check stat.
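Something along these lines (a sketch with assumed names and status constants, 
not the real opReadBlock code): log and close rather than throwing back into 
the method's own catch, with the stat checks chained into one if/else ladder.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical sketch of the suggested restructure of the stat handling.
class OpReadBlockSketch {
    static final int SUCCESS = 0;
    static final int ERROR = 1;
    static final int ERROR_CHECKSUM = 2;

    String handleStatus(int stat, Closeable out) throws IOException {
        if (stat == SUCCESS) {
            // Clean end of read: the connection can be reused.
            return "reuse connection";
        } else if (stat == ERROR_CHECKSUM) {
            // Log and close instead of throwing inside a try block whose
            // own catch would swallow the exception.
            System.err.println("Client reported a checksum error");
            out.close();
            return "closed";
        } else {
            System.err.println("Unexpected read status " + stat);
            out.close();
            return "closed";
        }
    }
}
```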

How about asserting sock != null in putCachedSocket? Seems like this should 
never happen if the code is correct, and log messages are easy to miss.

File a jira for ERROR_CHECKSUM?

Please add a comment to the head of ReaderSocketCache explaining why we cache 
BlockReader socket pairs, as opposed to just caching sockets (because we don't 
multiplex BlockReaders over a single socket between hosts).
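To illustrate both points (the assert and the pair caching) in one place — 
this is a hypothetical sketch, not the patch's ReaderSocketCache: because a 
BlockReader's stream state is tied to one socket and BlockReaders aren't 
multiplexed over a single socket between hosts, the reader and socket have to 
be cached and reused together.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: idle (reader, socket) pairs keyed by datanode address.
class ReaderSocketCacheSketch<R, S> {
    private final Map<String, ArrayDeque<Map.Entry<R, S>>> cache = new HashMap<>();

    void put(String datanode, R reader, S sock) {
        // A null socket here indicates a bug upstream, so assert rather
        // than logging and carrying on.
        assert sock != null : "caching a null socket";
        cache.computeIfAbsent(datanode, k -> new ArrayDeque<>())
             .push(Map.entry(reader, sock));
    }

    Map.Entry<R, S> get(String datanode) {
        ArrayDeque<Map.Entry<R, S>> q = cache.get(datanode);
        return (q == null || q.isEmpty()) ? null : q.pop();
    }
}
```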

Nits:
* Nice comment in the BlockReader header; please define "packet" as well. Is 
the RPC specification in DataNode outdated? If so, fix it or file a jira 
instead of warning readers that it may be outdated.
* Maybe a better name for DN_KEEPALIVE_TIMEOUT, since there is no explicit 
keepalive? TRANSFER_TIMEOUT?
* Would rename workDone to something more specific like opsProcessed, or make 
it a boolean.
* Add an "a" in "with checksum"
* The if statements need braces, e.g. in BlockReader#read.

Thanks,
Eli

> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
>                 Key: HDFS-941
>                 URL: https://issues.apache.org/jira/browse/HDFS-941
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: bc Wong
>         Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, 
> HDFS-941-3.patch
>
>
> Right now each connection into the datanode xceiver only processes one 
> operation.
> In the case that an operation leaves the stream in a well-defined state (eg a 
> client reads to the end of a block successfully) the same connection could be 
> reused for a second operation. This should improve random read performance 
> significantly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
