[jira] Commented: (HADOOP-4386) RPC support for large data transfers.

Doug Cutting (JIRA) Fri, 10 Oct 2008 09:24:06 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638583#action_12638583
 ]


Doug Cutting commented on HADOOP-4386:
--------------------------------------

Raghu> what happens when data cannot be written or read without blocking?

That's indeed the rub.  RPC multiplexes all traffic between two JVMs over a 
single shared connection.  But if we want to avoid a buffer copy with 
FileChannel#transferTo, we'd have to lock that connection for an unbounded time.

transferTo and transferFrom are blocking operations, not asynchronous ones.  
We cannot use them without a thread per connection, which is also one of the 
things we're trying to avoid.  So, unless I'm missing something, use of 
transferTo and transferFrom mandates both a socket and a thread per open file.  
I don't yet see a happy middle ground.  It seems that in order to eliminate 
extra sockets and threads we're forced to do at least one buffer copy.  Am I 
missing something?
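
To make the trade-off concrete, here is a minimal Java sketch, not Hadoop's 
actual data path.  The class and method names (TransferSketch, 
sendWithTransferTo, sendInChunks) and the chunk size are hypothetical; only 
the FileChannel calls are real JDK API.  The first method holds the target 
channel until the whole file is sent.  The second pays one buffer copy per 
chunk, but each chunk is a bounded unit that could, under this assumption, be 
framed as one message on a shared connection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Hadoop.
final class TransferSketch {

    // No user-space buffer copy (when the target is a socket or file channel),
    // but the calling thread and the target channel are tied up until every
    // byte of the file has been handed off.
    static long sendWithTransferTo(FileChannel file, WritableByteChannel out)
            throws IOException {
        long pos = 0;
        long size = file.size();
        while (pos < size) {
            pos += file.transferTo(pos, size - pos, out);
        }
        return pos;
    }

    // One buffer copy per chunk.  Between chunks the channel is free, so other
    // traffic could be interleaved on the same connection.
    static long sendInChunks(FileChannel file, WritableByteChannel out, int chunkSize)
            throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(chunkSize);
        long pos = 0;
        int n;
        while ((n = file.read(buf, pos)) > 0) {
            buf.flip();
            while (buf.hasRemaining()) {
                out.write(buf);
            }
            buf.clear();
            pos += n;
        }
        return pos;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("transfer-sketch", ".bin");
        try {
            Files.write(tmp, new byte[64 * 1024]);
            try (FileChannel fc = FileChannel.open(tmp, StandardOpenOption.READ)) {
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                long sent = sendInChunks(fc, Channels.newChannel(sink), 4096);
                System.out.println("bytes sent in chunks: " + sent);
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Both methods deliver the same bytes; the difference is only in how long the 
target channel is held and whether a copy through a ByteBuffer is paid.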

Devaraj> We should experiment with this feature for MapReduce shuffle as well 
(maybe as a follow-up jira). 

+1  That's indeed the other non-RPC-based protocol that would be great to 
replace with RPC.


> RPC support for large data transfers.
> -------------------------------------
>
>                 Key: HADOOP-4386
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4386
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, ipc
>            Reporter: Raghu Angadi
>
> Currently HDFS has a socket level protocol for serving HDFS data to clients. 
> Clients do not use RPCs to read or write data. Fundamentally there is no 
> reason why this data transfer cannot use RPCs.
> This jira is a placeholder for porting Datanode transfers to RPC. This 
> topic has been discussed in varying detail many times, the latest being in 
> the context of HADOOP-3856. There are quite a few issues to be resolved both 
> at API level and at implementation level. 
> We should probably copy some of the comments from HADOOP-3856 to here.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
