[ 
https://issues.apache.org/jira/browse/HDFS-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755627#action_12755627
 ] 

Raghu Angadi commented on HDFS-516:
-----------------------------------


bq. somehow, from 213 seconds to 112 seconds to stream 1GB from a remote HDFS 
file.

This is roughly 5MBps for HDFS and 9MBps for RadFS. Assuming 9MBps is close to 
the 100Mbps network limit (is it?), 5MBps is too low for any FS. Since both 
reads are from the same physical files, this may not be hardware related. Could 
you check what is causing this delay? It might be affecting other benchmarks as 
well. Checking netstat on the client while this read is going on might help.
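
For reference, the arithmetic behind those figures can be sketched as below. 
This is only a back-of-the-envelope check: it assumes "1GB" means 2^30 bytes 
and "MBps" means 2^20 bytes per second, neither of which is stated in the 
quoted benchmark, and the names are made up for illustration.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3  # assumption: "1GB" in the benchmark means 2^30 bytes


def throughput_mib_per_s(num_bytes, seconds):
    """Average throughput in MiB/s for num_bytes moved in `seconds`."""
    return num_bytes / seconds / MIB


hdfs_rate = throughput_mib_per_s(GIB, 213)   # about 4.8
radfs_rate = throughput_mib_per_s(GIB, 112)  # about 9.1

# Raw 100 Mbps line rate, before any TCP/IP or framing overhead.
line_rate = 100e6 / 8 / MIB                  # about 11.9

print(f"HDFS  {hdfs_rate:.1f} MiB/s")
print(f"RadFS {radfs_rate:.1f} MiB/s")
print(f"100 Mbps link, raw: {line_rate:.1f} MiB/s")
```

Under these assumptions the RadFS number is below the raw 100Mbps rate but in 
the same neighborhood; whether the link is really the limit is still open.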

Regarding reads in RadFS, does the client fetch 32KB each time (a single RPC), 
or does it pipeline (multiple requests for a single client's stream)?
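
To show why the answer matters, here is a rough cost model of the two cases. 
It is a sketch, not a description of what RadFS does: the 1 ms round trip and 
the 100Mbps bandwidth are assumed values, the function names are hypothetical, 
and server-side and disk time are ignored.

```python
def chunk_count(total_bytes, chunk_bytes):
    """Number of requests needed to cover total_bytes (ceiling division)."""
    return -(-total_bytes // chunk_bytes)


def stop_and_wait_seconds(total_bytes, chunk_bytes, rtt_s, bandwidth_bps):
    """One request at a time: every chunk pays a full round trip."""
    transfer = total_bytes * 8 / bandwidth_bps
    return chunk_count(total_bytes, chunk_bytes) * rtt_s + transfer


def pipelined_seconds(total_bytes, rtt_s, bandwidth_bps):
    """Requests kept in flight: only the first round trip is exposed."""
    return rtt_s + total_bytes * 8 / bandwidth_bps


total = 1024 ** 3      # assumption: 1GB read, taken as 2^30 bytes
chunk = 32 * 1024      # 32KB per request, as in the question above
rtt = 0.001            # assumption: 1 ms round trip
bandwidth = 100e6      # assumption: 100 Mbps link

print(stop_and_wait_seconds(total, chunk, rtt, bandwidth))
print(pipelined_seconds(total, rtt, bandwidth))
```

With 32768 requests, every millisecond of round-trip time adds about 33 
seconds to the unpipelined case in this model.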

@Todd, I essentially see this as a POC of what could/should be improved in HDFS 
to address latency issues. Contrib makes sense, but I would not expect this to 
go to production in this form, and it should be marked 'Experimental'. The 
benchmarks also help greatly in setting priorities for features. I don't think 
this needs a branch since it does not touch core at all.

> Low Latency distributed reads
> -----------------------------
>
>                 Key: HDFS-516
>                 URL: https://issues.apache.org/jira/browse/HDFS-516
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jay Booth
>            Priority: Minor
>         Attachments: hdfs-516-20090912.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I created a method for low latency random reads using NIO on the server side 
> and simulated OS paging with LRU caching and lookahead on the client side.  
> Some applications could include Lucene searching (term->doc and doc->offset 
> mappings are likely to be in local cache, thus much faster than Nutch's 
> current FsDirectory impl) and binary search through record files (bytes at 
> 1/2, 1/4, 1/8 marks are likely to be cached).
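
The client-side idea in the description (LRU caching plus lookahead) can be 
sketched roughly as follows. This is an illustration only and is not taken 
from the attached patch: the class name, the `fetch_page` callback, and the 
default sizes are all hypothetical, and the lookahead here is synchronous for 
simplicity.

```python
from collections import OrderedDict


class LruPageCache:
    """Client-side page cache: LRU eviction plus sequential lookahead."""

    def __init__(self, fetch_page, page_size=32 * 1024, capacity=64, lookahead=1):
        self.fetch_page = fetch_page   # hypothetical: page index -> bytes
        self.page_size = page_size
        self.capacity = capacity       # max pages held in memory
        self.lookahead = lookahead     # pages to prefetch past each read
        self.pages = OrderedDict()     # page index -> bytes, oldest first

    def _load(self, index):
        if index in self.pages:
            self.pages.move_to_end(index)      # mark as most recently used
            return self.pages[index]
        data = self.fetch_page(index)          # cache miss: go to the server
        self.pages[index] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)     # evict least recently used
        return data

    def read(self, offset, length):
        out = bytearray()
        end = offset + length
        last_index = None
        while offset < end:
            index, start = divmod(offset, self.page_size)
            chunk = self._load(index)[start:start + (end - offset)]
            if not chunk:
                break                          # ran past end of file
            out += chunk
            offset += len(chunk)
            last_index = index
        if last_index is not None:
            for step in range(1, self.lookahead + 1):
                self._load(last_index + step)  # synchronous prefetch
        return bytes(out)
```

A binary search over a record file repeatedly touches the pages around the 
1/2, 1/4 and 1/8 offsets, so with a cache like this those pages would tend to 
stay resident between searches.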

