[jira] Commented: (HDFS-516) Low Latency distributed reads

Todd Lipcon (JIRA) Wed, 16 Sep 2009 15:36:24 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756264#action_12756264
 ]


Todd Lipcon commented on HDFS-516:
----------------------------------

bq. Unless you want to add checksums for better comparison, I don't think it is 
every essential.

I disagree here - checksums are responsible for a reasonable amount of the 
overhead in the current HDFS implementation. If this is mostly seen as a 
testing ground for performance improvements, we can't be comparing apples to 
oranges. If we don't want to implement checksums in RadFs, then the other 
option for fair comparison is to remove checksums from HDFS.
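
To make the per-read cost concrete, here is a minimal sketch of chunked checksumming. It assumes one CRC32 per 512 bytes of data, which is an illustrative setting and not something taken from the HDFS-516 patch. The class and method names (ChunkChecksums, compute, firstCorruptChunk) are hypothetical and are not the HDFS or RadFs API.

```java
import java.util.zip.CRC32;

// Hypothetical helper for illustration only; not code from HDFS or RadFs.
final class ChunkChecksums {
    // Assumed chunk size: one CRC32 per 512 bytes of data.
    static final int BYTES_PER_CHECKSUM = 512;

    // Computes one CRC32 value per chunk; the last chunk may be shorter.
    static long[] compute(byte[] data) {
        int chunks = (data.length + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int off = i * BYTES_PER_CHECKSUM;
            int len = Math.min(BYTES_PER_CHECKSUM, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Returns the index of the first chunk whose checksum differs, or -1 if all match.
    static int firstCorruptChunk(byte[] data, long[] expected) {
        long[] actual = compute(data);
        if (actual.length != expected.length) {
            throw new IllegalArgumentException("checksum count does not match data length");
        }
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != expected[i]) {
                return i;
            }
        }
        return -1;
    }
}
```

A reader that verifies data pays for one pass like compute() over every byte it returns, so a client that skips this step is doing strictly less work per read than one that performs it.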

If this is about a testing ground for new features, then that makes sense, but 
I understood this mostly as a "turbo HDFS client experimentation".

> Low Latency distributed reads
> -----------------------------
>
>                 Key: HDFS-516
>                 URL: https://issues.apache.org/jira/browse/HDFS-516
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jay Booth
>            Priority: Minor
>         Attachments: hdfs-516-20090912.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I created a method for low latency random reads using NIO on the server side 
> and simulated OS paging with LRU caching and lookahead on the client side.  
> Some applications could include Lucene searching (term->doc and doc->offset 
> mappings are likely to be in local cache, thus much faster than Nutch's 
> current FsDirectory impl) and binary search through record files (bytes at 
> 1/2, 1/4, 1/8 marks are likely to be cached).
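
The client-side idea in the issue description quoted above (LRU caching of pages plus lookahead) can be sketched roughly as follows. This is not the code in the attached patch: LookaheadPageCache, PageSource, and the capacity and lookahead parameters are all hypothetical names chosen for illustration, and the sketch fetches lookahead pages one at a time where a real client would more likely batch them into a single request.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an LRU page cache with lookahead; not the patch's implementation.
final class LookaheadPageCache {
    // Assumed abstraction over the remote read: returns the bytes of one page.
    interface PageSource {
        byte[] fetch(long pageIndex);
    }

    private final PageSource source;
    private final long pageCount;
    private final int lookahead;
    private final Map<Long, byte[]> pages;

    LookaheadPageCache(PageSource source, long pageCount, final int capacity, int lookahead) {
        this.source = source;
        this.pageCount = pageCount;
        this.lookahead = lookahead;
        // Access-ordered LinkedHashMap: the eldest entry is the least recently used.
        this.pages = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns the requested page, fetching it and the following pages on a miss.
    byte[] page(long index) {
        byte[] hit = pages.get(index); // get() also marks the page as recently used
        if (hit != null) {
            return hit;
        }
        byte[] wanted = source.fetch(index);
        pages.put(index, wanted);
        // Lookahead: pull in the next few pages, stopping at the end of the file.
        for (long i = index + 1; i <= index + lookahead && i < pageCount; i++) {
            if (!pages.containsKey(i)) {
                pages.put(i, source.fetch(i));
            }
        }
        return wanted;
    }

    // containsKey() does not change the LRU order, so this is safe for inspection.
    boolean isCached(long index) {
        return pages.containsKey(index);
    }
}
```

With a cache like this, a binary search over a record file keeps revisiting the pages around the 1/2, 1/4, and 1/8 marks, so those pages stay near the recently-used end and are served locally on later searches.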

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
