[ 
https://issues.apache.org/jira/browse/HDFS-516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Booth updated HDFS-516:
---------------------------

    Attachment: hdfs-516-20090824.patch

Changed the caching system to Ehcache, which gives us a configurable API with 
better caching options.  Fixed all known correctness bugs and added a lot of 
test coverage.

Got SequenceFileSearcher working, but it depends on HADOOP-6196 -- I could fold 
the searcher code and searcher test into HADOOP-6196 if people are interested 
in the functionality.

Added benchmarking utilities: .sh scripts intended to be run from HADOOP_HOME/bin.

Note: if anyone installs this, run ant bin-package from the project root before 
trying to run the radfs unit tests.

TODO:
- benchmark
- javadoc
- pretty up codebase
- change unit tests to JUnit 4 style
- add filesystem contract unit test

I'll get a fork going on GitHub soon if anyone's interested in trying it out.

> Low Latency distributed reads
> -----------------------------
>
>                 Key: HDFS-516
>                 URL: https://issues.apache.org/jira/browse/HDFS-516
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jay Booth
>            Priority: Minor
>         Attachments: hdfs-516-20090824.patch, radfs.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I created a method for low-latency random reads using NIO on the server side 
> and simulated OS paging with LRU caching and lookahead on the client side.  
> Some applications could include Lucene searching (term->doc and doc->offset 
> mappings are likely to be in local cache, thus much faster than Nutch's 
> current FsDirectory impl) and binary search through record files (bytes at 
> the 1/2, 1/4, 1/8 marks are likely to be cached).
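
To make the client-side idea in the description concrete, here is a minimal sketch of an LRU page cache with lookahead. It is not the code from the patch: the class name LruPageCache, the PageSource interface, and all parameters are hypothetical, and the remote read is reduced to a callback. It uses java.util.LinkedHashMap in access order rather than Ehcache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical client-side page cache: LRU eviction plus simple lookahead. */
class LruPageCache {
    /** Stand-in for the remote read (assumed; the real patch talks to a server). */
    interface PageSource {
        byte[] readPage(long pageIndex);
    }

    private final int pageSize;
    private final int lookahead;
    private final PageSource source;
    private final Map<Long, byte[]> pages;
    private long remoteReads = 0;

    LruPageCache(int pageSize, final int maxPages, int lookahead, PageSource source) {
        this.pageSize = pageSize;
        this.lookahead = lookahead;
        this.source = source;
        // accessOrder=true keeps entries ordered from least to most recently used,
        // and removeEldestEntry evicts the least recently used page when full.
        this.pages = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > maxPages;
            }
        };
    }

    /** Returns the cached page, fetching it from the source on a miss. */
    private byte[] load(long index) {
        byte[] page = pages.get(index);
        if (page == null) {
            page = source.readPage(index);
            remoteReads++;
            pages.put(index, page);
        }
        return page;
    }

    /** Reads one byte at an absolute file offset. */
    byte readByte(long offset) {
        long index = offset / pageSize;
        // containsKey does not change the access order, so it is safe as a probe.
        if (!pages.containsKey(index)) {
            // On a miss, also pull in the following pages (lookahead).
            for (int i = 1; i <= lookahead; i++) {
                load(index + i);
            }
        }
        // Load the requested page last so it is the most recently used entry.
        return load(index)[(int) (offset % pageSize)];
    }

    long remoteReads() {
        return remoteReads;
    }
}
```

With a cache like this, repeated binary searches over the same file keep probing the same midpoint and quarter-point pages, so those pages stay near the most-recently-used end and are rarely evicted. The sketch ignores end-of-file handling, concurrency, and batching the lookahead pages into a single request.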

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
