Shi, you can certainly use an in-memory store like memcached, but it's a
lot of work to set that up just to avoid two I/O operations.
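
If the lookup happens only once or twice per job, plainly scanning the
reduce output on HDFS is simpler. A minimal sketch, assuming the output is
a text part file written by TextOutputFormat (tab-separated key/value) and
with placeholder class and method names:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RecordLookup {
        // Scan one reduce output file for a key; return its double value,
        // or null if the key is not present.
        public static Double lookup(String key, Path reduceFile) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            BufferedReader in =
                new BufferedReader(new InputStreamReader(fs.open(reduceFile)));
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    // TextOutputFormat writes "key<TAB>value" lines by default.
                    String[] parts = line.split("\t", 2);
                    if (parts.length == 2 && parts[0].equals(key)) {
                        return Double.parseDouble(parts[1]);
                    }
                }
            } finally {
                in.close();
            }
            return null;
        }
    }

If you expect many lookups against that file, writing the reduce output as
a MapFile (via MapFileOutputFormat) would give you indexed random access
without standing up any extra infrastructure.
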
HTH
Cliff

On Wed, Sep 22, 2010 at 5:06 PM, Shi Yu <sh...@uchicago.edu> wrote:

> Dear Hadoopers,
>
> I am stuck on a probably very simple problem but can't figure it out. In
> the Hadoop Map/Reduce framework, I want to search a huge file (generated
> by another Reduce task) for a single unique record (actually a <String,
> double> pair). That record then needs to be passed to another function. I
> have read the previous post about Mapper-only output to HBase (
> http://www.mail-archive.com/hbase-u...@hadoop.apache.org/msg06579.html)
> and another post (
> http://www.mail-archive.com/hbase-u...@hadoop.apache.org/msg07337.html).
> They are both very interesting; however, I am still confused about how to
> avoid writing to HBase and instead use the returned record directly from
> memory. I believe my problem doesn't need a reducer, so the idea is to
> load-balance the search across multiple Mappers. I want something like
> this:
>
>   class MyClass
>       method seekResultByMapper(String toSearch, Path reduceFile)
>           call Map(a, b)
>           do some simple calculation
>           return <String, double> result
>
>   class AnotherClass
>       <String, double> para = MyClass.seekResultByMapper(c, d)
>
>
> I don't know whether this is doable (maybe it is not a valid pattern in
> the Map/Reduce framework). How would I implement it using the Java API?
> Thanks in advance for any suggestions.
>
>
> Best Regards,
>
> Shi
>
> --
> Postdoctoral Scholar
> Institute for Genomics and Systems Biology
> Department of Medicine, the University of Chicago
> Knapp Center for Biomedical Discovery
> 900 E. 57th St. Room 10148
> Chicago, IL 60637, US
> Tel: 773-702-6799
>
>
