Shi, you can certainly use one of the in-memory caching systems like memcached, but it's a lot of work to set that up just to avoid two I/O operations. HTH Cliff
On Wed, Sep 22, 2010 at 5:06 PM, Shi Yu <sh...@uchicago.edu> wrote:
> Dear Hadoopers,
>
> I am stuck at a probably very simple problem but can't figure it out. In
> the Hadoop Map/Reduce framework, I want to search a huge file (which is
> generated by another Reduce task) for a unique line of record (a <String,
> double> value, actually). That record is expected to be passed to another
> function. I have read the previous post about using Mapper-only output to
> HBase (
> http://www.mail-archive.com/hbase-u...@hadoop.apache.org/msg06579.html)
> and another post (
> http://www.mail-archive.com/hbase-u...@hadoop.apache.org/msg07337.html).
> They are both very interesting; however, I am still confused about how to
> avoid writing to HBase and instead use the returned record directly
> from memory. I guess my problem doesn't need a reducer, so basically I
> want to load-balance the search task across multiple Mappers. I want to
> have something like this:
>
> class myClass
>     method seekResultbyMapper (string toSearch, path reduceFile)
>         call Map(a,b)
>         do some simple calculation
>         return <String, double> result
>
> class anotherClass
>     <String, double> para = myClass.seekResultbyMapper (c,d)
>
> I don't know whether this is doable (maybe it is not a valid style in
> the Map/Reduce framework). How would I implement it using the Java API?
> Thanks for any suggestion in advance.
>
> Best Regards,
>
> Shi
>
> --
> Postdoctoral Scholar
> Institute for Genomics and Systems Biology
> Department of Medicine, the University of Chicago
> Knapp Center for Biomedical Discovery
> 900 E. 57th St. Room 10148
> Chicago, IL 60637, US
> Tel: 773-702-6799
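
For what it's worth, the search step Shi describes can be sketched without any extra infrastructure. The sketch below (plain Java, no Hadoop dependencies; the record format, key names, and `seek` method are my assumptions, not from Shi's code) scans tab-separated `<String, double>` records, as `TextOutputFormat` would typically write them, and returns the one matching record. In an actual map-only job, the same loop would live in `Mapper.map()`, with the match emitted via `context.write()` rather than returned to a caller.

```java
import java.util.AbstractMap;
import java.util.List;
import java.util.Map;

// Sketch of the "search a huge file for one <String, double> record" step.
// Records are assumed to be one "key<TAB>value" pair per line.
public class RecordSearch {

    // Scan the lines for `key`; return the matching <String, double> pair,
    // or null if the key is absent or a line is malformed.
    public static Map.Entry<String, Double> seek(List<String> lines, String key) {
        for (String line : lines) {
            String[] parts = line.split("\t", 2); // split into key and value
            if (parts.length == 2 && parts[0].equals(key)) {
                return new AbstractMap.SimpleEntry<>(
                        parts[0], Double.parseDouble(parts[1]));
            }
        }
        return null; // key not found
    }

    public static void main(String[] args) {
        // Hypothetical sample records standing in for the Reduce output file.
        List<String> data = List.of("geneA\t0.12", "geneB\t3.75", "geneC\t1.50");
        Map.Entry<String, Double> hit = seek(data, "geneB");
        System.out.println(hit.getKey() + " -> " + hit.getValue());
    }
}
```

Since the caller only needs one record back, the simplest route may be to skip MapReduce for the lookup entirely and stream the file once as above; a map-only job mainly pays off when the file is large enough that scanning its splits in parallel matters.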