Re: MapFile performance

2009-08-03 Thread Tom White
--- From: "Andy Liu"; Newsgroups: gmane.comp.jakarta.lucene.hadoop.user; Sent: Tuesday, July 28, 2009 7:53 AM; Subject: MapFile performance > I have a bunch of Map/Reduce jobs that process documents and write the results out to a

Re: MapFile performance

2009-08-02 Thread Billy Pearson
- Original Message - From: "Andy Liu"; Newsgroups: gmane.comp.jakarta.lucene.hadoop.user; Sent: Tuesday, July 28, 2009 7:53 AM; Subject: MapFile performance. I have a bunch of Map/Reduce jobs that process documents and write the results out to a few MapFiles. These MapFiles are su

Re: MapFile performance

2009-07-30 Thread David B. Ritch
I've wondered about the possibility of adding HDFS as a back end to an existing key-value store, like EHCache or Sleepycat. There are several such projects that have excellent engineering and address problems such as this one. There are advantages to incorporating them rather than rewriting them.

MapFile performance

2009-07-28 Thread Andy Liu
I have a bunch of Map/Reduce jobs that process documents and write the results out to a few MapFiles. These MapFiles are subsequently searched in an interactive application. One problem I'm running into is that if the values in the MapFile data file are fairly large, lookups can be slow. This is
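
For readers unfamiliar with why large values can hurt MapFile lookups, here is a rough sketch of the general idea. It is a toy in-memory model in Python, not Hadoop's API: a MapFile keeps a sorted data file plus a sparse index that records only every Nth key, so a lookup binary-searches the index and then scans forward through the data file. The names `make_records`, `build_index` and `lookup` are made up for illustration, and the interval of 128 is what I recall as the default for `io.map.index.interval`; treat both as assumptions.

```python
import bisect

# Assumed default sparse-index interval (every 128th key is indexed).
INDEX_INTERVAL = 128


def make_records(count, value_size):
    """Build sorted (key, value) pairs with fixed-size values (illustrative data)."""
    return [(f"{i:06d}", b"x" * value_size) for i in range(count)]


def build_index(records, interval=INDEX_INTERVAL):
    """Return the keys at positions 0, interval, 2*interval, ... (the sparse index)."""
    return [records[i][0] for i in range(0, len(records), interval)]


def lookup(records, index_keys, target, interval=INDEX_INTERVAL):
    """Find target; return (value or None, number of value bytes passed over)."""
    # Binary search for the last indexed key that is <= target.
    slot = bisect.bisect_right(index_keys, target) - 1
    if slot < 0:
        return None, 0  # target sorts before the first key
    scanned = 0
    # Sequential scan from the indexed position, as a reader would do on disk.
    for pos in range(slot * interval, len(records)):
        key, value = records[pos]
        if key == target:
            return value, scanned
        if key > target:
            break  # keys are sorted, so target is absent
        scanned += len(value)
    return None, scanned


if __name__ == "__main__":
    for size in (10, 10_000):
        records = make_records(1000, size)
        index_keys = build_index(records)
        _, scanned = lookup(records, index_keys, "000127")
        print(f"value size {size}: {scanned} bytes passed over")
```

In this model the worst-case key sits just before the next index entry, so the scan passes over 127 values; the cost of that scan grows in direct proportion to the value size. A smaller index interval would shorten the scan at the price of a larger in-memory index.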