> ----- Original Message -----
> From: "Andy Liu"
> Newsgroups: gmane.comp.jakarta.lucene.hadoop.user
> To:
> Sent: Tuesday, July 28, 2009 7:53 AM
> Subject: MapFile performance
I've wondered about the possibility of adding HDFS as a back-end to an
existing key-value store, like EHCache or Sleepycat. There are several
such projects that have excellent engineering and address problems such
as this. There are advantages to incorporating them, rather than
re-writing them.
I have a bunch of Map/Reduce jobs that process documents and write the
results out to a few MapFiles. These MapFiles are subsequently searched in
an interactive application.
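
For readers less familiar with the format: as far as I understand it, a
MapFile is a sorted data file plus a sparse index that records only every
Nth key (io.map.index.interval, 128 by default). A lookup binary-searches
the index and then scans forward through the data file. The sketch below is
a toy in-memory model of that two-step lookup, not Hadoop's API; the class
ToyMapFile and its method names are made up for illustration.

```python
import bisect

INDEX_INTERVAL = 128  # mirrors Hadoop's default io.map.index.interval


class ToyMapFile:
    """Hypothetical in-memory stand-in for a MapFile (not the Hadoop class)."""

    def __init__(self, items, index_interval=INDEX_INTERVAL):
        # Records are kept sorted by key, as a MapFile requires.
        self.records = sorted(items)
        # The sparse index holds only every index_interval-th key and its position.
        self.index_pos = list(range(0, len(self.records), index_interval))
        self.index_keys = [self.records[p][0] for p in self.index_pos]

    def get(self, key):
        """Return (value, records_scanned); value is None if the key is absent."""
        # Step 1: binary search of the index for the last indexed key <= key.
        i = bisect.bisect_right(self.index_keys, key) - 1
        if i < 0:
            return None, 0  # key sorts before the first record
        # Step 2: sequential scan of the data, starting at the indexed position.
        scanned = 0
        for pos in range(self.index_pos[i], len(self.records)):
            k, v = self.records[pos]
            scanned += 1
            if k == key:
                return v, scanned
            if k > key:
                break  # passed the place where the key would have been
        return None, scanned


if __name__ == "__main__":
    mf = ToyMapFile((f"doc{i:05d}", f"value-{i}") for i in range(1000))
    print(mf.get("doc00128"))  # key is in the index: one record read
    print(mf.get("doc00127"))  # worst case: a full interval of records read
```

In this model the scan in step 2 touches up to index_interval records per
lookup, and in a real data file every record passed over includes its value.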
One problem I'm running into is that if the values in the MapFile data file
are fairly large, lookup can be slow. This is