> I read the following wording to mean "can I mmap
> files that are in hdfs"
> On Mon, Apr 11, 2011 at 3:57 PM, Benson Margulies
> wrote:
>>
>> We have some very large files that we access via memory mapping in
>> Java. Someone's asked us about how to make this conveniently
>> deployable in Hadoop.
Guys, I'm not the one who said 'HDFS' unless I had a brain bubble in
my original message. I asked for a distribution mechanism for
code+mappable data. I appreciate the arrival of some suggestions.
Ted is correct that I know quite a bit about mmap; I had a lot to do
with the mmap code in ObjectStore.
Here's the OP again.
I want to make it clear that my question here has to do with the
problem of distributing 'the program' around the cluster, not 'the
data'. In the case at hand, the issue is a system that needs a large
data resource to do its work. Every instance of the code needs that
data.
We have some very large files that we access via memory mapping in
Java. Someone's asked us about how to make this conveniently
deployable in Hadoop. If we tell them to put the files into hdfs, can
we obtain a File for the underlying file on any given node?
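For context on why that question matters: Java's memory mapping goes
through FileChannel.map, which needs a real local file, and an HDFS
path does not directly give you one. A minimal sketch of the
local-file case (class name, file name, and contents are made up for
illustration):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapDemo {
    // Map a local file read-only and read a byte directly from the
    // mapping, without copying the file contents onto the Java heap.
    static byte firstByte(Path localFile) throws IOException {
        try (FileChannel ch = FileChannel.open(localFile, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return buf.get(0);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the large resource file; in the real scenario this
        // would have to be a genuinely local path on the worker node.
        Path p = Files.createTempFile("resource", ".bin");
        Files.write(p, new byte[]{42, 1, 2});
        System.out.println(firstByte(p)); // prints 42
        Files.delete(p);
    }
}
```

On the Hadoop side, the usual mechanism for getting a local copy of a
file onto each node is the DistributedCache, which localizes cached
files into the task's working area; whether that is practical for very
large resources is a separate question.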
We have fairly good evidence that, as of 0.20.2, hadoop does not set
the thread context class loader to the class loader that includes all
the .jar files from the lib subdirectory of a job jar.
Code we wrote (which is sitting in the 'main' part of the job jar)
calls a class in Mahout (which is sitting in one of the jars in that
lib subdirectory).
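If that diagnosis is right, one workaround is to install the class
loader that can see the lib/ jars as the thread context class loader
around the call into the library code, and restore the previous
loader afterwards. A sketch, not Hadoop's API; the helper name and
the choice of loader are illustrative:

```java
import java.util.concurrent.Callable;

public class ContextLoaderFix {
    // Run body with cl installed as the thread context class loader,
    // restoring the previous loader no matter how body exits.
    static <T> T withContextLoader(ClassLoader cl, Callable<T> body) throws Exception {
        Thread t = Thread.currentThread();
        ClassLoader saved = t.getContextClassLoader();
        t.setContextClassLoader(cl);
        try {
            return body.call();
        } finally {
            t.setContextClassLoader(saved);
        }
    }

    public static void main(String[] args) throws Exception {
        // In the real scenario this would be the loader that includes the
        // job jar's lib/ classes; here this class's own loader stands in.
        final ClassLoader jobLoader = ContextLoaderFix.class.getClassLoader();
        String result = withContextLoader(jobLoader, new Callable<String>() {
            public String call() {
                // Inside the body, the context loader is the one we installed.
                return Thread.currentThread().getContextClassLoader() == jobLoader
                        ? "ok" : "wrong loader";
            }
        });
        System.out.println(result); // prints "ok"
    }
}
```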