> The others you will have to read more conventionally

True.  I think there are emergent use cases that demand data locality,
e.g., an optimized HBase system, search, and MMap'ing.

> If all blocks are guaranteed local, this would work.  I don't think that
> guarantee is possible on a non-trivial cluster

Interesting.  I'm not familiar with how blocks become local; however, I'm
interested in how to make that happen via an explicit call.  E.g., is there
an option available that guarantees locality, and if not, is there perhaps
work being done towards that path?

On Tue, Apr 12, 2011 at 8:08 AM, Ted Dunning <tdunn...@maprtech.com> wrote:
> Well, no.
> You could mmap all the blocks that are local to the node your program is on.
> The others you will have to read more conventionally.  If all blocks are
> guaranteed local, this would work.  I don't think that guarantee is possible
> on a non-trivial cluster.
>
> On Tue, Apr 12, 2011 at 6:32 AM, Jason Rutherglen
> <jason.rutherg...@gmail.com> wrote:
>>
>> Then one could MMap the blocks pertaining to the HDFS file and piece
>> them together.  Lucene's MMapDirectory implementation does just this
>> to avoid an obscure JVM bug.
>>
>> On Mon, Apr 11, 2011 at 9:09 PM, Ted Dunning <tdunn...@maprtech.com>
>> wrote:
>> > Yes.  But only one such block.  That is what I meant by chunk.
>> > That is fine if you want that chunk, but if you want to mmap the
>> > entire file, it isn't real useful.
>> >
>> > On Mon, Apr 11, 2011 at 6:48 PM, Jason Rutherglen
>> > <jason.rutherg...@gmail.com> wrote:
>> >>
>> >> What do you mean by local chunk?  I think it's providing access to
>> >> the underlying file block?
>> >>
>> >> On Mon, Apr 11, 2011 at 6:30 PM, Ted Dunning <tdunn...@maprtech.com>
>> >> wrote:
>> >> > Also, it only provides access to a local chunk of a file, which
>> >> > isn't very useful.
>> >> >
>> >> > On Mon, Apr 11, 2011 at 5:32 PM, Edward Capriolo
>> >> > <edlinuxg...@gmail.com> wrote:
>> >> >>
>> >> >> On Mon, Apr 11, 2011 at 7:05 PM, Jason Rutherglen
>> >> >> <jason.rutherg...@gmail.com> wrote:
>> >> >> > Yes, you can; however, it will require customization of HDFS.
>> >> >> > Take a look at HDFS-347, specifically the
>> >> >> > HDFS-347-branch-20-append.txt patch.  I have been altering it
>> >> >> > for use with HBASE-3529.  Note that the patch mentioned is for
>> >> >> > the -append branch, which is mainly for HBase.
>> >> >> >
>> >> >> > On Mon, Apr 11, 2011 at 3:57 PM, Benson Margulies
>> >> >> > <bimargul...@gmail.com> wrote:
>> >> >> >> We have some very large files that we access via memory
>> >> >> >> mapping in Java.  Someone's asked us about how to make this
>> >> >> >> conveniently deployable in Hadoop.  If we tell them to put the
>> >> >> >> files into hdfs, can we obtain a File for the underlying file
>> >> >> >> on any given node?
>> >> >> >
>> >> >> This feature is not yet part of Hadoop, so doing this is not
>> >> >> "convenient".
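The "mmap the blocks and piece them together" idea discussed above can be sketched in plain Java against a local file. This is only an illustration of the chunked-mapping technique (similar in spirit to what Lucene's MMapDirectory does, which maps a file as multiple buffers rather than one); the class name and chunk size are made up for the example, and nothing here is an HDFS or Lucene API.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Illustrative sketch: map one file as a series of fixed-size chunks and
// present a single logical address space across them.  A real HDFS-backed
// version would map each local block file instead of slices of one file.
public class ChunkedMmapReader {
    private final MappedByteBuffer[] chunks;
    private final long chunkSize;

    public ChunkedMmapReader(String path, long chunkSize) throws IOException {
        this.chunkSize = chunkSize;
        try (RandomAccessFile raf = new RandomAccessFile(path, "r");
             FileChannel ch = raf.getChannel()) {
            long length = ch.size();
            int n = (int) ((length + chunkSize - 1) / chunkSize);
            chunks = new MappedByteBuffer[n];
            for (int i = 0; i < n; i++) {
                long offset = i * chunkSize;
                long size = Math.min(chunkSize, length - offset);
                // Each mapping stays well under the int-indexed limit of
                // a single MappedByteBuffer.
                chunks[i] = ch.map(FileChannel.MapMode.READ_ONLY, offset, size);
            }
        }
    }

    /** Read one byte at an absolute file position, crossing chunk boundaries. */
    public byte get(long pos) {
        int chunk = (int) (pos / chunkSize);
        int offset = (int) (pos % chunkSize);
        return chunks[chunk].get(offset);
    }
}
```

As Ted points out, this only works for blocks that actually live on the local node; any remote block would have to be read conventionally through the HDFS client rather than mapped.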