>  The others you will have to read more conventionally

True.  I think there are emergent use cases that demand data locality,
e.g., an optimized HBase system, search, and MMap'ing.
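As a side note, the chunk-wise mmap'ing that Lucene's MMapDirectory does (and that comes up again below) is plain Java NIO: map the file in fixed-size pieces rather than one giant mapping. A minimal sketch, independent of HDFS; the class and method names here are illustrative, not from any of the projects discussed:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;

public class ChunkedMmap {
    // Map a local file read-only in fixed-size chunks, in the spirit of
    // Lucene's MMapDirectory, which splits mappings to work around
    // per-mapping size limits in the JVM.
    static List<MappedByteBuffer> mapInChunks(String path, long chunkSize)
            throws IOException {
        List<MappedByteBuffer> chunks = new ArrayList<MappedByteBuffer>();
        RandomAccessFile raf = new RandomAccessFile(path, "r");
        try {
            FileChannel ch = raf.getChannel();
            long size = ch.size();
            for (long off = 0; off < size; off += chunkSize) {
                long len = Math.min(chunkSize, size - off);
                chunks.add(ch.map(FileChannel.MapMode.READ_ONLY, off, len));
            }
        } finally {
            raf.close();
        }
        return chunks;
    }
}
```

Reading then means walking the list of buffers and translating a logical offset to (chunk index, offset within chunk) — that's the "piece them together" part.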

> If all blocks are guaranteed local, this would work.  I don't think that 
> guarantee is possible
> on a non-trivial cluster

Interesting.  I'm not familiar with how blocks become local, but I'm
interested in how to make that happen via an explicit call.  E.g., is
there an option available that guarantees locality, and if not, is
there work being done in that direction?
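While I don't know of a way to force locality, it is at least possible to ask HDFS which blocks of a file are hosted where, via FileSystem.getFileBlockLocations, and compare against the local hostname. A rough sketch (untested, and the helper name is mine, not a Hadoop API):

```java
import java.net.InetAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocalityCheck {
    // Report how many of a file's blocks have a replica on this host.
    // If the count equals the total, a fully local (mmap-style) read
    // would at least be conceivable; otherwise the remainder has to be
    // read over the wire.
    static int countLocalBlocks(FileSystem fs, Path file) throws Exception {
        String localHost = InetAddress.getLocalHost().getHostName();
        FileStatus status = fs.getFileStatus(file);
        BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
        int local = 0;
        for (BlockLocation block : blocks) {
            for (String host : block.getHosts()) {
                if (host.equals(localHost)) {
                    local++;
                    break;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path(args[0]);
        System.out.println(countLocalBlocks(fs, file) + " local blocks");
    }
}
```

That only observes placement, though; it doesn't influence it, which I suppose is exactly the guarantee Ted says isn't available.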

On Tue, Apr 12, 2011 at 8:08 AM, Ted Dunning <tdunn...@maprtech.com> wrote:
> Well, no.
> You could mmap all the blocks that are local to the node your program is on.
>  The others you will have to read more conventionally.  If all blocks are
> guaranteed local, this would work.  I don't think that guarantee is possible
> on a non-trivial cluster.
>
> On Tue, Apr 12, 2011 at 6:32 AM, Jason Rutherglen
> <jason.rutherg...@gmail.com> wrote:
>>
>> Then one could MMap the blocks pertaining to the HDFS file and piece
>> them together.  Lucene's MMapDirectory implementation does just this
>> to avoid an obscure JVM bug.
>>
>> On Mon, Apr 11, 2011 at 9:09 PM, Ted Dunning <tdunn...@maprtech.com>
>> wrote:
>> > Yes.  But only one such block. That is what I meant by chunk.
>> > That is fine if you want that chunk but if you want to mmap the entire
>> > file,
>> > it isn't real useful.
>> >
>> > On Mon, Apr 11, 2011 at 6:48 PM, Jason Rutherglen
>> > <jason.rutherg...@gmail.com> wrote:
>> >>
>> >> What do you mean by local chunk?  I think it's providing access to the
>> >> underlying file block?
>> >>
>> >> On Mon, Apr 11, 2011 at 6:30 PM, Ted Dunning <tdunn...@maprtech.com>
>> >> wrote:
>> >> > Also, it only provides access to a local chunk of a file which isn't
>> >> > very
>> >> > useful.
>> >> >
>> >> > On Mon, Apr 11, 2011 at 5:32 PM, Edward Capriolo
>> >> > <edlinuxg...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> On Mon, Apr 11, 2011 at 7:05 PM, Jason Rutherglen
>> >> >> <jason.rutherg...@gmail.com> wrote:
>> >> >> > Yes you can however it will require customization of HDFS.  Take a
>> >> >> > look at HDFS-347 specifically the HDFS-347-branch-20-append.txt
>> >> >> > patch.
>> >> >> >  I have been altering it for use with HBASE-3529.  Note that the
>> >> >> > patch
>> >> >> > noted is for the -append branch which is mainly for HBase.
>> >> >> >
>> >> >> > On Mon, Apr 11, 2011 at 3:57 PM, Benson Margulies
>> >> >> > <bimargul...@gmail.com> wrote:
>> >> >> >> We have some very large files that we access via memory mapping
>> >> >> >> in
>> >> >> >> Java. Someone's asked us about how to make this conveniently
>> >> >> >> deployable in Hadoop. If we tell them to put the files into hdfs,
>> >> >> >> can
>> >> >> >> we obtain a File for the underlying file on any given node?
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >> This feature is not yet part of Hadoop, so doing this is not
>> >> >> "convenient".
>> >> >
>> >> >
>> >
>> >
>
>
