Re: LocalDirAllocator and getLocalPathForWrite

Todd Lipcon Wed, 05 Jan 2011 12:19:27 -0800

Hi Marc,

LocalDirAllocator is an internal-facing API and you shouldn't be using it
from user code. If you write into mapred.local.dir like this, you will end
up with conflicts between different tasks running on the same node.


The working directory of your MR task is already on one of the drives,
and there usually isn't a good reason to write to multiple drives from
within a task. You should get parallelism by running multiple tasks at the
same time, not by having each task write to multiple places.
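
As a minimal sketch of that advice (plain Java, no Hadoop classes; the
class name TaskScratchFile and the method writeScratch are hypothetical,
made up for illustration): a relative path resolves against the process
working directory, which inside an MR task is the task's own directory, so
a scratch file written this way should stay out of the way of other tasks.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TaskScratchFile {
    // Hypothetical helper: writes `content` to a file named `name` in the
    // current working directory and returns the absolute path it landed at.
    static Path writeScratch(String name, String content) throws IOException {
        // A relative path is resolved against the process working directory.
        Path p = Paths.get(name).toAbsolutePath();
        Files.write(p, content.getBytes(StandardCharsets.UTF_8));
        return p;
    }

    public static void main(String[] args) throws IOException {
        Path p = writeScratch("scratch.tmp", "intermediate data");
        System.out.println("wrote " + p);
        Files.delete(p); // clean up the scratch file when done
    }
}
```

Run outside Hadoop, this just writes into whatever directory the JVM was
started from; the point is only that no drive selection is needed in user
code.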

Thanks
-Todd

On Wed, Jan 5, 2011 at 8:35 AM, Marc Sturlese <marc.sturl...@gmail.com> wrote:

>
> I have a question about how this works. The API documentation says that the
> class LocalDirAllocator is: "An implementation of a round-robin scheme for
> disk allocation for creating files".
> Is the disk allocation done in the constructor?
> Say I have a cluster of just 1 node with 4 disks, and inside a reducer
> I do:
> LocalDirAllocator localDirAlloc = new LocalDirAllocator("mapred.local.dir");
> // conf is the job's Configuration; getLocalPathForWrite takes it as an argument
> Path pathA = localDirAlloc.getLocalPathForWrite("a", conf);
> Path pathB = localDirAlloc.getLocalPathForWrite("b", conf);
>
> Are pathA and pathB guaranteed to be on the same local disk, because the
> disk was chosen by new LocalDirAllocator("mapred.local.dir")? Or does
> getLocalPathForWrite choose the disk on each call, so the two paths might
> end up on different disks (as I have 4 disks)?
>
> Thanks in advance
>



-- 
Todd Lipcon
Software Engineer, Cloudera
