Jim Brennan created HADOOP-15548:
------------------------------------

             Summary: Randomize local dirs
                 Key: HADOOP-15548
                 URL: https://issues.apache.org/jira/browse/HADOOP-15548
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Jim Brennan
            Assignee: Jim Brennan


Shuffle LOCAL_DIRS, LOG_DIRS and LOCAL_USER_DIRS when launching a container. Some 
applications process these lists in exactly the same way in every container 
(e.g. round-robin), which can overload some disks unnecessarily (e.g. every 
container writes its one output file to the first entry in the environment 
variable).
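
As a rough illustration of the shuffle described above, here is a minimal 
sketch. It assumes the variable holds a comma-separated list of directories; 
the class name, method name and example paths are hypothetical and are not 
taken from the Hadoop code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Hadoop: shuffles a comma-separated dir list.
class LocalDirShuffler {

    /** Returns the same entries as dirList, joined by commas, in a random order. */
    static String shuffleDirs(String dirList, Random random) {
        // Copy into a mutable list; Arrays.asList alone is a fixed-size view.
        List<String> dirs = new ArrayList<>(Arrays.asList(dirList.split(",")));
        Collections.shuffle(dirs, random);
        return String.join(",", dirs);
    }

    public static void main(String[] args) {
        // Example value only; real values come from the container launch environment.
        String localDirs = "/disk1/nm-local,/disk2/nm-local,/disk3/nm-local";
        System.out.println(shuffleDirs(localDirs, new Random()));
    }
}
```

With a fresh order per container, an application that always writes to the 
first entry no longer lands on the same disk in every container.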

There are two paths for local dir allocation, depending on whether the 
requested size is unknown or known.  The unknown path already uses a random 
algorithm.  The known path initializes with a random starting point and then 
goes round-robin after that.  When selecting a dir, it increments the last-used 
index by one and then checks sequentially until it finds a dir that satisfies 
the request.  The proposal is to increment by a random value between 1 and 
num_dirs - 1 instead, and then check sequentially from there.  This should 
result in a more random selection in all cases.
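
The proposed selection for the known-size path could be sketched roughly as 
follows. This is an illustration under stated assumptions, not the actual 
allocator code: the class, its fields and the free-space array are 
hypothetical stand-ins, and "satisfies the request" is modelled simply as 
having at least the requested number of bytes free.

```java
import java.util.Random;

// Hypothetical model of the known-size path, not the actual Hadoop class.
class RandomStrideDirPicker {
    private final long[] freeBytes; // assumed free space per dir, in bytes
    private final Random random;
    private int lastUsed;           // index of the dir chosen most recently

    RandomStrideDirPicker(long[] freeBytes, Random random) {
        if (freeBytes.length == 0) {
            throw new IllegalArgumentException("at least one dir is required");
        }
        this.freeBytes = freeBytes.clone();
        this.random = random;
        // Random starting point, as the current known-size path already does.
        this.lastUsed = random.nextInt(freeBytes.length);
    }

    /** Returns the index of a dir with at least size bytes free, or -1 if none. */
    int pick(long size) {
        int n = freeBytes.length;
        // Random increment in [1, n - 1]; with a single dir there is nothing to skip.
        int stride = n > 1 ? 1 + random.nextInt(n - 1) : 0;
        int start = (lastUsed + stride) % n;
        // Check sequentially from the new starting point, wrapping around once.
        for (int i = 0; i < n; i++) {
            int candidate = (start + i) % n;
            if (freeBytes[candidate] >= size) {
                lastUsed = candidate;
                return candidate;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        RandomStrideDirPicker picker =
            new RandomStrideDirPicker(new long[] {1000, 1000, 1000, 1000}, new Random());
        for (int i = 0; i < 5; i++) {
            System.out.println("picked dir " + picker.pick(100));
        }
    }
}
```

Because the increment is never 0 or num_dirs, the search never starts again at 
the dir that was used last, while every other dir remains a possible starting 
point.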



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

