[ https://issues.apache.org/jira/browse/HADOOP-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andras Bokor resolved HADOOP-2829.
----------------------------------
    Resolution: Invalid

It seems obsolete.

> JT should consider the disk each task is on before scheduling jobs...
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-2829
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2829
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: eric baldeschwieler
>            Priority: Major
>
> The DataNode can support a JBOD config, where blocks exist on explicit disks.
> But this information is not exported or considered by the JT when assigning
> tasks. This leads to non-optimal disk use. If 4 slots are used, 2 running
> tasks will likely be on the same disk, and we observe them running more slowly
> than other tasks on the same machine.
> We could follow a number of strategies to address this.
> For example: the data nodes could support a "what disk is this block on?" call.
> Then the JT could discover the info and assign jobs accordingly.
> Of course the TT itself uses disks for merge and temp space and the datanodes
> on the same machine can be used by off node sources, so it is not clear
> optimizing all of this is simple enough to be worth it.
> This issue deserves study.
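
For illustration only, the strategy proposed in the issue (export the block-to-disk mapping, then let the scheduler spread tasks across disks) might be sketched roughly as below. This is not Hadoop code. The function `pick_tasks`, the `block_to_disk` mapping, and the disk names are all hypothetical, and the sketch ignores the caveats the reporter raises (TT merge and temp space, off-node reads against the same DataNode).

```python
from collections import Counter


def pick_tasks(pending, block_to_disk, busy_disks, free_slots):
    """Greedily choose up to free_slots tasks, preferring the least-loaded disk.

    pending:       list of (task_id, block_id) pairs in queue order
    block_to_disk: dict block_id -> disk name; stands in for the hypothetical
                   "what disk is this block on?" call from the issue
    busy_disks:    iterable of disk names, one entry per already running task
    free_slots:    number of task slots available on this node
    """
    load = Counter(busy_disks)  # running tasks per disk; missing disks count as 0
    remaining = list(pending)
    chosen = []
    while remaining and len(chosen) < free_slots:
        # min() returns the first minimum, so ties are broken by queue order.
        task = min(remaining, key=lambda t: load[block_to_disk[t[1]]])
        remaining.remove(task)
        load[block_to_disk[task[1]]] += 1
        chosen.append(task[0])
    return chosen


if __name__ == "__main__":
    pending = [("t1", "b1"), ("t2", "b2"), ("t3", "b3"), ("t4", "b4")]
    block_to_disk = {"b1": "disk0", "b2": "disk0", "b3": "disk1", "b4": "disk1"}
    # Two free slots, nothing running: one task per disk instead of t1 and t2.
    print(pick_tasks(pending, block_to_disk, [], 2))
```

A plain FIFO assignment would start t1 and t2 on the same disk here; the sketch picks one task per disk instead, which is the contention the issue describes wanting to avoid.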