ConfX created MAPREDUCE-7445:
--------------------------------

Summary: ShuffleSchedulerImpl causes ArithmeticException due to improper detailsInterval value checking
Key: MAPREDUCE-7445
URL: https://issues.apache.org/jira/browse/MAPREDUCE-7445
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: ConfX
Attachments: reproduce.sh

h2. What happened

There is no value checking for the parameter {{mapreduce.reduce.shuffle.maxfetchfailures}}. An invalid value can lead to improper calculations, such as a division by zero, that crash the system.

h2. Buggy code

In {{ShuffleSchedulerImpl.java}}, there is no value checking for {{maxFetchFailuresBeforeReporting}}, and the variable reaches the method {{checkAndInformMRAppMaster}} unchecked. When {{maxFetchFailuresBeforeReporting}} is mistakenly set to 0, the expression {{failures % maxFetchFailuresBeforeReporting}} divides by zero and throws an ArithmeticException that crashes the system. Because {{||}} short-circuits, the modulo is evaluated only when the conditions before it are false.

{noformat}
private void checkAndInformMRAppMaster(
    ...
  if (connectExcpt || (reportReadErrorImmediately && readError)
      || ((failures % maxFetchFailuresBeforeReporting) == 0) || hostFailed) {
    ...
}{noformat}

h2. How to reproduce

(1) Set {{mapreduce.reduce.shuffle.maxfetchfailures}}={{0}} and {{mapreduce.reduce.shuffle.notify.readerror}}={{false}}.
(2) Run {{mvn surefire:test -Dtest=org.apache.hadoop.mapreduce.task.reduce.TestShuffleScheduler#TestSucceedAndFailedCopyMap}}.

h2. Stacktrace

{noformat}
java.lang.ArithmeticException: / by zero
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkAndInformMRAppMaster(ShuffleSchedulerImpl.java:347)
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:308)
    at org.apache.hadoop.mapreduce.task.reduce.TestShuffleScheduler.TestSucceedAndFailedCopyMap(TestShuffleScheduler.java:285){noformat}

For an easy reproduction, run the reproduce.sh in the attachment.

We are happy to provide a patch if this issue is confirmed.
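
For illustration only, the failure mode described under "Buggy code" and one possible up-front check can be sketched as below. This is a minimal standalone sketch, not the Hadoop code or a proposed patch: the class {{FetchFailureReportingSketch}} and both method names are hypothetical, and rejecting non-positive values is just one assumed policy (falling back to a default value would be another).

```java
public class FetchFailureReportingSketch {

    // Mirrors only the modulo term of the condition quoted in the report.
    // With a divisor of 0, integer remainder throws ArithmeticException.
    public static boolean shouldReportUnguarded(int failures,
                                                int maxFetchFailuresBeforeReporting) {
        return (failures % maxFetchFailuresBeforeReporting) == 0;
    }

    // Hypothetical guard (an assumption, not the project's fix): reject
    // non-positive values when the configuration is read, so the modulo
    // is never evaluated with a zero divisor.
    public static int validateMaxFetchFailures(int configured) {
        if (configured <= 0) {
            throw new IllegalArgumentException(
                "mapreduce.reduce.shuffle.maxfetchfailures must be positive, got "
                    + configured);
        }
        return configured;
    }

    public static void main(String[] args) {
        // Unguarded path: fails only when the failure is being evaluated.
        try {
            shouldReportUnguarded(1, 0);
        } catch (ArithmeticException e) {
            System.out.println("unguarded check failed: " + e);
        }

        // Guarded path: the bad value is rejected up front with a clear message.
        try {
            validateMaxFetchFailures(0);
        } catch (IllegalArgumentException e) {
            System.out.println("guard rejected value: " + e.getMessage());
        }

        // A valid value passes through and the modulo check works as intended.
        int max = validateMaxFetchFailures(10);
        System.out.println("report after 20 failures: " + shouldReportUnguarded(20, max));
    }
}
```

Validating at configuration time, rather than at the point of use, would surface the misconfiguration when the scheduler is constructed instead of on the first fetch failure.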