[
https://issues.apache.org/jira/browse/MAPREDUCE-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17777399#comment-17777399
]
ASF GitHub Bot commented on MAPREDUCE-7457:
-------------------------------------------
mudit1289 commented on code in PR #6155:
URL: https://github.com/apache/hadoop/pull/6155#discussion_r1365889906
##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestMapTask.java:
##########
@@ -84,4 +93,77 @@ public void testShufflePermissions() throws Exception {
Assert.assertEquals("Incorrect index file perms",
(short)0640, perms.toShort());
}
+
+ @Test
+ public void testSpillFilesCountLimitInvalidValue() throws Exception {
+ JobConf conf = new JobConf();
+ conf.set(CommonConfigurationKeys.FS_PERMISSIONS_UMASK_KEY, "077");
+ conf.set(MRConfig.LOCAL_DIR, TEST_ROOT_DIR.getAbsolutePath());
+ conf.setInt(MRJobConfig.SPILL_FILES_COUNT_LIMIT, -2);
+ MapOutputFile mof = new MROutputFiles();
+ mof.setConf(conf);
+ TaskAttemptID attemptId = new TaskAttemptID("12345", 1, TaskType.MAP, 1, 1);
+ MapTask mockTask = mock(MapTask.class);
+ doReturn(mof).when(mockTask).getMapOutputFile();
+ doReturn(attemptId).when(mockTask).getTaskID();
+ doReturn(new Progress()).when(mockTask).getSortPhase();
+ TaskReporter mockReporter = mock(TaskReporter.class);
+ doReturn(new Counter()).when(mockReporter).getCounter(
+ any(TaskCounter.class));
Review Comment:
Addressed now; please check.
##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java:
##########
@@ -984,10 +987,16 @@ public void init(MapOutputCollector.Context context
MRJobConfig.DEFAULT_IO_SORT_MB);
indexCacheMemoryLimit = job.getInt(JobContext.INDEX_CACHE_MEMORY_LIMIT,
INDEX_CACHE_MEMORY_LIMIT_DEFAULT);
+ spillFilesCountLimit = job.getInt(JobContext.SPILL_FILES_COUNT_LIMIT,
+ SPILL_FILES_COUNT_LIMIT_DEFAULT);
Review Comment:
Addressed now; please check.
> Limit number of spill files getting created
> -------------------------------------------
>
> Key: MAPREDUCE-7457
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7457
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Mudit Sharma
> Priority: Critical
> Labels: pull-request-available
>
> Hi,
>
> We have been facing issues where the disks on many of our cluster nodes fill
> up because some rogue applications create a large amount of spill data.
> We want to fail an application if it writes more than a threshold number of
> spill files.
> Please let us know whether such a capability is already supported.
>
> If the capability does not exist, we propose supporting it via a
> config. We have opened a PR for this:
> [https://github.com/apache/hadoop/pull/6155]. Please let us know your
> thoughts on it.
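
For readers following the proposal, here is a minimal sketch of the kind of check it describes: count spill files as they are written and fail once a configured limit is exceeded. This is not the code from PR #6155. The class name SpillLimitSketch, the method recordSpill, and the use of -1 as a "no limit" sentinel are all assumptions made for illustration; the thread only shows that the limit is read with job.getInt(...) and that -2 is tested as an invalid value.

```java
import java.io.IOException;

/** Hypothetical stand-in for a spill-count check; not the code from PR #6155. */
final class SpillLimitSketch {
  /** Assumed sentinel meaning "no limit"; the thread does not state the real default. */
  static final int UNBOUNDED = -1;

  private final int limit;
  private int numSpills = 0;

  SpillLimitSketch(int limit) throws IOException {
    // Reject anything that is neither the sentinel nor a non-negative count.
    if (limit != UNBOUNDED && limit < 0) {
      throw new IOException("Invalid spill files count limit: " + limit);
    }
    this.limit = limit;
  }

  /** Call once per spill file written; fails once the count exceeds the limit. */
  void recordSpill() throws IOException {
    numSpills++;
    if (limit != UNBOUNDED && numSpills > limit) {
      throw new IOException(
          "Too many spill files: " + numSpills + " > limit " + limit);
    }
  }

  int getNumSpills() {
    return numSpills;
  }
}
```

Validating the limit once at construction time means a bad configuration fails the task early, before any spill is written, rather than partway through the map phase.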