[
https://issues.apache.org/jira/browse/TEZ-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773335#comment-17773335
]
Mudit Sharma commented on TEZ-4518:
-----------------------------------
[~okumin] thanks for sharing the related tickets. We proposed a limit on the
count rather than the size because in some systems, such as HDFS, both the
size of the data and the number of objects can cause problems once they go
beyond a certain value. Since limiting the count also indirectly limits the
size, we started with this.
We are also currently evaluating an external watcher service that can kill
apps if their tasks are buggy.
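To illustrate the count-based limit, here is a minimal sketch. It is not the
code in the PR. The class name SpillFileLimiter, the maxSpillFiles parameter,
and the beforeSpill() hook are all hypothetical and are not Tez APIs. The idea
is only that a task counts its spills and fails once the configured threshold
is reached.

```java
// Hypothetical helper, not part of Tez: counts the spill files of one task
// and fails once a configured limit is reached.
final class SpillFileLimiter {
  // Hypothetical limit; a value <= 0 disables the check.
  private final int maxSpillFiles;
  private int spillCount = 0;

  SpillFileLimiter(int maxSpillFiles) {
    this.maxSpillFiles = maxSpillFiles;
  }

  // Call once before each new spill file is created.
  void beforeSpill() {
    if (maxSpillFiles > 0 && spillCount >= maxSpillFiles) {
      throw new IllegalStateException(
          "Spill file limit reached: " + maxSpillFiles);
    }
    spillCount++;
  }

  int getSpillCount() {
    return spillCount;
  }
}
```

With a limit of 2, the first two calls to beforeSpill() succeed and the third
throws, so the task fails before writing a third spill file.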
> Limit number of spill files getting created
> -------------------------------------------
>
> Key: TEZ-4518
> URL: https://issues.apache.org/jira/browse/TEZ-4518
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Mudit Sharma
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Hi,
>
> We have been facing issues where many of our cluster node disks fill up
> because some rogue applications create a lot of spill data.
> We would like to fail the app if more than a threshold number of spill files
> are written.
> Please let us know if any such capability is supported.
>
> If the capability is not there, we propose supporting it via a config. We
> have added a PR for it: https://github.com/apache/tez/pull/312. Please let
> us know your thoughts on it.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)