[ 
https://issues.apache.org/jira/browse/TEZ-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775145#comment-17775145
 ] 

Rajesh Balamohan commented on TEZ-4518:
---------------------------------------

The number of spills can differ between DefaultSorter and PipelinedSorter. 
PipelinedSorter tries to allocate chunks of 32 MB (slightly less than that). So 
the cluster-level number you are mentioning will depend on which sorter is 
chosen. The value for one sorter will not be the same as for the other. 
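
To illustrate why a single cluster-wide threshold is sorter-dependent, here is a toy sketch (not Tez code). It assumes a simplified model in which a sorter spills once each time an in-memory buffer fills. The function name and the buffer sizes are hypothetical and are not the actual behavior or defaults of either sorter.

```python
def estimate_spills(output_bytes: int, buffer_bytes: int) -> int:
    """Toy model: one spill each time a buffer of buffer_bytes fills up."""
    if buffer_bytes <= 0:
        raise ValueError("buffer_bytes must be positive")
    # Integer ceiling division, so a partially filled last buffer still counts.
    return (output_bytes + buffer_bytes - 1) // buffer_bytes


MB = 1024 * 1024
output = 1024 * MB  # 1 GB of task output (hypothetical)

# Two hypothetical effective buffer sizes standing in for two different sorters.
print(estimate_spills(output, 64 * MB))   # 16
print(estimate_spills(output, 256 * MB))  # 4
```

Under this model the same output yields 16 spills with one buffer size and 4 with the other, so a fixed limit such as 10 would fail the task in one case and not in the other.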

You can rename it to make clear that it applies mainly to sort spills. I will 
leave it to the rest of the members for any further opinions and reviews.

> Limit number of spill files getting created
> -------------------------------------------
>
>                 Key: TEZ-4518
>                 URL: https://issues.apache.org/jira/browse/TEZ-4518
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Mudit Sharma
>            Priority: Major
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Hi,
>  
> We have been facing issues where many of our cluster node disks fill up 
> because some rogue applications create a lot of spill data.
> We want to fail the app if more than a threshold number of spill files are 
> written.
> Please let us know if any such capability is supported.
>  
> If the capability is not there, we propose supporting it via a config. 
> We have added a PR for this: 
> https://github.com/apache/tez/pull/312. Please let us know your thoughts on it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
