[ 
https://issues.apache.org/jira/browse/TEZ-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780162#comment-17780162
 ] 

Ayush Saxena commented on TEZ-4518:
-----------------------------------

{quote}I was trying to point out that restricting the number of spills may 
not be a generic approach (e.g. some apps get launched with more memory, and 
their spill ratios and sizes will differ from the regular ones). So a limit 
of, say, 500 can mean different things for apps with different memory 
requirements.
{quote}
And
{quote}The number of spills can differ between DefaultSorter and 
PipelineSorter. PipelineSorter tries to allocate 32 MB chunks (slightly less 
than that). So the cluster-level number you are mentioning will depend on 
which sorter is chosen. A value for one sorter will not be the same for the 
other sorter. 
{quote}
Hmm, reading Rajesh's comments above, I also feel that limiting the number of 
spills might not be a good idea. I don't have a strong opinion against it, but 
it doesn't look very useful to me in a generic way. I will pass and let others 
take a call. Thanks.
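
To make the quoted concern concrete, here is a rough sketch of why one fixed spill-count limit behaves differently across apps. It is not Tez code: the class name, the method `estimateSpills`, the 16 GiB output size and the "one spill per filled buffer" model are all assumptions made for illustration. Only the limit of 500 and the 32 MB chunk size come from the quotes above.

```java
public final class SpillCountSketch {

    /**
     * Hypothetical model (an assumption, not the actual sorter logic):
     * one spill file is written each time a buffer of bufferBytes fills up.
     */
    public static long estimateSpills(long outputBytes, long bufferBytes) {
        if (outputBytes < 0 || bufferBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Ceiling division: a partially filled last buffer still produces a spill.
        return (outputBytes + bufferBytes - 1) / bufferBytes;
    }

    public static void main(String[] args) {
        final long mib = 1024L * 1024L;
        final long outputBytes = 16L * 1024L * mib; // assumed 16 GiB of map output
        final long spillLimit = 500;                // example limit from the discussion

        // The same output volume, sorted with different buffer sizes.
        for (long bufferMib : new long[] {32, 256, 1024}) {
            long spills = estimateSpills(outputBytes, bufferMib * mib);
            System.out.println(bufferMib + " MiB buffer: " + spills
                    + " spills, exceeds limit of " + spillLimit + ": "
                    + (spills > spillLimit));
        }
    }
}
```

Under this simplified model, the same data volume trips the limit with a small buffer and stays far below it with a large one, so a single cluster-wide count would not track the actual disk usage.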

> Limit number of spill files getting created
> -------------------------------------------
>
>                 Key: TEZ-4518
>                 URL: https://issues.apache.org/jira/browse/TEZ-4518
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Mudit Sharma
>            Priority: Critical
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Hi,
>  
> We have been facing issues where many of our cluster node disks fill up 
> because some rogue applications create a lot of spill data.
> We want to fail the app if more than a threshold number of spill files are 
> written.
> Please let us know if any such capability is supported.
>  
> If the capability is not there, we propose supporting it via a 
> config. We have added a PR for this: 
> https://github.com/apache/tez/pull/312
> Please let us know your thoughts on it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
