Re: Can't Change Retention Period for YARN Log Aggregation

2019-11-21 Thread Prabhu Josephraj
The deletion service runs as part of the MapReduce JobHistoryServer. Can you
try restarting it?

On Fri, Nov 22, 2019 at 3:42 AM David M wrote:

> All,
>
> I have an HDP 2.6.1 cluster where we’ve had
> yarn.log-aggregation.retain-seconds set to 30 days for a while, and
> everything was working properly. Four days ago we changed the property to
> 15 days and restarted the services. The check interval is set to the
> default, so we expected logs older than 15 days to be deleted within 1.5
> days.
>
> For some reason, we are still seeing 30 days of logs kept. The other
> properties all seem to be set properly. The only weird setting I can find
> is that we are using the LogAggregationIndexedFileController as our primary
> file controller class. The LogAggregationTFileController is still available
> as the second in the list.
>
> I found YARN-8279 (https://issues.apache.org/jira/browse/YARN-8279),
> which seems sort of related, except that we are still seeing logs being put
> into the right suffix folder, and it still seems to be deleting logs older
> than 30 days. It just doesn’t seem to have updated to 15 days as the cutoff
> instead.
>
> I’ve looked in the logs for the Resource Manager, Timeline Server, and one
> of the Name Nodes, and nothing that would explain this has popped up. Any
> ideas where to go to figure out what is happening? Additionally, can
> someone confirm in which process the deletion service actually runs? Is it
> the resource manager, timeline server, or something else?
>
> Thanks!
>
> David McGinnis


Can't Change Retention Period for YARN Log Aggregation

2019-11-21 Thread David M
All,

I have an HDP 2.6.1 cluster where we've had yarn.log-aggregation.retain-seconds 
set to 30 days for a while, and everything was working properly. Four days ago 
we changed the property to 15 days and restarted the services. The check
interval is set to the default, so we expected logs older than 15 days to be
deleted within 1.5 days.
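
For reference, the 1.5-day figure follows from the documented default for
yarn.log-aggregation.retain-check-interval-seconds: when that property is 0
or negative (the default is -1), the interval is taken as one-tenth of the
retention time. A minimal Python sketch of that arithmetic is below. The
function name effective_check_interval is hypothetical and only illustrates
the rule; it is not part of any Hadoop API.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def effective_check_interval(retain_seconds, check_interval_seconds=-1):
    """Return the number of seconds between retention checks.

    Hypothetical helper: mirrors the documented rule that a check interval
    of 0 or less falls back to one-tenth of the retention time.
    """
    if check_interval_seconds <= 0:
        return retain_seconds // 10
    return check_interval_seconds


# 15 days of retention with the default check interval (-1).
retain = 15 * SECONDS_PER_DAY
interval = effective_check_interval(retain)
print(retain)                      # retention in seconds
print(interval / SECONDS_PER_DAY)  # check interval in days
```

With 15 days of retention this gives 1296000 seconds of retention and a
check every 1.5 days; with the earlier 30-day setting the same rule gives a
check every 3 days.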

For some reason, we are still seeing 30 days of logs kept. The other properties 
all seem to be set properly. The only weird setting I can find is that we are 
using the LogAggregationIndexedFileController as our primary file controller 
class. The LogAggregationTFileController is still available as the second in 
the list.

I found YARN-8279 (https://issues.apache.org/jira/browse/YARN-8279), which 
seems sort of related, except that we are still seeing logs being put into the 
right suffix folder, and it still seems to be deleting logs older than 30 days. 
It just doesn't seem to have updated to 15 days as the cutoff instead.

I've looked in the logs for the Resource Manager, Timeline Server, and one of 
the Name Nodes, and nothing that would explain this has popped up. Any ideas 
where to go to figure out what is happening? Additionally, can someone confirm 
in which process the deletion service actually runs? Is it the resource 
manager, timeline server, or something else?

Thanks!

David McGinnis