[ 
https://issues.apache.org/jira/browse/OOZIE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172028#comment-14172028
 ] 

Shwetha G S commented on OOZIE-1803:
------------------------------------

Currently, a workflow is deleted only if all of its child workflows are 
complete, and this check is done recursively because sub-workflows can 
(theoretically) contain sub-workflows of their own. The same goes for 
coordinator actions: the purge makes sure all the related workflows are 
complete, and so on. As a result, the code is pretty complicated and runs 
slowly. How about we simplify this and delete all workflows whose created time 
is older than, say, 15 days (configurable)? We can use the same logic for coord 
actions, but instead of a constant 15 days, the threshold can be some function 
of the coord frequency (maybe how many instances to retain). For coordinators 
and bundles, we can use end time. They are small tables anyway.
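
To make the proposal concrete, here is a rough sketch of the age-based check. 
It is only an illustration under assumptions: the Job class, the method names, 
and the in-memory list are hypothetical stand-ins and are not Oozie's actual 
beans, queries, or PurgeService code. A real implementation would presumably 
issue a delete query against the database with the computed cutoff.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class AgeBasedPurgeSketch {

    /** Hypothetical stand-in for a workflow row; not Oozie's real bean. */
    static final class Job {
        final String id;
        final Instant createdTime;

        Job(String id, Instant createdTime) {
            this.id = id;
            this.createdTime = createdTime;
        }
    }

    /**
     * Workflows: anything created before (now - retention) is purgeable,
     * with no recursive check on child workflows or status.
     */
    static List<String> purgeableWorkflows(List<Job> jobs, Instant now, Duration retention) {
        Instant cutoff = now.minus(retention);
        return jobs.stream()
                .filter(j -> j.createdTime.isBefore(cutoff))
                .map(j -> j.id)
                .collect(Collectors.toList());
    }

    /**
     * Coord actions: one possible "function of coord frequency" is
     * frequency multiplied by the number of instances to retain.
     * This particular formula is an assumption, not something decided.
     */
    static Duration coordActionRetention(Duration frequency, int instancesToRetain) {
        return frequency.multipliedBy(instancesToRetain);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2014-10-15T00:00:00Z");
        List<Job> jobs = Arrays.asList(
                new Job("wf-old", now.minus(Duration.ofDays(20))),
                new Job("wf-new", now.minus(Duration.ofDays(2))));

        // 15 days is the example default from the proposal; it would be configurable.
        System.out.println(purgeableWorkflows(jobs, now, Duration.ofDays(15)));

        // An hourly coordinator retaining 48 instances keeps two days of actions.
        System.out.println(coordActionRetention(Duration.ofHours(1), 48));
    }
}
```

In this sketch a workflow that is exactly at the cutoff is kept; only strictly 
older ones are selected.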

If no one has looked at a stuck workflow for more than 15 days (configurable), 
I don't think they will need it anyway. This is also the only way purging will 
work with partitioning.

This logic serves the purpose, is simple, and runs faster. Both Yahoo and 
InMobi run this as a cron job outside Oozie. Why not make it part of Oozie and 
provide it as an alternative purging logic? Users can choose depending on their 
use case.

> Improvement in Purge service
> ----------------------------
>
>                 Key: OOZIE-1803
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1803
>             Project: Oozie
>          Issue Type: Improvement
>          Components: core
>            Reporter: Jaydeep Vishwakarma
>            Assignee: Jaydeep Vishwakarma
>         Attachments: OOZIE-1803-v1.patch, OOZIE-1803-v2.patch, 
> OOZIE-1803-v3.patch, OOZIE-1803.patch, purgeservice-1.patch, 
> purgeservice.patch
>
>
> The current purge service of Oozie has some performance issues, and it might 
> help to look at the queries and indexes to improve the purge service.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
