[ 
https://issues.apache.org/jira/browse/UNOMI-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francois Gerthoffert updated UNOMI-670:
---------------------------------------
    Fix Version/s: unomi-2.0.1

> Unomi purge system is not working
> ---------------------------------
>
>                 Key: UNOMI-670
>                 URL: https://issues.apache.org/jira/browse/UNOMI-670
>             Project: Apache Unomi
>          Issue Type: Bug
>          Components: unomi(-core)
>    Affects Versions: unomi-2.0.0, unomi-1.6.1
>            Reporter: Kevan Jahanshahi
>            Priority: Major
>             Fix For: unomi-2.0.1, unomi-1.6.2
>
>
> The purge system have been fixed recently in version 1.6.0: 
> [https://jira.jahia.org/browse/UNOMI-504] 
> The fix that was done, just fixed the query used for identifying the profiles 
> to be purged.
> Actually we figured a bit late that the purge system contains more issues 
> than that.
> Here is a non exhaustive list of the one already identified and that will 
> need fix/answers an even reflections/discussions for some of them
> h2. Bug 1: disabling the profile purge will end up with all profiles to be 
> purge everyday
> Using this conf:
> {code:java}
> org.apache.unomi.profile.purge.existTime = -1
> org.apache.unomi.profile.purge.inactiveTime = -1
> org.apache.unomi.monthly.index.purge.existTime= 12 {code}
> To only enable events/session purge will have a bad side effect: all profiles 
> of the system will be removed everyday.
> h2. bug 2: OOM issue for profile purge.
> The persistence service is implementing a method: 
> {code:java}
> /**
>  * Deletes items with the specified Item subclass matching the specified 
> {@link Condition}.
>  *
>  * @param <T>   the type of the Item subclass we want to delete
>  * @param query a {@link Condition} identifying which elements we want to 
> delete
>  * @param clazz the {@link Item} subclass of the items we want to delete
>  * @return {@code true} if the deletion was successful, {@code false} 
> otherwise
>  */
> <T extends Item> boolean removeByQuery(Condition query, Class<T> clazz); 
> {code}
> The issue with the implem is that it's doing the following:
>  * query profiles to get the IDs of documents
>  * building a batch request with a delete request per profile ID
>  * sending the batch.
> So in case of a lot profiles to be purge at once, the batch can grow a lot, 
> and consume all the memory available.
> We should not do like that, instead we should:
>  * just use the ES query: 
> [https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html]
>  directly with the query to identify the profiles
> h2. Bug 3: What is the cache system implemented in the persistence service.
> Seem's coming from: https://issues.apache.org/jira/browse/UNOMI-166 
> It's used by the persistence service, and it explain why the removeByQuery 
> implem is bad.
> The reason why the *removeByQuery* is querying all the documents IDs is to 
> invalidate the cache.
> But it's strange, the cache system is not used at all:
> {code:java}
> <cm:property name="itemClassesToCache" value="" /> {code}
> So I would say, we should remove this cache system, because:
>  * it's not tested
>  * nobody know what would be the effect if enabled
>  * it's not possible to activate it with ENV conf, or global unomi conf.
>  * it's not documented
>  * it's impacting implems like: removeByQuery, even if the cache system is 
> not used
>  * it's mostly dead code inside the core persistence of Unomi, witch is 
> dangerous to keep.
> h2. Bug 4: Purge job seem's never unregistered.
> We do register the purge job when the profile service is started, but in case 
> the profile service is stopped (stop the bundle for example). Then it's not 
> unregistered.
> h2. Bug 5: what should we do in case of unomi cluster, should all of them do 
> the purge ?
> In case of unomi cluster, all of them may try to trigger the purge, I dont 
> think it's an issue, but it look like something could be done to avoid doing 
> unecessary operations twice or more.
> h2. Final: We should also add integrations tests to cover this feature.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to