[ https://issues.apache.org/jira/browse/UNOMI-748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703540#comment-17703540 ]
Kevan Jahanshahi edited comment on UNOMI-748 at 3/22/23 8:22 AM:
-----------------------------------------------------------------

The merge code has been improved in PR [https://github.com/apache/unomi/pull/593], but another ticket has been created to update all the other places that need the same change: https://issues.apache.org/jira/browse/UNOMI-753


was (Author: jkevan):
The merge code has been improved, but another ticket has been created to update all the other places that need the same change: https://issues.apache.org/jira/browse/UNOMI-753

> Unomi merge system is exposed to OOM
> ------------------------------------
>
>                 Key: UNOMI-748
>                 URL: https://issues.apache.org/jira/browse/UNOMI-748
>             Project: Apache Unomi
>          Issue Type: Improvement
>    Affects Versions: unomi-2.1.0
>            Reporter: Kevan Jahanshahi
>            Assignee: Kevan Jahanshahi
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the sessions/events *update* goes through the BulkProcessor and is
> asynchronous, so we never know when the bulk will be performed.
> * +The benefit+: merge requests are fast because nothing is retained; the
> bulk processor does the job in a separate thread.
> * +The cons+: *all previous sessions/events are first loaded in memory*, so
> when merging active profiles that contain a lot of past events/sessions,
> {{we could be exposed to OOM}}. _(We already had a similar case with the
> purge, which was loading all profiles in memory.)_
>
> If we replace the *update (one item at a time)* with *updateByQuery*, the
> request will lose the asynchronous nature provided by the BulkProcessor.
> * +The benefit+: sessions and events are not loaded in memory, so no OOM is
> possible.
> * +The cons+: the request becomes synchronous and {{we expose merge requests
> to timeouts on the client side}}. The merge is actually triggered by the
> login on the jExp side, so adding extra time here could have bad impacts and
> side effects.
>
> Since neither of these solutions is satisfactory on its own, the ideal
> solution should combine the strengths of both:
> * use *{{updateByQuery}}* in a separate thread to avoid retaining the merge
> request
> ** We get the OOM protection by not loading all the past events/sessions
> ** We get asynchronous execution in a separate thread/job, which frees the
> current request.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
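
The combined approach proposed in the description (updateByQuery executed on a separate thread) could look roughly like the sketch below. It is only an illustration under stated assumptions: `AsyncMergeSketch`, `UpdateByQueryClient` and `reassignProfile` are hypothetical names introduced here, not Unomi's persistence API, and the actual change in the PR may be structured differently. The fake client in `main` stands in for a server-side update-by-query call, so no sessions or events are ever loaded into the JVM.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncMergeSketch {

    /** Hypothetical abstraction over a server-side update-by-query call (not a Unomi API). */
    interface UpdateByQueryClient {
        /** Re-points every matching item to the new profile id; returns the number of items updated. */
        long reassignProfile(String itemType, String oldProfileId, String newProfileId);
    }

    private final UpdateByQueryClient client;
    private final ExecutorService executor;

    AsyncMergeSketch(UpdateByQueryClient client, ExecutorService executor) {
        this.client = client;
        this.executor = executor;
    }

    /**
     * Schedules the session/event reassignment on the executor and returns immediately,
     * so the calling merge request is not held while the updates run.
     */
    CompletableFuture<Long> mergeAsync(String oldProfileId, String masterProfileId) {
        return CompletableFuture.supplyAsync(() -> {
            long updated = 0;
            // One query-based update per item type; nothing is fetched into memory here.
            for (String itemType : new String[] {"session", "event"}) {
                updated += client.reassignProfile(itemType, oldProfileId, masterProfileId);
            }
            return updated;
        }, executor);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Fake client for illustration: pretends 3 sessions and 40 events were updated.
        UpdateByQueryClient fake = (type, oldId, newId) -> type.equals("session") ? 3 : 40;

        AsyncMergeSketch merger = new AsyncMergeSketch(fake, executor);
        CompletableFuture<Long> pending = merger.mergeAsync("profile-b", "profile-a");

        // The merge request itself could return at this point; we only wait here to show the result.
        System.out.println("items updated: " + pending.get(5, TimeUnit.SECONDS));
        executor.shutdown();
    }
}
```

A design point worth noting: because the work is detached from the request, a failure in the background update is no longer visible to the caller, so a real implementation would likely need logging or retry handling on the returned future.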