Lars,
1. SENTRY-2461 <https://issues.apache.org/jira/browse/SENTRY-2461> should definitely be fixed. The current approach is not efficient.

2. Sentry currently checks whether a notification has already been processed. The check is in HiveNotificationFetcher.java:

> try {
>   if (cache.contains(hash) || sentryStore.isNotificationProcessed(hash)) {
>     cache.add(hash);
>
>     LOGGER.debug("Ignoring HMS notification already processed: ID = {}", id);
>     return false;
>   }

But there is an issue (SENTRY-2422 <https://issues.apache.org/jira/browse/SENTRY-2422>) with this check. "isNotificationProcessed" decides whether a notification has been processed by looking at the SENTRY_PATH_CHANGE table. That is where the problem lies, because Sentry doesn't record every notification in SENTRY_PATH_CHANGE. For example, an alter table that changes neither the name nor the location is not recorded there.

A simple solution is to use the SENTRY_HMS_NOTIFICATION_ID table instead of SENTRY_PATH_CHANGE to verify whether the notification has been processed. However, this would introduce new issues with Hive 2.3.3. In that version of Hive, the event IDs generated in the NOTIFICATION_LOG table can be out of order and can even contain duplicates. That is why we are working on bumping the Hive version to one that has the fix before making changes in Sentry.

3. SENTRY-2422 <https://issues.apache.org/jira/browse/SENTRY-2422> will be fixed after bumping the Hive version. For now, as a workaround, you can reduce the purge time by setting the configuration "sentry.store.clean.period.seconds" to avoid the OOM issue.

Thanks,
Kalyan Kumar Kalvagadda | Software Engineer
t. (469) 279-5732
cloudera.com <https://www.cloudera.com>

On Mon, Nov 19, 2018 at 5:04 AM Lars Francke <lars.fran...@gmail.com> wrote:

> Hi,
>
> we are/were facing an issue where we were running into an OOM error in
> Sentry. This happened right after startup and it is because of
> SENTRY-2461[1].
>
> We're trying to understand how this is supposed to work.
>
> We see HmsFollower waking up every 500ms.
> It looks at the current notification Id we have stored
> in SENTRY_HMS_NOTIFICATION_ID (e.g. 100).
> It then subtracts 1 (e.g. 99) and retrieves any notifications newer than
> that (e.g. "100").
> HmsFollower then calls NotificationProcessor#processNotificationEvent to
> process the event.
> If that method returns false we explicitly persist the notification Id
>
> There's a few things that we find weird about this:
> * We see hundreds of duplicates being processed because we never check if
> we already have processed the same notification id. Now I do see comments
> in the code that Hive reuses those ids so this might be intentional?
>
> * But we also store these duplicate notification ids over and over. In the
> normal scenario where the id hasn't changed in the last 500ms we just store
> it again and because the table does not have a constraint on duplicates it
> stores tens of thousands of duplicates for us at least until the purging
> process kicks in 12h later
>
> I guess our question is: Is this normal behavior? Should we see thousands
> of duplicates and should Sentry reprocess the last notification over and
> over?
>
> Thank you very much!
>
> Cheers,
> Lars
>
> [1] <https://issues.apache.org/jira/browse/SENTRY-2461>
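
For readers following the thread, here is a minimal sketch of why a "processed?" check that looks only at path-change records misses events such as an alter table that changes neither name nor location. It is not Sentry code: the class and method names below are hypothetical, and the two in-memory sets are only stand-ins for tables like SENTRY_PATH_CHANGE and SENTRY_HMS_NOTIFICATION_ID. It also keys on the notification ID for simplicity, whereas the snippet quoted above keys on a hash, and it ignores the out-of-order and duplicate IDs mentioned for Hive 2.3.3.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Sentry.
final class NotificationDedupSketch {
    // Stand-in for a table that records every processed notification ID.
    private final Set<Long> processedIds = new HashSet<>();
    // Stand-in for a table that records only notifications that changed a path.
    private final Set<Long> pathChangeIds = new HashSet<>();

    // Record a notification as processed; a path-change entry is written
    // only when the event actually changed a path.
    void record(long id, boolean changedPath) {
        processedIds.add(id);
        if (changedPath) {
            pathChangeIds.add(id);
        }
    }

    // Check based on path-change records alone: events that changed no path
    // are never found, so they look unprocessed on every poll.
    boolean isProcessedByPathChange(long id) {
        return pathChangeIds.contains(id);
    }

    // Check based on every recorded ID: finds the event regardless of
    // whether it changed a path.
    boolean isProcessedById(long id) {
        return processedIds.contains(id);
    }

    public static void main(String[] args) {
        NotificationDedupSketch store = new NotificationDedupSketch();
        store.record(100L, false); // e.g. an alter table with no name/location change
        store.record(101L, true);  // e.g. an event that changed a location

        System.out.println("100 by path change: " + store.isProcessedByPathChange(100L));
        System.out.println("100 by id:          " + store.isProcessedById(100L));
        System.out.println("101 by path change: " + store.isProcessedByPathChange(101L));
    }
}
```

In this toy model, notification 100 would be picked up again on every poll under the path-change check, which matches the repeated reprocessing described in the quoted message.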