Lars,

   1. SENTRY-2461 <https://issues.apache.org/jira/browse/SENTRY-2461> should
   definitely be fixed. The current approach is not efficient.
   2. Sentry currently checks whether a notification has already been
   processed. The check is in HiveNotificationFetcher.java:


>    try {
>      if (cache.contains(hash) || sentryStore.isNotificationProcessed(hash)) {
>        cache.add(hash);
>
>        LOGGER.debug("Ignoring HMS notification already processed: ID = {}", id);
>        return false;
>      }

However, this check has an issue (SENTRY-2422
<https://issues.apache.org/jira/browse/SENTRY-2422>).
"isNotificationProcessed" decides whether a notification was processed by
looking at the SENTRY_PATH_CHANGE table. That is the problem: Sentry does not
record every notification in SENTRY_PATH_CHANGE. For example, an alter table
event that changes neither the name nor the location is not recorded there.

A simple solution is to use the SENTRY_HMS_NOTIFICATION_ID table instead of
SENTRY_PATH_CHANGE to verify whether a notification has been processed.
However, this introduces new issues with Hive 2.3.3. In that version of Hive,
event IDs generated in the NOTIFICATION_LOG table can be out of order and can
even be duplicated. That is why we are working on bumping the Hive version to
one that has the fix before making changes in Sentry.
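
To make the difference between the two checks concrete, here is a minimal
sketch. It is not Sentry's code: the two sets only stand in for the
SENTRY_PATH_CHANGE and SENTRY_HMS_NOTIFICATION_ID tables, all class and
method names are hypothetical, and it keys on event IDs for simplicity while
the real check works on a hash. It also assumes event IDs are unique, which
is exactly what does not hold on Hive 2.3.3.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: in-memory stand-ins for the two tables, not Sentry's store.
class NotificationCheckSketch {
    // Models SENTRY_PATH_CHANGE: rows exist only for events that changed paths.
    static final Set<Long> pathChangeIds = new HashSet<>();
    // Models SENTRY_HMS_NOTIFICATION_ID: one entry per processed event.
    static final Set<Long> hmsNotificationIds = new HashSet<>();

    // Hypothetical helper: record an event as processed.
    static void record(long eventId, boolean changesPaths) {
        hmsNotificationIds.add(eventId);
        if (changesPaths) {
            pathChangeIds.add(eventId);
        }
    }

    // Check based on path changes: misses events that changed no paths.
    static boolean processedPerPathChange(long eventId) {
        return pathChangeIds.contains(eventId);
    }

    // Check based on notification IDs: sees every processed event,
    // provided IDs are unique (an assumption of this sketch).
    static boolean processedPerNotificationId(long eventId) {
        return hmsNotificationIds.contains(eventId);
    }

    public static void main(String[] args) {
        record(100L, true);   // e.g. a table rename: paths change
        record(101L, false);  // e.g. an alter table with same name and location

        System.out.println(processedPerPathChange(101L));     // false
        System.out.println(processedPerNotificationId(101L)); // true
    }
}
```

With the path-change check, event 101 looks unprocessed and would be handled
again on every poll; the notification-ID check recognizes it.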

SENTRY-2422 <https://issues.apache.org/jira/browse/SENTRY-2422> will be fixed
after the Hive version is bumped. For now, as a workaround, you can reduce the
purge period by setting the configuration "sentry.store.clean.period.seconds"
to avoid the OOM issue.
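
As a rough illustration, the property could be set like this. This assumes
it goes into the Sentry server's sentry-site.xml, and the value of 3600
seconds is only an example, not a recommendation; pick a period that suits
your notification volume.

```xml
<!-- Assumed location: sentry-site.xml on the Sentry server -->
<property>
  <name>sentry.store.clean.period.seconds</name>
  <!-- Example value only: purge every hour -->
  <value>3600</value>
</property>
```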



Thanks,
Kalyan Kumar Kalvagadda | Software Engineer
t. (469) 279-5732
cloudera.com <https://www.cloudera.com>



On Mon, Nov 19, 2018 at 5:04 AM Lars Francke <lars.fran...@gmail.com> wrote:

> Hi,
>
> we are/were facing an issue where we were running into an OOM error in
> Sentry. This happened right after startup and it is because of
> SENTRY-2461[1].
>
> We're trying to understand how this is supposed to work.
>
> We see HmsFollower waking up every 500ms.
> It looks at the current notification Id we have stored
> in SENTRY_HMS_NOTIFICATION_ID (e.g. 100).
> It then subtracts 1 (e.g. 99) and retrieves any notifications newer than
> that (e.g. "100").
> HmsFollower then calls NotificationProcessor#processNotificationEvent to
> process the event.
> If that method returns false we explicitly persist the notification Id.
>
> There's a few things that we find weird about this:
> * We see hundreds of duplicates being processed because we never check if
> we already have processed the same notification id. Now I do see comments
> in the code that Hive reuses those ids so this might be intentional?
>
> * But we also store these duplicate notification ids over and over. In the
> normal scenario where the id hasn't changed in the last 500ms we just store
> it again and because the table does not have a constraint on duplicates it
> stores tens of thousands of duplicates for us at least until the purging
> process kicks in 12h later
>
> I guess our question is: Is this normal behavior? Should we see thousands
> of duplicates and should Sentry reprocess the last notification over and
> over?
>
> Thank you very much!
>
> Cheers,
> Lars
>
> [1] <https://issues.apache.org/jira/browse/SENTRY-2461>
>
