[
https://issues.apache.org/jira/browse/SENTRY-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528161#comment-16528161
]
Na Li edited comment on SENTRY-2292 at 6/29/18 8:29 PM:
--------------------------------------------------------
[~arjunmishra13] This may leave Sentry in an unknown state for path info and
cause performance issues.
1) For example, suppose Sentry is waiting for a full snapshot that starts at
Notification_ID = 3, and meanwhile a new event with Notification_ID = 4 arrives
for creating table table_1, together with its associated permission update.
If Sentry uses one thread for the full snapshot and another thread for fetching
and processing notification events (as you proposed), the following could
happen, depending on the timing of these two threads:
* Sentry may first save the update from notification 4 and its associated
permission with thread_2,
* then persist the full snapshot for notification 3 with thread_1. The
snapshot ID increases.
* At the next polling interval, it won't fetch notification 4, because the max
notification ID is already 4.
* For the new snapshot ID, table_1 does not exist, but its permission does.
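The interleaving above can be replayed deterministically with a small in-memory
model. This is only a sketch: ToyStore, SnapshotRaceDemo and their methods are
hypothetical names, not Sentry classes, and the two threads are replaced by two
sequential calls made in the problematic order.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for the Sentry store (not Sentry's real classes).
class ToyStore {
    Set<String> paths = new HashSet<>();        // current path image
    Set<String> permissions = new HashSet<>();  // objects that have a permission
    long maxNotificationId = 0;
    long snapshotId = 0;

    // What thread_2 would do for a create-table event and its permission update.
    void applyCreateTable(long notificationId, String table) {
        paths.add(table);
        permissions.add(table);
        maxNotificationId = Math.max(maxNotificationId, notificationId);
    }

    // What thread_1 would do: replace the whole path image with the snapshot.
    void persistFullSnapshot(long notificationId, Set<String> snapshotPaths) {
        paths = new HashSet<>(snapshotPaths);
        snapshotId++;
        maxNotificationId = Math.max(maxNotificationId, notificationId);
    }

    // The next poll only asks for events newer than maxNotificationId.
    boolean wouldFetch(long notificationId) {
        return notificationId > maxNotificationId;
    }
}

class SnapshotRaceDemo {
    static ToyStore run() {
        ToyStore store = new ToyStore();
        Set<String> snapshotAt3 = new HashSet<>(); // taken before table_1 existed
        store.applyCreateTable(4, "table_1");      // thread_2 finishes first
        store.persistFullSnapshot(3, snapshotAt3); // thread_1 finishes afterwards
        return store;
    }

    public static void main(String[] args) {
        ToyStore s = run();
        System.out.println("path present: " + s.paths.contains("table_1"));
        System.out.println("permission present: " + s.permissions.contains("table_1"));
        System.out.println("notification 4 fetched again: " + s.wouldFetch(4));
    }
}
```

In this model the permission for table_1 survives while its path is gone, and
notification 4 is never fetched again to repair the path image.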
2) Sentry assumes that there is a single thread for getting and saving path
info from HMS. Two threads would cause concurrency issues.
For multiple Sentry threads to consume those notifications, the Sentry server
needs code to detect that a notification has already been processed and saved
in the Sentry DB, so that it does not process and save the same notification
again. With that check in place, it is fine for multiple threads to consume
those notifications.
Without that code change, an exception will occur when multiple Sentry threads
get the same notifications and save them in the Sentry DB, because the table
SENTRY_PATH_CHANGE has a unique index on the field NOTIFICATION_HASH. The
transaction will then be retried for a long time.
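A minimal sketch of the kind of "already processed" check described above,
under stated assumptions: DedupSketch and its methods are hypothetical, and an
in-memory concurrent set stands in for a lookup against the unique
NOTIFICATION_HASH index. A real change would have to do this check against the
Sentry DB inside the same transaction as the save.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class DedupSketch {
    // Hashes already saved; stands in for the unique NOTIFICATION_HASH index.
    private final Set<String> savedHashes = ConcurrentHashMap.newKeySet();
    private final AtomicInteger saves = new AtomicInteger();

    // Returns true if this call saved the notification, false if it was skipped.
    boolean processOnce(String notificationHash) {
        // add() is atomic: only one caller ever sees true for a given hash.
        if (!savedHashes.add(notificationHash)) {
            return false; // another thread already handled this notification
        }
        saves.incrementAndGet(); // placeholder for the actual save
        return true;
    }

    int saveCount() {
        return saves.get();
    }

    // Two consumers walk the same 100 notifications concurrently.
    static int runTwoConsumers() {
        DedupSketch sketch = new DedupSketch();
        Runnable consumer = () -> {
            for (int id = 1; id <= 100; id++) {
                sketch.processOnce("hash-" + id);
            }
        };
        Thread t1 = new Thread(consumer);
        Thread t2 = new Thread(consumer);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return sketch.saveCount();
    }

    public static void main(String[] args) {
        System.out.println("saved: " + runTwoConsumers());
    }
}
```

Each notification is saved exactly once no matter which thread reaches it
first, so no duplicate insert ever hits the unique index and no retry loop
starts.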
was (Author: linaataustin):
[~arjunmishra13] This may leave Sentry in an unknown state for path info and
cause performance issues.
1) For example, suppose Sentry is waiting for a full snapshot that starts at
Notification_ID = 3, and meanwhile a new event with Notification_ID = 4 arrives
for creating table table_1, together with its associated permission update.
If Sentry uses one thread for the full snapshot and another thread for fetching
and processing notification events (as you proposed), the following could
happen, depending on the timing of these two threads:
* Sentry may first save the update from notification 4 and its associated
permission with thread_2,
* then persist the full snapshot for notification 3 with thread_1.
* At the next polling interval, it gets notification 4, and the permission for
the created table is removed if syncStoreOnCreate = true (on the assumption
that the permission belongs to a non-existent table).
2) Sentry assumes that there is a single thread for getting and saving path
info from HMS. Two threads would cause concurrency issues.
For multiple Sentry threads to consume those notifications, the Sentry server
needs code to detect that a notification has already been processed and saved
in the Sentry DB, so that it does not process and save the same notification
again. With that check in place, it is fine for multiple threads to consume
those notifications.
Without that code change, an exception will occur when multiple Sentry threads
get the same notifications and save them in the Sentry DB, because the table
SENTRY_PATH_CHANGE has a unique index on the field NOTIFICATION_HASH. The
transaction will then be retried for a long time.
> Separate Processing of Notifications to a separate thread
> ---------------------------------------------------------
>
> Key: SENTRY-2292
> URL: https://issues.apache.org/jira/browse/SENTRY-2292
> Project: Sentry
> Issue Type: Improvement
> Components: Sentry
> Affects Versions: 2.1.0
> Reporter: Arjun Mishra
> Assignee: Arjun Mishra
> Priority: Major
>
> When fetching a full update, notification processing is held up, which blocks
> HMS threads. Separating the two into different threads will relieve the HMS
> threads.