[
https://issues.apache.org/jira/browse/SENTRY-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vadim Spector updated SENTRY-1508:
----------------------------------
Assignee: Vadim Spector
Status: Patch Available (was: Open)
> MetastorePlugin.java does not handle properly initialization failure
> --------------------------------------------------------------------
>
> Key: SENTRY-1508
> URL: https://issues.apache.org/jira/browse/SENTRY-1508
> Project: Sentry
> Issue Type: Bug
> Reporter: Vadim Spector
> Assignee: Vadim Spector
> Priority: Critical
> Attachments: SENTRY-1508.001.patch
>
>
> MetastorePlugin.java constructor initializes successfully regardless of
> whether initThread.run() was successful. E.g. if
> cacheInitializer.createInitialUpdate() cal inside the constructor fails (in
> one case in the field it was due to an invalid table uri in HMS database,
> whose parsing produced ArrayIndexOutOfBoundsException), initComplete is still
> assigned to true and MetastorePlugin gets constructed. While initError is
> assigned to non-null value in case of failure, it is effectively ignored as
> will be described later.
> MetastorePlugin implements several callbacks to be called on the Hive side,
> to update Sentry state. It is critical functionality for enforcing access
> control. Its callbacks (addPath(), removeAllPaths(), etc) all end up calling
> notifySentryAndApplyLocal() which starts with the following code:
> if (initComplete) {
> processUpdate()
> }...
> which means that processUpdate() is called even when MetastorePlugin
> initialization has failed (i.e. initError != null). There is an option of
> asynchronous initialization, when MetastorePlugin is constructed and returned
> to the caller before initialization is complete, but the code makes no
> distinction between the two cases, as will be shown below.
> processUpdate() called regardless of whether MetastorePlugin a) has been
> initialized (sync mode, success), or b) still being initialized (async mode),
> or c) failed to initialized (either mode). It calls (unconditionally)
> applyLocal() and notifySentry(). aplyLocal() calls authzPaths.updatePartial()
> which immediately fails with NullPointerException if authzPaths == null which
> happens if initThread within the constructor fails.
> SENTRY-1270 avoids NullPointerException by execuitng this before
> dereferencing authzPaths inside applyLocal():
> if(authzPaths == null) {
> LOGGER.error(initializationFailureMsg);
> return;
> }
> However, this leads to hard-to-diagnose behavior, because a) local updates
> fail forever, and b) housekeeping thread running SyncTask to synchronize with
> Sentry-side state fails as well, printing endlessly misleading "####
> Metastore Plugin cache has not finished initialization." message suggesting
> its temporary condition as opposed to a permanent failure. Failure to sync up
> with Sentry state can lead to very subtle, often intermittent, problems with
> acls. MetastorePlugin and its caller never get fully aware of the permanent
> nature of init failure.
> Suggestion:
> a) define 3 states in MetastorePlugin: initializing, initialized, init_failed.
> 2) communicate those states to the caller clearly, by _immediately_ throwing
> some kind of exception from public callback APIs in the cases preventing
> those callback from completing successfully:
> a) for state == initializing, throw some kind of
> InitializationInProgressException. This only applies for asynchronous
> initialization configuration. It is unknown whether it is actually deployed
> anywhere, but the code exists and there is no reason why not to support it.
> Async init makes sense as an optimization, as long as
> initialization-in-progress condition is clearly communicated.
> b) for state == init_failed throw InitializationFailedException of some
> sort. For usability, it should contain the cause - the original exception
> originating either from the constructor or from the initThread.run() started
> from the constructor (in sync or async modes alike).
> Although in the sync mode initialization failure could be detected easily by
> constructor failure, we may want to preserve the current design, in which
> constructor always succeeds, so the client code does not need to be aware of
> whether init mode is configured as sync vs async.
> Additional cleanup:
> 1. getClient() is always dereferenced, while it can actually return null
> (exception is swallowed and error is logged). Suggested fix: getClient() may
> still print error message, but it should also re-throw an original exception
> - it is much easier to debug than dealing with inevitable and uninformative
> NullPointerException up the caller stack. General rule - error messages are
> to help with diagnostic, not to replace throwing exception, which is the only
> way to make sure error conditions are properly propagated to the caller.
> 2. Each code fork deserves log message: e.g. addPath() retrurns immediately
> if PathsUpdate.parsePath() returns null - w/o printing any log.
> 3. in SyncTask.run() - if notificationLock.tryLock() succeeds, yet the next
> line's expression "MetastorePlugin.this.authzPaths == null" is evaluated to
> false, run() exits w/o a chance to call notificationLock.unlock(). All the
> code following lock.tryLock() should be inside try-catch-finally, with
> finally's first line being lock.unlock(), as a general safe pattern.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)