[
https://issues.apache.org/jira/browse/SENTRY-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979389#comment-16979389
]
Nicolas Dupont commented on SENTRY-2299:
----------------------------------------
We encountered this issue in CDH 5.12.2. As a workaround, we increased
sentry.authorization-provider.cache-stale-threshold.ms to 604800000 (7 days;
we reboot the cluster each week) so that the ACLs stay in place even if Sentry
hits the NPE. So far it seems to work.
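
For anyone adapting the workaround to a different restart interval, the value is just the interval in milliseconds. A minimal sketch of the arithmetic follows; the class and method names are made up for illustration and are not part of Sentry.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Sentry: converts a restart interval in days
// into a value for sentry.authorization-provider.cache-stale-threshold.ms.
final class StaleThresholdSketch {
    static long daysToMillis(long days) {
        return TimeUnit.DAYS.toMillis(days);
    }

    static long weekInMillis() {
        // 7 days * 24 h * 3600 s * 1000 ms
        return daysToMillis(7);
    }

    public static void main(String[] args) {
        System.out.println(weekInMillis());
    }
}
```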
> NPE In Sentry HDFS Sync Plugin
> ------------------------------
>
> Key: SENTRY-2299
> URL: https://issues.apache.org/jira/browse/SENTRY-2299
> Project: Sentry
> Issue Type: Bug
> Components: Sentry
> Affects Versions: 2.1.0
> Reporter: Na Li
> Assignee: Na Li
> Priority: Critical
> Attachments: SENTRY-2299.001.patch
>
>
> Sentry HDFS ACL synchronization stopped working and throws
> NullPointerException. The HDFS logs showed repeating errors like the
> following:
> {code}
> 11:16:15.743 AM WARN SentryAuthorizationInfo
> Failed to update, will retry in [30000]ms, error:
> java.lang.NullPointerException
> at org.apache.sentry.hdfs.HMSPaths$Entry.access$200(HMSPaths.java:146)
> at org.apache.sentry.hdfs.HMSPaths.renameAuthzObject(HMSPaths.java:879)
> at org.apache.sentry.hdfs.UpdateableAuthzPaths.applyPartialUpdate(UpdateableAuthzPaths.java:118)
> at org.apache.sentry.hdfs.UpdateableAuthzPaths.updatePartial(UpdateableAuthzPaths.java:81)
> at org.apache.sentry.hdfs.SentryAuthorizationInfo.processUpdates(SentryAuthorizationInfo.java:211)
> at org.apache.sentry.hdfs.SentryAuthorizationInfo.update(SentryAuthorizationInfo.java:139)
> at org.apache.sentry.hdfs.SentryAuthorizationInfo.run(SentryAuthorizationInfo.java:232)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> The customer checked the Sentry logs and didn't see any corresponding errors.
> The issue stopped occurring, apparently not through any specific user
> intervention. (The customer tried manually failing over the active NameNode,
> with no change.)
> Arjun mentioned that the cause is a delta update from the Sentry server that
> was lost, so oldEntry at HMSPaths.java:879 was null, which caused the
> NullPointerException.
> {code}
> void renameAuthzObject(String oldName, List<List<String>> oldPathElems,
>     String newName, List<List<String>> newPathElems) {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug(String.format("%s renameAuthzObject({%s, %s} -> {%s, %s})",
>         this, oldName, assemblePaths(oldPathElems), newName,
>         assemblePaths(newPathElems)));
>   }
>   if (oldPathElems == null || oldPathElems.isEmpty() ||
>       newPathElems == null || newPathElems.isEmpty() ||
>       newName == null || newName.equals(oldName)) {
>     LOG.warn(String.format("%s renameAuthzObject({%s, %s} -> {%s, %s})" +
>         ": invalid inputs, skipping",
>         this, oldName, assemblePaths(oldPathElems), newName,
>         assemblePaths(newPathElems)));
>     return;
>   }
>   // if oldPath == newPath, that is path has not changed as part of rename
>   // and hence new table
>   // needs to have old paths => new_table.add(old_table_partition_paths)
>   List<String> oldPathElements = oldPathElems.get(0);
>   List<String> newPathElements = newPathElems.get(0);
>   if (!oldPathElements.equals(newPathElements)) {
>     Entry oldEntry = root.find(oldPathElements.toArray(new String[0]),
>         false);
>     Entry newParent = root.createParent(newPathElements);
>     // NPE is thrown on the next line: oldEntry is null
>     oldEntry.moveTo(newParent,
>         newPathElements.get(newPathElements.size() - 1));
>   }
> {code}
> There are several possible reasons why some delta changes are lost.
> {code}
> 1. The Sentry server does not save the rename as a delta update. The chance
> of this is very low.
> 2. The delta change is lost between the Sentry server and the NameNode plugin.
> The chance of this is also low.
> 3. The delta change concerning the old entry is lost while it is being applied.
> {code}
> The fix for this issue:
> 1. Check whether oldEntry is null. If so, don't call oldEntry.moveTo. Instead,
> throw an exception, which is caught by the caller and causes the Sentry client
> in the NameNode plugin to fetch a full path snapshot from the Sentry server.
> 2. Find out why oldEntry is null and fix it.
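
The first step of the proposed fix can be sketched roughly as follows. This is a simplified, self-contained illustration and not the attached patch: PathTreeSketch, MissingEntryException, applyRenameDelta and the fullSnapshotRequests counter are all hypothetical stand-ins for HMSPaths, its caller and the full-snapshot request.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the plugin's path tree; not the real HMSPaths.
final class PathTreeSketch {
    /** Thrown when a rename refers to a path the local tree does not contain. */
    static final class MissingEntryException extends RuntimeException {
        MissingEntryException(String message) { super(message); }
    }

    static final class Node {
        final Map<String, Node> children = new HashMap<>();
    }

    private final Node root = new Node();
    int fullSnapshotRequests = 0; // counts fallbacks, for illustration only

    void add(List<String> path) {
        Node cur = root;
        for (String elem : path) {
            cur = cur.children.computeIfAbsent(elem, k -> new Node());
        }
    }

    boolean contains(List<String> path) {
        return find(path) != null;
    }

    private Node find(List<String> path) {
        Node cur = root;
        for (String elem : path) {
            cur = cur.children.get(elem);
            if (cur == null) {
                return null;
            }
        }
        return cur;
    }

    /** Moves the subtree at oldPath to newPath; throws instead of hitting an NPE. */
    void rename(List<String> oldPath, List<String> newPath) {
        if (oldPath.isEmpty() || newPath.isEmpty()) {
            throw new IllegalArgumentException("paths must not be empty");
        }
        Node oldEntry = find(oldPath);
        if (oldEntry == null) {
            // The guard from step 1: report the inconsistency to the caller.
            throw new MissingEntryException("no entry for " + oldPath);
        }
        // Detach the entry from its current parent.
        Node oldParent = find(oldPath.subList(0, oldPath.size() - 1));
        oldParent.children.remove(oldPath.get(oldPath.size() - 1));
        // Create the new parent chain if needed, then attach under the new name.
        Node newParent = root;
        for (String elem : newPath.subList(0, newPath.size() - 1)) {
            newParent = newParent.children.computeIfAbsent(elem, k -> new Node());
        }
        newParent.children.put(newPath.get(newPath.size() - 1), oldEntry);
    }

    /** Caller side: a failed partial update falls back to a full snapshot. */
    void applyRenameDelta(List<String> oldPath, List<String> newPath) {
        try {
            rename(oldPath, newPath);
        } catch (MissingEntryException e) {
            fullSnapshotRequests++; // placeholder for requesting a full path snapshot
        }
    }
}
```

In this sketch a rename of a known path moves the entry, while a rename of an unknown path only increments the counter, so the update thread keeps running instead of failing with an NPE on every retry.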
--
This message was sent by Atlassian Jira
(v8.3.4#803005)