[ https://issues.apache.org/jira/browse/SENTRY-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537624#comment-16537624 ]
Na Li commented on SENTRY-2299:
-------------------------------

[~Tagar] are you having this issue now? How often do you have it? The workaround is to restart the NameNode service. That forces a full snapshot, after which the issue goes away for a while.

I am looking for ways to identify where the "create table" delta update is lost. It would be great if you could provide more insight. I do think this is related to HA, because the way Sentry gets path updates has changed.

> NPE In Sentry HDFS Sync Plug
> ----------------------------
>
>                 Key: SENTRY-2299
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2299
>             Project: Sentry
>          Issue Type: Bug
>          Components: Sentry
>    Affects Versions: 2.1.0
>            Reporter: Na Li
>            Priority: Major
>
> Sentry HDFS ACL synchronization stopped working and threw a
> NullPointerException. The HDFS logs showed repeating errors like the
> following:
> {code}
> 11:16:15.743 AM WARN SentryAuthorizationInfo
> Failed to update, will retry in [30000]ms, error:
> java.lang.NullPointerException
>   at org.apache.sentry.hdfs.HMSPaths$Entry.access$200(HMSPaths.java:146)
>   at org.apache.sentry.hdfs.HMSPaths.renameAuthzObject(HMSPaths.java:879)
>   at org.apache.sentry.hdfs.UpdateableAuthzPaths.applyPartialUpdate(UpdateableAuthzPaths.java:118)
>   at org.apache.sentry.hdfs.UpdateableAuthzPaths.updatePartial(UpdateableAuthzPaths.java:81)
>   at org.apache.sentry.hdfs.SentryAuthorizationInfo.processUpdates(SentryAuthorizationInfo.java:211)
>   at org.apache.sentry.hdfs.SentryAuthorizationInfo.update(SentryAuthorizationInfo.java:139)
>   at org.apache.sentry.hdfs.SentryAuthorizationInfo.run(SentryAuthorizationInfo.java:232)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> The customer checked the Sentry logs and didn't see any corresponding errors.
> The issue stopped occurring, apparently not through any specific user
> intervention. (The customer tried manually failing over the active NameNode,
> with no change.)
>
> Arjun mentioned that the cause is a delta update from the Sentry server that
> was lost, so oldEntry at HMSPaths.java:879 was null. That caused the
> NullPointerException.
> {code}
> void renameAuthzObject(String oldName, List<List<String>> oldPathElems,
>     String newName, List<List<String>> newPathElems) {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug(String.format("%s renameAuthzObject({%s, %s} -> {%s, %s})",
>         this, oldName, assemblePaths(oldPathElems), newName,
>         assemblePaths(newPathElems)));
>   }
>   if (oldPathElems == null || oldPathElems.isEmpty() ||
>       newPathElems == null || newPathElems.isEmpty() ||
>       newName == null || newName.equals(oldName)) {
>     LOG.warn(String.format("%s renameAuthzObject({%s, %s} -> {%s, %s})" +
>         ": invalid inputs, skipping",
>         this, oldName, assemblePaths(oldPathElems), newName,
>         assemblePaths(newPathElems)));
>     return;
>   }
>   // if oldPath == newPath, that is path has not changed as part of rename and hence new table
>   // needs to have old paths => new_table.add(old_table_partition_paths)
>   List<String> oldPathElements = oldPathElems.get(0);
>   List<String> newPathElements = newPathElems.get(0);
>   if (!oldPathElements.equals(newPathElements)) {
>     Entry oldEntry = root.find(oldPathElements.toArray(new String[0]), false);
>     Entry newParent = root.createParent(newPathElements);
>     // oldEntry is null here, so the next call throws the NullPointerException
>     oldEntry.moveTo(newParent, newPathElements.get(newPathElements.size() - 1));
>   }
> {code}
> There are several possible reasons why some delta changes are lost.
> {code}
> 1. The Sentry server does not save the rename update as a delta update. The
>    chance of this is really low.
> 2. The delta change is lost between the Sentry server and the NameNode plugin.
>    The chance of this is also low.
> 3. The delta change for the old entry is lost while it is being applied.
> {code}
> The fix for this issue:
> 1. Check whether oldEntry is null. If so, don't call oldEntry.moveTo. Instead,
>    throw an exception, which will be caught by the caller and cause the Sentry
>    client in the NameNode plugin to get a full path snapshot from the Sentry
>    server.
> 2. Find out why oldEntry is null and fix it.
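
The first part of the fix (check oldEntry for null and throw instead of calling moveTo) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the actual Sentry patch: RenameGuardSketch, Node, MissingEntryException, add, rename and applyRename are hypothetical stand-ins for HMSPaths, HMSPaths.Entry and the caller in the NameNode plugin, and paths are assumed to be non-empty.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for HMSPaths; none of these are real Sentry classes.
class RenameGuardSketch {

  /** Hypothetical exception: a delta refers to an entry that is not in the tree. */
  static class MissingEntryException extends RuntimeException {
    MissingEntryException(String message) {
      super(message);
    }
  }

  /** Minimal path-tree node, standing in for HMSPaths.Entry. */
  static class Node {
    final Map<String, Node> children = new HashMap<>();
    Node parent;
    String name;

    Node(Node parent, String name) {
      this.parent = parent;
      this.name = name;
    }

    /** Returns the node at the given path, or null if any element is missing. */
    Node find(List<String> path) {
      Node current = this;
      for (String element : path) {
        current = current.children.get(element);
        if (current == null) {
          return null;
        }
      }
      return current;
    }

    /** Creates (if needed) and returns the parent of the last path element. */
    Node createParent(List<String> path) {
      Node current = this;
      for (String element : path.subList(0, path.size() - 1)) {
        final Node owner = current;
        current = current.children.computeIfAbsent(element, e -> new Node(owner, e));
      }
      return current;
    }

    /** Detaches this node from its current parent and re-attaches it under newParent. */
    void moveTo(Node newParent, String newName) {
      parent.children.remove(name);
      parent = newParent;
      name = newName;
      newParent.children.put(newName, this);
    }
  }

  final Node root = new Node(null, "/");

  /** Test helper: registers a path in the tree. */
  Node add(List<String> path) {
    Node parent = root.createParent(path);
    return parent.children.computeIfAbsent(path.get(path.size() - 1), e -> new Node(parent, e));
  }

  /** Rename with the null guard: fail loudly before touching the tree. */
  void rename(List<String> oldPath, List<String> newPath) {
    Node oldEntry = root.find(oldPath);
    if (oldEntry == null) {
      // An earlier delta (for example the "create table") was probably lost.
      throw new MissingEntryException("No entry for old path " + oldPath);
    }
    Node newParent = root.createParent(newPath);
    oldEntry.moveTo(newParent, newPath.get(newPath.size() - 1));
  }

  /** Hypothetical caller: returns false when a full snapshot should be requested. */
  boolean applyRename(List<String> oldPath, List<String> newPath) {
    try {
      rename(oldPath, newPath);
      return true;
    } catch (MissingEntryException e) {
      return false; // the real caller would fall back to a full path snapshot here
    }
  }

  public static void main(String[] args) {
    RenameGuardSketch paths = new RenameGuardSketch();
    paths.add(Arrays.asList("db", "t1"));
    System.out.println(paths.applyRename(Arrays.asList("db", "t1"), Arrays.asList("db", "t2")));
    System.out.println(paths.applyRename(Arrays.asList("db", "gone"), Arrays.asList("db", "t3")));
  }
}
```

Doing the check before createParent means a failed rename leaves the tree untouched, so the caller can replace it with a full snapshot without first undoing a partial change.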