[ https://issues.apache.org/jira/browse/SENTRY-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542015#comment-16542015 ]
Ruslan Dautkhanov commented on SENTRY-2299:
-------------------------------------------

[~LinaAtAustin] Yes, we have external and non-external tables being created all the time. I'm not sure we can count all the external tables. Some of them are transient: the table is dropped as soon as a load from it is complete. Also, some external tables are created ad hoc by data analysts, and I can imagine they could have mistyped the table location.

But anyway, I don't understand how that could have caused this issue.

I think it's better to split this problem up into three bullet points:
# Implement a workaround asap, irrespective of why this problem happens in the first place. Quoting your comment in the Jira description:
{quote}Check if oldEntry is null. If so, don't call oldEntry.moveTo. Instead, throw exception, which will be caught by its caller and causes sentry client at name node plugin gets path full snapshot from sentry server.
{quote}
# Find out why a delta update might get lost.
# Find out if an invalid external table path can cause this problem.

Please correct me if I missed anything.

I think #1 above should be implemented asap. We can't wait for another incident of this problem, as that could take weeks.

To help you find out #2 or #3, yes, we can enable debugging. Please update the support case with details on which specific debugging you'd like enabled and how to enable it.

Thanks!!

> NPE In Sentry HDFS Sync Plug
> ----------------------------
>
>                 Key: SENTRY-2299
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2299
>             Project: Sentry
>          Issue Type: Bug
>          Components: Sentry
>    Affects Versions: 2.1.0
>            Reporter: Na Li
>            Assignee: Na Li
>            Priority: Major
>
> Sentry HDFS ACL synchronization stopped working and throws NullPointerException.
> The HDFS logs showed repeating errors like the following:
> {code}
> 11:16:15.743 AM WARN SentryAuthorizationInfo
> Failed to update, will retry in [30000]ms, error:
> java.lang.NullPointerException
> at org.apache.sentry.hdfs.HMSPaths$Entry.access$200(HMSPaths.java:146)
> at org.apache.sentry.hdfs.HMSPaths.renameAuthzObject(HMSPaths.java:879)
> at org.apache.sentry.hdfs.UpdateableAuthzPaths.applyPartialUpdate(UpdateableAuthzPaths.java:118)
> at org.apache.sentry.hdfs.UpdateableAuthzPaths.updatePartial(UpdateableAuthzPaths.java:81)
> at org.apache.sentry.hdfs.SentryAuthorizationInfo.processUpdates(SentryAuthorizationInfo.java:211)
> at org.apache.sentry.hdfs.SentryAuthorizationInfo.update(SentryAuthorizationInfo.java:139)
> at org.apache.sentry.hdfs.SentryAuthorizationInfo.run(SentryAuthorizationInfo.java:232)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> The customer checked the Sentry logs and didn't see any corresponding errors.
> The issue stopped occurring, apparently not through any specific user
> intervention. (The customer tried manually failing over the active NameNode,
> with no change.)
> {code}
> Arjun mentioned the reason is that some delta update from sentry server was
> lost, so the oldEntry at line HMSPaths.java:879 was null. That caused null
> exception.
> {code}
> void renameAuthzObject(String oldName, List<List<String>> oldPathElems,
>     String newName, List<List<String>> newPathElems) {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug(String.format("%s renameAuthzObject({%s, %s} -> {%s, %s})",
>         this, oldName, assemblePaths(oldPathElems), newName,
>         assemblePaths(newPathElems)));
>   }
>   if (oldPathElems == null || oldPathElems.isEmpty() ||
>       newPathElems == null || newPathElems.isEmpty() ||
>       newName == null || newName.equals(oldName)) {
>     LOG.warn(String.format("%s renameAuthzObject({%s, %s} -> {%s, %s})" +
>         ": invalid inputs, skipping",
>         this, oldName, assemblePaths(oldPathElems), newName,
>         assemblePaths(newPathElems)));
>     return;
>   }
>   // if oldPath == newPath, that is path has not changed as part of rename and hence new table
>   // needs to have old paths => new_table.add(old_table_partition_paths)
>   List<String> oldPathElements = oldPathElems.get(0);
>   List<String> newPathElements = newPathElems.get(0);
>   if (!oldPathElements.equals(newPathElements)) {
>     Entry oldEntry = root.find(oldPathElements.toArray(new String[0]), false);
>     Entry newParent = root.createParent(newPathElements);
>     oldEntry.moveTo(newParent, newPathElements.get(newPathElements.size() - 1)); // -> oldEntry is null
>   }
> {code}
> There are several possible reasons why some delta changes are lost.
> {code}
> 1. Sentry server does not save the rename update as delta update. The chance is really low
> 2. The delta change is lost from sentry server to name node plugin. The chance is also low
> 3. When applying delta change about old entry, it is lost
> {code}
> The fix for this issue
> 1. Check if oldEntry is null. If so, don't call oldEntry.moveTo. Instead,
> throw exception, which will be caught by its caller and causes sentry client
> at name node plugin gets path full snapshot from sentry server.
> 2. Find out why the oldEntry is null and fix it.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
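
For illustration only, the workaround in item 1 (guard against a null oldEntry, throw, and let the caller fall back to a full snapshot) could look roughly like the sketch below. It is a toy model, not Sentry code: RenameGuardSketch, MissingEntryException, applyRenameOrResync and the map-based path store are hypothetical names invented for this example, and the real HMSPaths tree and update path are more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of the proposed workaround; none of these names exist in Sentry.
class RenameGuardSketch {

    /** Hypothetical signal that a partial (delta) update cannot be applied. */
    static class MissingEntryException extends RuntimeException {
        MissingEntryException(String message) {
            super(message);
        }
    }

    // Stand-in for the path tree: full path -> authz object name.
    private final Map<String, String> entries = new HashMap<>();

    // Counts how often the caller had to fall back to a full snapshot.
    int fullSnapshotsRequested = 0;

    void add(String path, String authzObject) {
        entries.put(path, authzObject);
    }

    String lookup(String path) {
        return entries.get(path);
    }

    /** Applies a rename delta; throws instead of dereferencing a missing old entry. */
    void rename(String oldPath, String newPath) {
        String oldEntry = entries.get(oldPath);
        if (oldEntry == null) {
            // Guard: the old entry is unknown, so an earlier delta was probably lost.
            throw new MissingEntryException("No entry for old path " + oldPath);
        }
        entries.remove(oldPath);
        entries.put(newPath, oldEntry);
    }

    /** Caller side: if the delta cannot be applied, replace local state with a full snapshot. */
    void applyRenameOrResync(String oldPath, String newPath, Map<String, String> serverSnapshot) {
        try {
            rename(oldPath, newPath);
        } catch (MissingEntryException e) {
            entries.clear();
            entries.putAll(serverSnapshot);
            fullSnapshotsRequested++;
        }
    }
}
```

The point of throwing a dedicated exception rather than silently skipping the rename is that the local path state is already known to be out of date, so only a full resynchronization can repair it; retrying the same delta every 30 seconds, as in the log above, cannot.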