[ https://issues.apache.org/jira/browse/SENTRY-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vadim Spector updated SENTRY-2014:
----------------------------------
    Attachment:     (was: SENTRY-2014.01.patch)

> Incorrect handling of HDFS paths with multiple slashes
> ------------------------------------------------------
>
>                 Key: SENTRY-2014
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2014
>             Project: Sentry
>          Issue Type: Bug
>            Reporter: Vadim Spector
>            Assignee: Vadim Spector
>
> There are at least three places in the code where HDFS paths may not be
> parsed correctly:
> a) PathsUpdate.parsePath() does not collapse duplicate slashes in the
> path portion of the URI into one slash. This method is used when getting
> paths data from the HMS store. HDFS paths with duplicate slashes are
> perfectly legal, and the specs refer to UNIX guidelines saying that multiple
> slashes should be treated as a single slash. If we keep multiple slashes in
> the path, such a path may be incorrectly split into path entries with some
> entries being empty, ultimately resulting in hard-to-troubleshoot ACL
> problems in the field. We should not assume that the URIs fed into
> parsePath() have already been normalized. It's easier to fix the code.
> b) NotificationProcessor.splitPath() uses the "/" regex instead of the
> correct "/+" one. While the inputs to this class _may_ be controlled by
> Sentry software, which _may_ normalize paths properly, it is better not to
> make such assumptions and just fix the code.
> c) SentryStore.retrieveFullPathsImageCore() splits paths retrieved from
> the database as path.split("/") instead of path.split("/+").
> This may result in HDFS sync failures.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
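
The difference between the "/" and "/+" regexes described in (a) through (c) can be illustrated with a small standalone Java sketch. This is not the Sentry code: the class name PathSplitSketch, its helper methods, and the sample namenode URI are hypothetical and exist only for illustration. Dropping the empty leading entry of an absolute path is likewise an assumption of this sketch, not something the issue specifies.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Sentry.
class PathSplitSketch {

    // Splitting on a single "/" keeps an empty entry for every extra slash.
    static String[] naiveSplit(String path) {
        return path.split("/");
    }

    // Splitting on "/+" treats a run of slashes as one separator.
    // An absolute path still yields one empty leading entry, which this
    // sketch drops (an assumption, not behavior described in the issue).
    static List<String> splitPath(String path) {
        List<String> entries = new ArrayList<>();
        for (String entry : path.split("/+")) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    // Extracts the path portion of a URI and collapses duplicate slashes,
    // without assuming the caller has already normalized the URI.
    static String collapseSlashes(String uriString) {
        String path = URI.create(uriString).getPath();
        return path == null ? null : path.replaceAll("/+", "/");
    }

    public static void main(String[] args) {
        String raw = "/user//hive///warehouse";
        // The naive split contains empty entries between the real ones.
        System.out.println(Arrays.toString(naiveSplit(raw)));
        // The "/+" split yields only user, hive, warehouse.
        System.out.println(splitPath(raw));
        // The sample host and port below are made up.
        System.out.println(collapseSlashes("hdfs://namenode:8020" + raw));
    }
}
```

With the naive split, the empty entries end up as path components, which is the kind of mismatch the issue ties to ACL problems and HDFS sync failures.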