[ 
https://issues.apache.org/jira/browse/SENTRY-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Spector updated SENTRY-2014:
----------------------------------
    Attachment: SENTRY-2014.01.patch

> Incorrect handling of HDFS paths with multiple slashes
> ------------------------------------------------------
>
>                 Key: SENTRY-2014
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2014
>             Project: Sentry
>          Issue Type: Bug
>            Reporter: Vadim Spector
>            Assignee: Vadim Spector
>         Attachments: SENTRY-2014.01.patch
>
>
> There are at least three places in the code where HDFS paths may not be 
> parsed correctly:
> a) PathsUpdate.parsePath() does not collapse duplicate slashes in the 
> path portion of the URI into one slash. This method is used when getting paths 
> data from HMS store. HDFS paths with duplicate slashes are perfectly legal 
> and the specs refer to UNIX guidelines saying that multiple slashes should be 
> treated as single slashes. If we keep multiple slashes in the path, such a 
> path may be incorrectly split into path entries with some entries being 
> empty, ultimately resulting in hard-to-troubleshoot ACL problems in the 
> field. We should not assume that the URIs fed into parsePath() have already 
> been normalized. It's easier to fix the code.
> b) NotificationProcessor.splitPath() is using "/" regex instead of the 
> correct "/+" one. While the inputs to this class _may_ be controlled by 
> Sentry software, which _may_ normalize paths properly, it is better not to 
> make such assumptions and just fix the code.
> c) SentryStore.retrieveFullPathsImageCore() splits paths retrieved from the 
> database with path.split("/") instead of path.split("/+").
> This may result in HDFS sync failures.
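
For illustration only, here is a minimal Java sketch of the difference between splitting on "/" and on "/+". It is not taken from the attached patch. The class name SlashSplitSketch, the helpers splitPath() and collapseSlashes(), and the sample path are all hypothetical and do not reflect the actual Sentry method signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SlashSplitSketch {

    // Hypothetical helper (not Sentry code): split on runs of slashes and
    // drop the empty leading entry that a path starting with "/" produces.
    static List<String> splitPath(String path) {
        List<String> parts = new ArrayList<>();
        for (String part : path.split("/+")) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        return parts;
    }

    // Hypothetical helper: collapse each run of slashes into a single slash.
    // Intended for the path portion only; applying it to a full URI string
    // would also collapse the "//" that follows the scheme.
    static String collapseSlashes(String path) {
        return path.replaceAll("/+", "/");
    }

    public static void main(String[] args) {
        String path = "/user/hive//warehouse///db1.db/tbl";

        // Splitting on "/" keeps an empty entry for every extra slash.
        System.out.println(Arrays.asList(path.split("/")));

        // Splitting on "/+" treats a run of slashes as one separator.
        System.out.println(splitPath(path));

        // Normalizing first makes either split behave the same way.
        System.out.println(collapseSlashes(path));
    }
}
```

Note that even with "/+", String.split() still returns one empty leading entry for a path that begins with a slash, so callers either skip it or filter it out as splitPath() does here.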



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
