[ 
https://issues.apache.org/jira/browse/SENTRY-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256177#comment-16256177
 ] 

Na Li commented on SENTRY-1964:
-------------------------------

[~akolb] I ran into an issue in the following scenario when partition 
locations are not sent to HDFS:
1) Table ext2 is located at "/tmp/external/tables/ext2_before" and its 
partitions are under it, e.g. "/tmp/external/tables/ext2_before/i=1". The 
partition is not saved in Sentry and not sent to HDFS. Permissions on the 
partition still work because the table (the parent of the partition) has an 
authorization object, so the corresponding permission can be found.
2) The user changes the table location with "alter table ext2 set location 
'hdfs:///tmp/external/tables/ext2_after'" but the partition locations are not 
changed, so the partitions are no longer under the table directory. When the 
partition is accessed, no authorization object is found, because the parent's 
authorization object has moved to the new location. Without an authorization 
object, the Sentry ACL cannot be found and the permission on the partition no 
longer works.
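
To make the condition concrete, here is a minimal sketch of the prefix check 
that decides whether a partition location is still covered by its table 
location. The class and method names (PartitionPathCheck, normalize, isUnder, 
partitionsToSend) are hypothetical and are not taken from the Sentry code 
base; the path normalization is deliberately simplified and only handles an 
optional "hdfs://authority" prefix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not Sentry code: illustrates the prefix check only.
public class PartitionPathCheck {

    // Drop an optional "hdfs://authority" prefix and any trailing slashes.
    static String normalize(String location) {
        String p = location;
        if (p.startsWith("hdfs://")) {
            int slash = p.indexOf('/', "hdfs://".length());
            p = (slash >= 0) ? p.substring(slash) : "/";
        }
        while (p.length() > 1 && p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        return p;
    }

    // True if child equals parent or lies below it in the directory tree.
    // Comparing against parent + "/" avoids treating "/a/bc" as under "/a/b".
    static boolean isUnder(String parent, String child) {
        String p = normalize(parent);
        String c = normalize(child);
        if (p.equals("/")) {
            return true;
        }
        return c.equals(p) || c.startsWith(p + "/");
    }

    // Partition locations that are NOT covered by the table location and
    // would therefore need their own entry on the HDFS side.
    static List<String> partitionsToSend(String tableLocation,
                                         List<String> partitionLocations) {
        List<String> result = new ArrayList<>();
        for (String loc : partitionLocations) {
            if (!isUnder(tableLocation, loc)) {
                result.add(normalize(loc));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList(
            "/tmp/external/tables/ext2_before/i=1",
            "/tmp/external/tables/ext2_before/i=2");

        // Step 1: table and partitions share the same directory.
        System.out.println(partitionsToSend(
            "/tmp/external/tables/ext2_before", partitions));

        // Step 2: table moved, partitions left behind.
        System.out.println(partitionsToSend(
            "hdfs:///tmp/external/tables/ext2_after", partitions));
    }
}
```

With the table at ext2_before, the returned list is empty, so skipping the 
partitions is safe. After the table moves to ext2_after, both partition 
locations are returned, which is the case where the check would have to be 
re-evaluated on "alter table ... set location".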

{code}
    // START : Verify external table set location..
    writeToPath("/tmp/external/tables/ext2_before/i=1", 5, "foo", "bar");
    writeToPath("/tmp/external/tables/ext2_before/i=2", 5, "foo", "bar");

    stmt.execute("create external table ext2 (s string) partitioned by (i int) "
        + "location \'/tmp/external/tables/ext2_before\'");
    stmt.execute("alter table ext2 add partition (i=1)");
    stmt.execute("alter table ext2 add partition (i=2)");
    verifyQuery(stmt, "ext2", 10);
    verifyOnAllSubDirs("/tmp/external/tables/ext2_before", null, "hbase", false);
    stmt.execute("grant all on table ext2 to role p1_admin");

    verifyOnPath("/tmp/external/tables/ext2_before", FsAction.ALL, "hbase", true);
    verifyOnPath("/tmp/external/tables/ext2_before/i=1", FsAction.ALL, "hbase", true);
    verifyOnPath("/tmp/external/tables/ext2_before/i=2", FsAction.ALL, "hbase", true);
    verifyOnPath("/tmp/external/tables/ext2_before/i=1/stuff.txt",
        FsAction.ALL, "hbase", true);
    verifyOnPath("/tmp/external/tables/ext2_before/i=2/stuff.txt",
        FsAction.ALL, "hbase", true);

    writeToPath("/tmp/external/tables/ext2_after/i=1", 6, "foo", "bar");
    writeToPath("/tmp/external/tables/ext2_after/i=2", 6, "foo", "bar");

    stmt.execute("alter table ext2 set location "
        + "\'hdfs:///tmp/external/tables/ext2_after\'");
    Thread.sleep(WAIT_BEFORE_TESTVERIFY);

    // Even though table location is altered, partition location is still old
    // (still 10 rows)
    verifyQuery(stmt, "ext2", 10);
{code}

> HDFS sync does not need partition locations (usually)
> -----------------------------------------------------
>
>                 Key: SENTRY-1964
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1964
>             Project: Sentry
>          Issue Type: Improvement
>          Components: Sentry
>    Affects Versions: 2.0.0
>            Reporter: Na Li
>            Assignee: Na Li
>            Priority: Critical
>         Attachments: SENTRY-1964.001.patch, SENTRY-1964.001.patch, 
> SENTRY-1964.002.patch
>
>
> Right now, Sentry saves partition info from HMS and sends it to HDFS. HDFS 
> only needs database and table info, and does not need partition info for 
> ACLs unless the partition location does not share the prefix of its table 
> location.
> The amount of partition data is huge and causes performance issues. We can 
> optimize this by not saving and not sending partition info when the 
> partition location is under the path of its table.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
