[ https://issues.apache.org/jira/browse/SENTRY-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256448#comment-16256448 ]
Na Li commented on SENTRY-1964:
-------------------------------

[~akolb] The only problematic scenario is when the table location is initially a prefix of the partition's location, and the table location then changes to another location while the partition's location does not change. How often does this happen?

Can we add a configuration option to control whether we send partitions to HDFS, with the default behavior being to send them? That way, customers who really want the performance improvement and do not run into the above scenario can benefit from not sending partitions to HDFS.

Later on, the component that the user uses to change the table location can be smarter to avoid this situation. For example, when the table location changes, it could ask the user to choose either 1) to change the partition locations as well, so that they are sub-directories of the table location, or 2) to enable sending partitions to HDFS.

> HDFS sync does not need partition locations (usually)
> -----------------------------------------------------
>
>                 Key: SENTRY-1964
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1964
>             Project: Sentry
>          Issue Type: Improvement
>          Components: Sentry
>    Affects Versions: 2.0.0
>            Reporter: Na Li
>            Assignee: Na Li
>            Priority: Critical
>         Attachments: SENTRY-1964.001.patch, SENTRY-1964.001.patch,
> SENTRY-1964.002.patch
>
>
> Right now, Sentry saves partition info from HMS and sends it to HDFS. HDFS
> only needs database and table info, and does not need partition info for ACLs
> unless the partition location does not share the prefix of its table's location.
> The amount of partition data is huge and causes performance issues. We can
> optimize this by not saving and not sending partition info when the partition
> location shares the path prefix of its table.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
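
To make the proposal in the comment concrete, here is a minimal sketch of the filtering decision, assuming a hypothetical boolean option (`send_all_partitions`, defaulting to the current behavior of sending everything). The function names and the option name are illustrative only and are not part of Sentry's code or configuration. Paths are compared by component rather than by raw string prefix, so that `/warehouse/db/t1` is not treated as a parent of `/warehouse/db/t10`.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def is_under(parent: str, child: str) -> bool:
    """Return True if `child` is `parent` itself or lies beneath it.

    Simplification for this sketch: only the path part of each location
    is compared; the URI scheme and authority are ignored.
    """
    p = PurePosixPath(urlparse(parent).path)
    c = PurePosixPath(urlparse(child).path)
    return c == p or p in c.parents


def partitions_to_send(table_location, partition_locations,
                       send_all_partitions=True):
    """Select the partition locations that would be sent to HDFS.

    `send_all_partitions` is a hypothetical option. When it is True
    (the default), every partition is sent, as happens today. When it
    is False, only partitions outside the table location are sent.
    """
    if send_all_partitions:
        return list(partition_locations)
    return [loc for loc in partition_locations
            if not is_under(table_location, loc)]
```

Under these assumptions, the check has to be re-evaluated against the new table location whenever the table location changes. Partitions that stayed behind are then no longer under the table's path and would be sent again, which is the scenario raised in the comment.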