> On Aug. 31, 2017, 12:25 a.m., Alexander Kolbasov wrote: > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java > > Line 247 (original), 251 (patched) > > <https://reviews.apache.org/r/61973/diff/1/?file=1807070#file1807070line252> > > > > Style - is it preferrable to use // style comments to simplify > > commenting out block of code with /* comments?
I moved the comment a few lines before to make a comment for the block (actually for the method). > On Aug. 31, 2017, 12:25 a.m., Alexander Kolbasov wrote: > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java > > Line 254 (original), 258 (patched) > > <https://reviews.apache.org/r/61973/diff/1/?file=1807070#file1807070line259> > > > > Wording here. May be you can reword it here. E.g. > > > > HMS notifications may contain both gaps in the sequence and duplicates > > (the same ID repeated more then once for different events). Duplicates do > > not necessarily come one after another, so the sequence {1,2,3,2} is > > possible I did not add the last sentence because the {1,2,3,2} sequence is not true for Sentry. Sentry will always get 1,2,2,3 because Hive orders the sequence even if the commit was later. > On Aug. 31, 2017, 12:25 a.m., Alexander Kolbasov wrote: > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java > > Line 260 (original) > > <https://reviews.apache.org/r/61973/diff/1/?file=1807070#file1807070line272> > > > > So you decided not to log anything in this case? Yeah. Initially we thought it was useful just as a warning to the user that something is not right. But we thought this was not going to happen often. Now, with HMS HA, I know this will happen too often and I'd like to avoid those warnings. In a quence 1,2,2,3, the expected number of notifications is 3, so this will warn the user something is wrong but it is not. > On Aug. 31, 2017, 12:25 a.m., Alexander Kolbasov wrote: > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java > > Lines 127 (patched) > > <https://reviews.apache.org/r/61973/diff/1/?file=1807071#file1807071line127> > > > > If I am reading this code correctly, the following will happen: > > > > 1) Every time we request IDs starting from last seen ID, so normally > > the first ID in the sequence is always a duplicate > > 2) This first ID in the sequence isn't in the cache yet, so we go to > > the sentryStore and fetch the hash and compare. > > > > This means that we always go to sentryStore for the frist element of > > the batch which means every time we call fetchNotifications(). > > > > This is not very bad, but if we store the sha1 hash of the last event > > in the cache we don't have to go to sentryStore at all under normal > > circumstances Keeping the cache of the last ID is not enough. The HiveNotificationFetcher does not know if the IDs fetched are processed, so the next call to fetchNotifications() may be called with an ID that was not cached. If I want to cache all of them, then I will add more procesisng on getting the sha1() of all the notifications and adding them to the cache. Isn't simpler just to keep it the way it is now? - Sergio ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61973/#review184215 ----------------------------------------------------------- On Aug. 29, 2017, 6:41 p.m., Sergio Pena wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61973/ > ----------------------------------------------------------- > > (Updated Aug. 29, 2017, 6:41 p.m.) > > > Review request for sentry, Alexander Kolbasov and Na Li. > > > Bugs: sentry-1888 > https://issues.apache.org/jira/browse/sentry-1888 > > > Repository: sentry > > > Description > ------- > > Approach: > The proposed solution will be to request notifications from the last ID seen > again. This way we could bring current duplicates and apply them on Sentry. > We have the risk to miss duplicates that were committed much time later, but > we cannot trust on those duplicates as they will not know the order of the > time they were committed. > > The patch adds a new helper class called HiveNotificationFetcher that allow > us to bring new notifications and apply a filter to those notifications that > were already processed by Sentry. > > > Diffs > ----- > > > sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/UniquePathsUpdate.java > 7dae2f538602f4c264084791fb45bb6891a34941 > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java > 593b92f96b47f959266280bce3d0809f6a80c362 > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java > 35da6fc7908ec7498d1fd658d75b62028df35751 > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java > PRE-CREATION > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java > 4a8fb953ce486e1aeb1042884de56872b5539cd0 > > sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java > f56384a99e128b3e9880cbc5692107d61f2f500f > > sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java > PRE-CREATION > > sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestSentryHMSClient.java > 77a2bbb214e23ca146c2934ee4d3bf15e592906f > > > Diff: https://reviews.apache.org/r/61973/diff/1/ > > > Testing > ------- > > > Thanks, > > Sergio Pena > >
