> On Aug. 31, 2017, 12:25 a.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Line 247 (original), 251 (patched)
> > <https://reviews.apache.org/r/61973/diff/1/?file=1807070#file1807070line252>
> >
> >     Style - is it preferrable to use // style comments to simplify 
> > commenting out block of code with /* comments?

I moved the comment a few lines before to make a comment for the block 
(actually for the method).


> On Aug. 31, 2017, 12:25 a.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Line 254 (original), 258 (patched)
> > <https://reviews.apache.org/r/61973/diff/1/?file=1807070#file1807070line259>
> >
> >     Wording here. May be you can reword it here. E.g.
> >     
> >     HMS notifications may contain both gaps in the sequence and duplicates 
> > (the same ID repeated more then once for different events). Duplicates do 
> > not necessarily come one after another, so the sequence {1,2,3,2} is 
> > possible

I did not add the last sentence because the {1,2,3,2} sequence is not true for 
Sentry. Sentry will always get 1,2,2,3 because Hive orders the sequence even if 
the commit was later.


> On Aug. 31, 2017, 12:25 a.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Line 260 (original)
> > <https://reviews.apache.org/r/61973/diff/1/?file=1807070#file1807070line272>
> >
> >     So you decided not to log anything in this case?

Yeah. Initially we thought it was useful just as a warning to the user that 
something is not right. But we thought this was not going to happen often. Now, 
with HMS HA, I know this will happen too often and I'd like to avoid those 
warnings. In a quence 1,2,2,3, the expected number of notifications is 3, so 
this will warn the user something is wrong but it is not.


> On Aug. 31, 2017, 12:25 a.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 127 (patched)
> > <https://reviews.apache.org/r/61973/diff/1/?file=1807071#file1807071line127>
> >
> >     If I am reading this code correctly, the following will happen:
> >     
> >     1) Every time we request IDs starting from last seen ID, so normally 
> > the first ID in the sequence is always a duplicate
> >     2) This first ID in the sequence isn't in the cache yet, so we go to 
> > the sentryStore and fetch the hash and compare.
> >     
> >     This means that we always go to sentryStore for the frist element of 
> > the batch which means every time we call fetchNotifications().
> >     
> >     This is not very bad, but if we store the sha1 hash of the last event 
> > in the cache we don't have to go to sentryStore at all under normal 
> > circumstances

Keeping the cache of the last ID is not enough. The HiveNotificationFetcher 
does not know if the IDs fetched are processed, so the next call to 
fetchNotifications() may  be called with an ID that was not cached. If I want 
to cache all of them, then I will add more procesisng on getting the sha1() of 
all the notifications and adding them to the cache. Isn't simpler just to keep 
it the way it is now?


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61973/#review184215
-----------------------------------------------------------


On Aug. 29, 2017, 6:41 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61973/
> -----------------------------------------------------------
> 
> (Updated Aug. 29, 2017, 6:41 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov and Na Li.
> 
> 
> Bugs: sentry-1888
>     https://issues.apache.org/jira/browse/sentry-1888
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Approach:
> The proposed solution will be to request notifications from the last ID seen 
> again. This way we could bring current duplicates and apply them on Sentry. 
> We have the risk to miss duplicates that were committed much time later, but 
> we cannot trust on those duplicates as they will not know the order of the 
> time they were committed.
> 
> The patch adds a new helper class called HiveNotificationFetcher that allow 
> us to bring new notifications and apply a filter to those notifications that 
> were already processed by Sentry.
> 
> 
> Diffs
> -----
> 
>   
> sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/UniquePathsUpdate.java
>  7dae2f538602f4c264084791fb45bb6891a34941 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  593b92f96b47f959266280bce3d0809f6a80c362 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  35da6fc7908ec7498d1fd658d75b62028df35751 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
>  4a8fb953ce486e1aeb1042884de56872b5539cd0 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  f56384a99e128b3e9880cbc5692107d61f2f500f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestSentryHMSClient.java
>  77a2bbb214e23ca146c2934ee4d3bf15e592906f 
> 
> 
> Diff: https://reviews.apache.org/r/61973/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>

Reply via email to