Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-03-13 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/
---

(Updated March 13, 2018, 11:26 p.m.)


Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
Pena, and Vadim Spector.


Changes
---

rebased the patch


Bugs: SENTRY-2109
https://issues.apache.org/jira/browse/SENTRY-2109


Repository: sentry


Description
---

This patch does couple of things
1. Avoid triggering full snapshots when gaps are observed while fetching new 
notifications. While fetching new notifications HMSFollower would would fetch 
notifications with last event-id as well. When it gets the notifications and if 
it doesn't get the notifications with event-id, full snpshot is triggered.
2. Functinalty to address gaps and out-of-sequence notificaitons by re-fetching 
addtional notifications that were already fetched. This solution is not fool 
proof. It does a best effort to reduse the chance of loosing notifications by 
re-fetching the notifications.This approach will introduce an over head of 
fetching addtional notifications that were already fetched. Overhead of DB 
look-up is addressed by using a cache. This reduces additional DB lookups 
needed to check if the notification was already processed.
2. Added looging to report duplicate events and out of order events for debug 
purpose.
3. Added new e2e tests to verfy this behavior.


Diffs (updated)
-

  
sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
 PRE-CREATION 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
 929e6be 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
 4521ad4 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
 93cc34f 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
 bf5d85b 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
 7e02874 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
 61e3f06 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
 91c90f9 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
 83a1bec 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcherCache.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
 9b0aeb2 


Diff: https://reviews.apache.org/r/64955/diff/9/

Changes: https://reviews.apache.org/r/64955/diff/8-9/


Testing
---

Made sure that tests pass.


Thanks,

kalyan kumar kalvagadda



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-02-01 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review196651
---


Ship it!




Ship It!

- Vadim Spector


On Jan. 30, 2018, 1:08 a.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 30, 2018, 1:08 a.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
>  79030780c35e5bda432e3ec3f01328e627cb50a6 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  91c90f9d302f4feb3a8b3d06541f43541c87bf0f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
>  83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcherCache.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
>  4cd00e6672730773c74b9840247d1f4d5f7bdfe4 
> 
> 
> Diff: https://reviews.apache.org/r/64955/diff/8/
> 
> 
> Testing
> ---
> 
> Made sure that tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-29 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/
---

(Updated Jan. 30, 2018, 1:08 a.m.)


Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
Pena, and Vadim Spector.


Bugs: SENTRY-2109
https://issues.apache.org/jira/browse/SENTRY-2109


Repository: sentry


Description
---

This patch does couple of things
1. Avoid triggering full snapshots when gaps are observed while fetching new 
notifications. While fetching new notifications HMSFollower would would fetch 
notifications with last event-id as well. When it gets the notifications and if 
it doesn't get the notifications with event-id, full snpshot is triggered.
2. Functinalty to address gaps and out-of-sequence notificaitons by re-fetching 
addtional notifications that were already fetched. This solution is not fool 
proof. It does a best effort to reduse the chance of loosing notifications by 
re-fetching the notifications.This approach will introduce an over head of 
fetching addtional notifications that were already fetched. Overhead of DB 
look-up is addressed by using a cache. This reduces additional DB lookups 
needed to check if the notification was already processed.
2. Added looging to report duplicate events and out of order events for debug 
purpose.
3. Added new e2e tests to verfy this behavior.


Diffs (updated)
-

  
sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
 PRE-CREATION 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
 2f2b98412e7dfdcc847ffe7975a70f452554e747 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
 edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
 93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
 96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
 7e02874b4be6a7109108809b1f404fe971b7b8e2 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
 79030780c35e5bda432e3ec3f01328e627cb50a6 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
 91c90f9d302f4feb3a8b3d06541f43541c87bf0f 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
 b4100278392986c161625a366212c6fef66ec0a9 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
 83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcherCache.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
 4cd00e6672730773c74b9840247d1f4d5f7bdfe4 


Diff: https://reviews.apache.org/r/64955/diff/7/

Changes: https://reviews.apache.org/r/64955/diff/6-7/


Testing
---

Made sure that tests pass.


Thanks,

kalyan kumar kalvagadda



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-29 Thread kalyan kumar kalvagadda via Review Board


> On Jan. 22, 2018, 8:27 p.m., Na Li wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 358 (patched)
> > 
> >
> > since we get max notification ID in next round, this event happens for 
> > every run. Can you change the logging level to debug instead of info?

That will not happen. Filter that we use in the NotificationFetcher would 
filter out those which are already processed.


> On Jan. 22, 2018, 8:27 p.m., Na Li wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 231 (patched)
> > 
> >
> > does this happen for every received notification event? If so, each 
> > will cause a DB lookup. The overhead is high.
> > 
> > Since we know the max notification ID, we can only call 
> > sentryStore.isNotificationIdProcessed(id) if the id <= max notification ID.

This will happen only for out-of-order notification events. If there were no 
out-of-order notification events and gaps, the first notification would be the 
maxnotification processed by sentry.
I'm adding a seperate cache for Id's as well.


> On Jan. 22, 2018, 8:27 p.m., Na Li wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
> > Lines 644 (patched)
> > 
> >
> > do you need this function for testing?
> > 
> > SentryStore has this public function
> > /**
> >* @return ID of the path snapshot
> >*/
> >   public Gauge getLastPathsSnapshotIdGauge() {
> > return new Gauge() {
> >   @Override
> >   public Long getValue() {
> > try {
> >   return getCurrentAuthzPathsSnapshotID();
> > } catch (Exception e) {
> >   LOGGER.error("Can not read current paths snapshot ID", e);
> >   return NOTIFICATION_UNKNOWN;
> > }
> >   }
> > };
> >   }

I need API at sentry service level for testing puposes.


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195928
---


On Jan. 30, 2018, 1:08 a.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 30, 2018, 1:08 a.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
>   
> sentry-provider/sentry-provider-db/src/m

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-29 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/
---

(Updated Jan. 29, 2018, 3:47 p.m.)


Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
Pena, and Vadim Spector.


Changes
---

Added more tests to verify the cache and addressed review comments.


Bugs: SENTRY-2109
https://issues.apache.org/jira/browse/SENTRY-2109


Repository: sentry


Description
---

This patch does couple of things
1. Avoid triggering full snapshots when gaps are observed while fetching new 
notifications. While fetching new notifications HMSFollower would would fetch 
notifications with last event-id as well. When it gets the notifications and if 
it doesn't get the notifications with event-id, full snpshot is triggered.
2. Functinalty to address gaps and out-of-sequence notificaitons by re-fetching 
addtional notifications that were already fetched. This solution is not fool 
proof. It does a best effort to reduse the chance of loosing notifications by 
re-fetching the notifications.This approach will introduce an over head of 
fetching addtional notifications that were already fetched. Overhead of DB 
look-up is addressed by using a cache. This reduces additional DB lookups 
needed to check if the notification was already processed.
2. Added looging to report duplicate events and out of order events for debug 
purpose.
3. Added new e2e tests to verfy this behavior.


Diffs (updated)
-

  
sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
 PRE-CREATION 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
 2f2b984 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
 edea5b6 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
 93cc34f 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
 96c6810 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
 7e02874 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
 7903078 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
 91c90f9 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
 b410027 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
 83a1bec 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcherCache.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
 4cd00e6 


Diff: https://reviews.apache.org/r/64955/diff/6/

Changes: https://reviews.apache.org/r/64955/diff/5-6/


Testing
---

Made sure that tests pass.


Thanks,

kalyan kumar kalvagadda



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-26 Thread kalyan kumar kalvagadda via Review Board


> On Jan. 25, 2018, 11:42 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Line 235 (original), 270 (patched)
> > 
> >
> > It seems better to produce two dufferent logs, e.g.
> > 
> > if(isFoundInCache(notificationEvent.getEventId(), hash) == true) {
> >   LOGGER.debug("Ignoring HMS notification already processed: ID = {}", 
> > notificationEvent.getEventId());
> >   return false;
> > } else if (sentryStore.isNotificationProcessed(hash)
> > LOGGER.debug("Ignoring HMS notification already processed: ID = {} - 
> > cache miss", notificationEvent.getEventId());
> >   return false;
> > }
> > 
> > Note "cache miss" in the second log.
> > 
> > Cache miss is what we try to avoid for better performance. It may 
> > happen sometimes, but if there are too many "cache miss" messages, it's 
> > useful to know.
> 
> Vadim Spector wrote:
> As discussed, also suggesting adding unit test to make sure if incoming 
> IDs have no out-of-order or duplicate elements (though can still have gaps), 
> then in theory there should be no cache misses in LRUMap, i.e. 
> sentryStore.isNotificationProcessed(hash) is not invoked. It's to insure that 
> if HMS folks fix the problems on their side with the IDs, this code will 
> perform no unnecessary DB lookups.

My next patch will be addresing that.


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review196297
---


On Jan. 25, 2018, 5 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 25, 2018, 5 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
>  79030780c35e5bda432e3ec3f01328e627cb50a6 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  91c90f9d302f4feb3a8b3d06541f43541c87bf0f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFet

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-26 Thread Vadim Spector via Review Board


> On Jan. 25, 2018, 11:42 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Line 235 (original), 270 (patched)
> > 
> >
> > It seems better to produce two dufferent logs, e.g.
> > 
> > if(isFoundInCache(notificationEvent.getEventId(), hash) == true) {
> >   LOGGER.debug("Ignoring HMS notification already processed: ID = {}", 
> > notificationEvent.getEventId());
> >   return false;
> > } else if (sentryStore.isNotificationProcessed(hash)
> > LOGGER.debug("Ignoring HMS notification already processed: ID = {} - 
> > cache miss", notificationEvent.getEventId());
> >   return false;
> > }
> > 
> > Note "cache miss" in the second log.
> > 
> > Cache miss is what we try to avoid for better performance. It may 
> > happen sometimes, but if there are too many "cache miss" messages, it's 
> > useful to know.

As discussed, also suggesting adding unit test to make sure if incoming IDs 
have no out-of-order or duplicate elements (though can still have gaps), then 
in theory there should be no cache misses in LRUMap, i.e. 
sentryStore.isNotificationProcessed(hash) is not invoked. It's to insure that 
if HMS folks fix the problems on their side with the IDs, this code will 
perform no unnecessary DB lookups.


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review196297
---


On Jan. 25, 2018, 5 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 25, 2018, 5 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
>  79030780c35e5bda432e3ec3f01328e627cb50a6 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  91c90f9d302f4feb3a8b3d06541f43541c87bf0f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
>  83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
>   
> sentry-tests/sentry-tests

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-25 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review196297
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 235 (original), 270 (patched)


It seems better to produce two dufferent logs, e.g.

if(isFoundInCache(notificationEvent.getEventId(), hash) == true) {
  LOGGER.debug("Ignoring HMS notification already processed: ID = {}", 
notificationEvent.getEventId());
  return false;
} else if (sentryStore.isNotificationProcessed(hash)
LOGGER.debug("Ignoring HMS notification already processed: ID = {} - cache 
miss", notificationEvent.getEventId());
  return false;
}

Note "cache miss" in the second log.

Cache miss is what we try to avoid for better performance. It may happen 
sometimes, but if there are too many "cache miss" messages, it's useful to know.


- Vadim Spector


On Jan. 25, 2018, 5 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 25, 2018, 5 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
>  79030780c35e5bda432e3ec3f01328e627cb50a6 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  91c90f9d302f4feb3a8b3d06541f43541c87bf0f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
>  83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
>  4cd00e6672730773c74b984

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-25 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review196289
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 119 (patched)


Stale log



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 125 (patched)


Stale log



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 140 (patched)


Stale log


- kalyan kumar kalvagadda


On Jan. 25, 2018, 5 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 25, 2018, 5 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
>  79030780c35e5bda432e3ec3f01328e627cb50a6 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  91c90f9d302f4feb3a8b3d06541f43541c87bf0f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
>  83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
>  4cd00e6672730773c74b9840247d1f4d5f7bdfe4 
> 
> 
> Diff: https://reviews.apache.org/r/64955/diff/5/
> 
> 
> Testing
> ---
> 
> Made sure that tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-25 Thread kalyan kumar kalvagadda via Review Board


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 231 (patched)
> > 
> >
> > optimization - no need for this DB call if the ID > max(storedId)
> 
> kalyan kumar kalvagadda wrote:
> will consider it in the new patch.

I have added simple cache logic to avoid Db lookups.


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195917
---


On Jan. 25, 2018, 5 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 25, 2018, 5 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
>  79030780c35e5bda432e3ec3f01328e627cb50a6 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  91c90f9d302f4feb3a8b3d06541f43541c87bf0f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
>  83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
>  4cd00e6672730773c74b9840247d1f4d5f7bdfe4 
> 
> 
> Diff: https://reviews.apache.org/r/64955/diff/5/
> 
> 
> Testing
> ---
> 
> Made sure that tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-25 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/
---

(Updated Jan. 25, 2018, 5 p.m.)


Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
Pena, and Vadim Spector.


Changes
---

Details of the code changes
1. Addressed review comments from vadim and lina
2. Changed the cache logic
3. rebased the patch


Bugs: SENTRY-2109
https://issues.apache.org/jira/browse/SENTRY-2109


Repository: sentry


Description
---

This patch does couple of things
1. Avoid triggering full snapshots when gaps are observed while fetching new 
notifications. While fetching new notifications HMSFollower would would fetch 
notifications with last event-id as well. When it gets the notifications and if 
it doesn't get the notifications with event-id, full snpshot is triggered.
2. Functinalty to address gaps and out-of-sequence notificaitons by re-fetching 
addtional notifications that were already fetched. This solution is not fool 
proof. It does a best effort to reduse the chance of loosing notifications by 
re-fetching the notifications.This approach will introduce an over head of 
fetching addtional notifications that were already fetched. Overhead of DB 
look-up is addressed by using a cache. This reduces additional DB lookups 
needed to check if the notification was already processed.
2. Added looging to report duplicate events and out of order events for debug 
purpose.
3. Added new e2e tests to verfy this behavior.


Diffs (updated)
-

  
sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
 PRE-CREATION 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
 2f2b98412e7dfdcc847ffe7975a70f452554e747 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
 edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
 93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
 96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
 7e02874b4be6a7109108809b1f404fe971b7b8e2 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
 79030780c35e5bda432e3ec3f01328e627cb50a6 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
 91c90f9d302f4feb3a8b3d06541f43541c87bf0f 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
 b4100278392986c161625a366212c6fef66ec0a9 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
 83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
 4cd00e6672730773c74b9840247d1f4d5f7bdfe4 


Diff: https://reviews.apache.org/r/64955/diff/5/

Changes: https://reviews.apache.org/r/64955/diff/4-5/


Testing
---

Made sure that tests pass.


Thanks,

kalyan kumar kalvagadda



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-23 Thread Vadim Spector via Review Board


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 67 (patched)
> > 
> >
> > I wonder about the choice of LRUMap for cache.
> > 
> > The ideal data structure would one only store the caches for the events 
> > in the ID range that we want, which is [MAX(event_id) - refetch_count, 
> > MAX(event_id)].
> > 
> > LRUMap does not explicitly address this requirement. It assumes that 
> > the oldest (least accessed) caches will tend to be for the events with the 
> > oldest IDs. We know this is often not true, which is one of the problems 
> > this JIRA tries to work around.
> > 
> > Given that when we do not find the entry in cache we check SentryStore 
> > anyway, everything will still work. But we may end up with a few extra DB 
> > calls to SentryStore and it's hard to predict the overhead.
> > 
> > The cache structure solving this shortcoming could be 
> > HashSet[notificationFetchCount +1] array.
> > 
> > It's the circular buffer structure where the cache[ID % cache.length] 
> > element contains the HashSet of all caches for the events with the given 
> > event ID. We can see just as fast if the event has been processed.
> > 
> > Before each fetch we can calculate which elements in this cache array 
> > correspond to the IDs outside the current fetching range and nullify them 
> > (for curcular buffer some care is needed for modulo arithmetic).
> > 
> > This way we know that
> > a) cache stores the caches of the events only in the range we care 
> > about, though now cleanup is explicit.
> > b) we do not necessarily need to put the upper limit on the number of 
> > events with the given ID (the sizes of HashSet's).
> > 
> > It may save some expensive DB calls.
> > 
> > However, I do not feel strongly one way or the other until we test it 
> > and see the overhead. Perhaps the fudge factor x3 in LRUMap size is enough 
> > to keep enough entries, old or new; so I'm leaving it up to you.
> 
> Na Li wrote:
> Vadim, I prefer to use circular buffer as well. Mentioned this to Kalyan 
> before. The memory usage and execution overhead should be much smaller than 
> LRUMap.
> 
> Vadim Spector wrote:
> I like that if the entry is not found in the cache, there is no need to 
> go to the database to double-check. Can't beat that.
> 
> kalyan kumar kalvagadda wrote:
> Help we understand your comment.
> "LRUMap does not explicitly address this requirement. It assumes that the 
> oldest (least accessed) caches will tend to be for the events with the oldest 
> IDs. We know this is often not true, which is one of the problems this JIRA 
> tries to work around."
> 
> Can you give me one example where you think it would not be found in the 
> cache?

For simplicity of the example, suppose we look back by 2 IDs. It means that we 
fetch with starting number (MaxID - 2) to include 3 IDs: (MaxID-2, MaxID-1, 
MaxID). Therefore, the LRUMap size is 3 * 3 = 9.
Suppose, starting point is LRUMap is empty (just to keep the explanation 
simple), then we perform several fetches, in the exact order specified (first 
listed ID arrives first):

fetch #1 (start_id = 0): 3, 3, 3, 2, 2, 2, 1, 1, 1
 Out of order, can happen, the entries for ID = 3 are the oldest ones. 
LRUMap reaches max size 9. maxID = 3.

fetch #2 (start_id = 3 - 2 = 1):  3, 3, 3, 2, 2, 2, 1, 1, 1 (re-fetched from 
fetch #1), 3, 4 (new events)
 The LRUMap automatically purges the first 2 entries with the ID = 3 to 
accomodate new events 3 and 4. maxID = 4.

fetch #3 (start_id = 4 - 2 = 2): 3, 3, 3, 2, 2, 2, (refetched from fetch #1, 
excluding ID = 1, as it is < 2), 3, 4 (refetched from fetch #2), 5 (new)

Let's stop here. In fetch #3 we still get the first two events with ID = 3, 
which were originally in fetch #1 and fetch #2. However, they are no longer in 
LRUMap, purged after fetch #2, so we end up with DB lookup.

I just provided the simplest example. It is probably extreme that the event 
with the biggest ID in the fetch arrives first in the fetch, as in fetch #1. 
However, it can arrive somewhere in the middle of the fetch too, and some time 
later LRU will still purge it ahead of the events with the smaller ID that 
should've been purged first.

As I was working through this example, I realized that things could be better 
if after each fetch the events were first sorted by their IDs before adding to 
LRUMap. I do not think it will entirely solve the problem of prematurely purged 
entries. The example where all IDs get sorted prior to putting them into LRUMap:

fetch #1 (start_id = 0): 1, 1, 1, 2, 2, 2, 3, 3, 3
 All in order. MaxID = 3

fetch #2 (start_id = 3 - 2 = 1): 1, 1, 1, 2, 2, 2, 3, 3, 3 (from fetch #1), 1, 
1, 1, 1, 2, 2, 2, (new), 

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-23 Thread kalyan kumar kalvagadda via Review Board


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Line 60 (original), 61 (patched)
> > 
> >
> > This using probably unintentional use of Bool instead of primitive 
> > boolean. There seems to be no reason for using autoboxed types.

it was un-intentional. I will update it.


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 70 (patched)
> > 
> >
> > So, the assumption is that the average number of duplicates per each ID 
> > cannot exceed 3, correct?

Re-tried notifications will be found in the cache even if all of them have 3 
duplicates.


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 67 (patched)
> > 
> >
> > I wonder about the choice of LRUMap for cache.
> > 
> > The ideal data structure would one only store the caches for the events 
> > in the ID range that we want, which is [MAX(event_id) - refetch_count, 
> > MAX(event_id)].
> > 
> > LRUMap does not explicitly address this requirement. It assumes that 
> > the oldest (least accessed) caches will tend to be for the events with the 
> > oldest IDs. We know this is often not true, which is one of the problems 
> > this JIRA tries to work around.
> > 
> > Given that when we do not find the entry in cache we check SentryStore 
> > anyway, everything will still work. But we may end up with a few extra DB 
> > calls to SentryStore and it's hard to predict the overhead.
> > 
> > The cache structure solving this shortcoming could be 
> > HashSet[notificationFetchCount +1] array.
> > 
> > It's the circular buffer structure where the cache[ID % cache.length] 
> > element contains the HashSet of all caches for the events with the given 
> > event ID. We can see just as fast if the event has been processed.
> > 
> > Before each fetch we can calculate which elements in this cache array 
> > correspond to the IDs outside the current fetching range and nullify them 
> > (for curcular buffer some care is needed for modulo arithmetic).
> > 
> > This way we know that
> > a) cache stores the caches of the events only in the range we care 
> > about, though now cleanup is explicit.
> > b) we do not necessarily need to put the upper limit on the number of 
> > events with the given ID (the sizes of HashSet's).
> > 
> > It may save some expensive DB calls.
> > 
> > However, I do not feel strongly one way or the other until we test it 
> > and see the overhead. Perhaps the fudge factor x3 in LRUMap size is enough 
> > to keep enough entries, old or new; so I'm leaving it up to you.
> 
> Na Li wrote:
> Vadim, I prefer to use circular buffer as well. Mentioned this to Kalyan 
> before. The memory usage and execution overhead should be much smaller than 
> LRUMap.
> 
> Vadim Spector wrote:
> I like that if the entry is not found in the cache, there is no need to 
> go to the database to double-check. Can't beat that.

Help we understand your comment.
"LRUMap does not explicitly address this requirement. It assumes that the 
oldest (least accessed) caches will tend to be for the events with the oldest 
IDs. We know this is often not true, which is one of the problems this JIRA 
tries to work around."

Can you give me one example where you think it would not be found in the cache?


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 119 (patched)
> > 
> >
> > any reason for returning Boolean vs bool?

it was un-intentional. I will update it.


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 231 (patched)
> > 
> >
> > optimization - no need for this DB call if the ID > max(storedId)

will consider it in the new patch.


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195917
---


On Jan. 22, 2018, 5:40 p.m., kalyan kumar kalvagadda wrote:
> 
> 

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-22 Thread Vadim Spector via Review Board


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 67 (patched)
> > 
> >
> > I wonder about the choice of LRUMap for cache.
> > 
> > The ideal data structure would one only store the caches for the events 
> > in the ID range that we want, which is [MAX(event_id) - refetch_count, 
> > MAX(event_id)].
> > 
> > LRUMap does not explicitly address this requirement. It assumes that 
> > the oldest (least accessed) caches will tend to be for the events with the 
> > oldest IDs. We know this is often not true, which is one of the problems 
> > this JIRA tries to work around.
> > 
> > Given that when we do not find the entry in cache we check SentryStore 
> > anyway, everything will still work. But we may end up with a few extra DB 
> > calls to SentryStore and it's hard to predict the overhead.
> > 
> > The cache structure solving this shortcoming could be 
> > HashSet[notificationFetchCount +1] array.
> > 
> > It's the circular buffer structure where the cache[ID % cache.length] 
> > element contains the HashSet of all caches for the events with the given 
> > event ID. We can see just as fast if the event has been processed.
> > 
> > Before each fetch we can calculate which elements in this cache array 
> > correspond to the IDs outside the current fetching range and nullify them 
> > (for curcular buffer some care is needed for modulo arithmetic).
> > 
> > This way we know that
> > a) cache stores the caches of the events only in the range we care 
> > about, though now cleanup is explicit.
> > b) we do not necessarily need to put the upper limit on the number of 
> > events with the given ID (the sizes of HashSet's).
> > 
> > It may save some expensive DB calls.
> > 
> > However, I do not feel strongly one way or the other until we test it 
> > and see the overhead. Perhaps the fudge factor x3 in LRUMap size is enough 
> > to keep enough entries, old or new; so I'm leaving it up to you.
> 
> Na Li wrote:
> Vadim, I prefer to use circular buffer as well. Mentioned this to Kalyan 
> before. The memory usage and execution overhead should be much smaller than 
> LRUMap.

I like that if the entry is not found in the cache, there is no need to go to 
the database to double-check. Can't beat that.


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195917
---


On Jan. 22, 2018, 5:40 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 22, 2018, 5:40 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62912e92ece7f8d

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-22 Thread Na Li via Review Board


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 67 (patched)
> > 
> >
> > I wonder about the choice of LRUMap for cache.
> > 
> > The ideal data structure would one only store the caches for the events 
> > in the ID range that we want, which is [MAX(event_id) - refetch_count, 
> > MAX(event_id)].
> > 
> > LRUMap does not explicitly address this requirement. It assumes that 
> > the oldest (least accessed) caches will tend to be for the events with the 
> > oldest IDs. We know this is often not true, which is one of the problems 
> > this JIRA tries to work around.
> > 
> > Given that when we do not find the entry in cache we check SentryStore 
> > anyway, everything will still work. But we may end up with a few extra DB 
> > calls to SentryStore and it's hard to predict the overhead.
> > 
> > The cache structure solving this shortcoming could be 
> > HashSet[notificationFetchCount +1] array.
> > 
> > It's the circular buffer structure where the cache[ID % cache.length] 
> > element contains the HashSet of all caches for the events with the given 
> > event ID. We can see just as fast if the event has been processed.
> > 
> > Before each fetch we can calculate which elements in this cache array 
> > correspond to the IDs outside the current fetching range and nullify them 
> > (for curcular buffer some care is needed for modulo arithmetic).
> > 
> > This way we know that
> > a) cache stores the caches of the events only in the range we care 
> > about, though now cleanup is explicit.
> > b) we do not necessarily need to put the upper limit on the number of 
> > events with the given ID (the sizes of HashSet's).
> > 
> > It may save some expensive DB calls.
> > 
> > However, I do not feel strongly one way or the other until we test it 
> > and see the overhead. Perhaps the fudge factor x3 in LRUMap size is enough 
> > to keep enough entries, old or new; so I'm leaving it up to you.

Vadim, I prefer to use circular buffer as well. Mentioned this to Kalyan 
before. The memory usage and execution overhead should be much smaller than 
LRUMap.


- Na


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195917
---


On Jan. 22, 2018, 5:40 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 22, 2018, 5:40 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62912e92ece7f8da0ac0fccb124579a88f2 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  43535a7b50fea51049f31428

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-22 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195917
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 60 (original), 61 (patched)


This using probably unintentional use of Bool instead of primitive boolean. 
There seems to be no reason for using autoboxed types.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 70 (patched)


So, the assumption is that the average number of duplicates per each ID 
cannot exceed 3, correct?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 67 (patched)


I wonder about the choice of LRUMap for cache.

The ideal data structure would one only store the caches for the events in 
the ID range that we want, which is [MAX(event_id) - refetch_count, 
MAX(event_id)].

LRUMap does not explicitly address this requirement. It assumes that the 
oldest (least accessed) caches will tend to be for the events with the oldest 
IDs. We know this is often not true, which is one of the problems this JIRA 
tries to work around.

Given that when we do not find the entry in cache we check SentryStore 
anyway, everything will still work. But we may end up with a few extra DB calls 
to SentryStore and it's hard to predict the overhead.

The cache structure solving this shortcoming could be 
HashSet[notificationFetchCount +1] array.

It's the circular buffer structure where the cache[ID % cache.length] 
element contains the HashSet of all caches for the events with the given event 
ID. We can see just as fast if the event has been processed.

Before each fetch we can calculate which elements in this cache array 
correspond to the IDs outside the current fetching range and nullify them (for 
curcular buffer some care is needed for modulo arithmetic).

This way we know that
a) cache stores the caches of the events only in the range we care about, 
though now cleanup is explicit.
b) we do not necessarily need to put the upper limit on the number of 
events with the given ID (the sizes of HashSet's).

It may save some expensive DB calls.

However, I do not feel strongly one way or the other until we test it and 
see the overhead. Perhaps the fudge factor x3 in LRUMap size is enough to keep 
enough entries, old or new; so I'm leaving it up to you.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 119 (patched)


any reason for returning Boolean vs bool?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 231 (patched)


optimization - no need for this DB call if the ID > max(storedId)


- Vadim Spector


On Jan. 22, 2018, 5:40 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 22, 2018, 5:40 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-c

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-22 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195928
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 186 (original), 187 (patched)


Can you add comment for input notificationId?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 188 (original), 189 (patched)


can you name the input as "maxNotificationId" to high light it is the max 
value of processed notification ID? It is easier to read the code this way.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 358 (patched)


since we get max notification ID in next round, this event happens for 
every run. Can you change the logging level to debug instead of info?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 109 (patched)


If we only need to store the hash value, not a (key, value) pair, can we 
use HashSet as collection? That will have smaller memory usage.

or use LRUMap with value as boolean, not string.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 231 (patched)


does this happen for every received notification event? If so, each will 
cause a DB lookup. The overhead is high.

Since we know the max notification ID, we can only call 
sentryStore.isNotificationIdProcessed(id) if the id <= max notification ID.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
Lines 644 (patched)


do you need this function for testing?

SentryStore has this public function
/**
   * @return ID of the path snapshot
   */
  public Gauge getLastPathsSnapshotIdGauge() {
return new Gauge() {
  @Override
  public Long getValue() {
try {
  return getCurrentAuthzPathsSnapshotID();
} catch (Exception e) {
  LOGGER.error("Can not read current paths snapshot ID", e);
  return NOTIFICATION_UNKNOWN;
}
  }
};
  }


- Na Li


On Jan. 22, 2018, 5:40 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 22, 2018, 5:40 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62912e92ece7f8da0ac0fccb124579a88f2 
>   
> sentry-provider/se

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-22 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/
---

(Updated Jan. 22, 2018, 5:40 p.m.)


Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
Pena, and Vadim Spector.


Changes
---

Addressed comments from vadim


Bugs: SENTRY-2109
https://issues.apache.org/jira/browse/SENTRY-2109


Repository: sentry


Description
---

This patch does couple of things
1. Avoid triggering full snapshots when gaps are observed while fetching new 
notifications. While fetching new notifications HMSFollower would would fetch 
notifications with last event-id as well. When it gets the notifications and if 
it doesn't get the notifications with event-id, full snpshot is triggered.
2. Functinalty to address gaps and out-of-sequence notificaitons by re-fetching 
addtional notifications that were already fetched. This solution is not fool 
proof. It does a best effort to reduse the chance of loosing notifications by 
re-fetching the notifications.This approach will introduce an over head of 
fetching addtional notifications that were already fetched. Overhead of DB 
look-up is addressed by using a cache. This reduces additional DB lookups 
needed to check if the notification was already processed.
2. Added looging to report duplicate events and out of order events for debug 
purpose.
3. Added new e2e tests to verfy this behavior.


Diffs (updated)
-

  
sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
 PRE-CREATION 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
 6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
 aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
 097aa62912e92ece7f8da0ac0fccb124579a88f2 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
 43535a7b50fea51049f3142837736e6a99a3a80f 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
 7e02874b4be6a7109108809b1f404fe971b7b8e2 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
 501898bca261db2daf937a8d803d12a59616192b 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
 b4100278392986c161625a366212c6fef66ec0a9 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
 edde886a7646539499149f2d86758979436567bd 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
 83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
 4cd00e6672730773c74b9840247d1f4d5f7bdfe4 


Diff: https://reviews.apache.org/r/64955/diff/4/

Changes: https://reviews.apache.org/r/64955/diff/3-4/


Testing
---

Made sure that tests pass.


Thanks,

kalyan kumar kalvagadda



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-19 Thread kalyan kumar kalvagadda via Review Board


> On Jan. 19, 2018, 1:06 a.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 205 (patched)
> > 
> >
> > this section of code suggests that foundLastProcessedNotification can 
> > only change from false to true. Can it change from true to false? If yes, 
> > where can it happen? If not, why? Please, add comment on that.

I have a comment explaining it at the delclaration of 
foundLastProcessedNotification. I will elaborate it furthur.


> On Jan. 19, 2018, 1:06 a.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Line 206 (original)
> > 
> >
> > why did you delete the cache.clear() line?

Previously cache was just a set that was used to address duplicate 
notifications.

Now it is a LRUMap to server a different pupose. It cleans-up it self when it 
is full.


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195758
---


On Jan. 17, 2018, 11:56 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 17, 2018, 11:56 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62912e92ece7f8da0ac0fccb124579a88f2 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  43535a7b50fea51049f3142837736e6a99a3a80f 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  501898bca261db2daf937a8d803d12a59616192b 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7646539499149f2d86758979436567bd 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
>  83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-18 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195758
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 66 (patched)


why not "private final LRUMap cache;"? The code never resets the reference.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 75 (patched)


Please, add javadoc here: what does x3 factor mean? What's the assumption 
here?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 112 (patched)


Nit: just "return cache.get(hash) != null;"



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 89 (original), 147 (patched)


You create filter for the minFetchId which you immediately update on the 
next line. One-line comment on what's going on here would not hurt.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 92 (original), 154 (patched)


Let's log both lastEventId and minFetchEventId. So we'd be clear on what's 
the biggest ID we received and what's the starting ID we are fetching now. 
Something like:

LOGGER.debug("Requesting HMS notifications since ID = {}, last ID {}", 
minFetchId, lastEventId);

This way when we try to make sense out of the logs, we don't need to do 
math each time.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 96 (original), 158 (patched)


It's a brand-new logic here, critical for valdiating this whole new design. 
Can you add javadoc summarizing: why does HiveNotificationFetcher decide here 
that it is out of sync with HMS?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 205 (patched)


this section of code suggests that foundLastProcessedNotification can only 
change from false to true. Can it change from true to false? If yes, where can 
it happen? If not, why? Please, add comment on that.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 206 (original)


why did you delete the cache.clear() line?


- Vadim Spector


On Jan. 17, 2018, 11:56 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 17, 2018, 11:56 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/s

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-17 Thread kalyan kumar kalvagadda via Review Board


> On Jan. 8, 2018, 4:35 p.m., Arjun Mishra wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 357 (patched)
> > 
> >
> > Can we make this WARN?

Unless, it is creating an issue I don't see a reason to log in WARNING.

WARNING level is used to WARN the admin looking at it so that something is 
wrong. When we know that this is expected with the current behaviour of HMS we 
don't have to log it as warning.


> On Jan. 8, 2018, 4:35 p.m., Arjun Mishra wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 359 (patched)
> > 
> >
> > Can we make this WARN?

Same as above.


> On Jan. 8, 2018, 4:35 p.m., Arjun Mishra wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 98 (patched)
> > 
> >
> > Should foundLastProcessedNotification=false be initialized before the 
> > if block

Nope. Do you have a reason why it should be initialized before if block?


> On Jan. 8, 2018, 4:35 p.m., Arjun Mishra wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 109 (patched)
> > 
> >
> > The log message should say "Event-Id - 1 of the last event processed by 
> > sentry..."

lastEventId is the max event ID processed my sentry it is not Event-Id - 1.


> On Jan. 8, 2018, 4:35 p.m., Arjun Mishra wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Line 137 (original), 164 (patched)
> > 
> >
> > Why do we add hash, if it already contains it?

This comment is relevent for the new patch submitted as this code changed.


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review194953
---


On Jan. 17, 2018, 11:56 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 17, 2018, 11:56 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62912e92ece7f8da0ac0fccb124579a88f2 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  43535a7b50fea51049f3142837736e6a99a3a80f 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  501898bca261db2daf937a8d803d12a59616192b 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7646539499149f2d86758979436567bd 
>   
> sentry-pro

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-17 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/
---

(Updated Jan. 17, 2018, 11:56 p.m.)


Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
Pena, and Vadim Spector.


Changes
---

New patch has functinalty to address gaps and ou-of-sequence notificaitons. 
This solution is not fool proof. It does a best effort to reduse the chance of 
loosing notifications by re-fetching the notifications.

This approach will introduce an over head of fetching addtional notifications 
will were already fetched. Overhead of DB look-up is addressed by using a 
cache. This reduces additional DB lookups needed to check if the notification 
was already processed.


Bugs: SENTRY-2109
https://issues.apache.org/jira/browse/SENTRY-2109


Repository: sentry


Description
---

This patch does couple of things
1. Avoid triggering full snapshots when gaps are observed while fetching new 
notifications. While fetching new notifications HMSFollower would would fetch 
notifications with last event-id as well. When it gets the notifications and if 
it doesn't get the notifications with event-id, full snpshot is triggered.
2. Added looging to report duplicate events and out of order events for debug 
purpose.
3. Added new e2e tests to verfy this behavior.


Diffs (updated)
-

  
sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
 PRE-CREATION 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
 6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
 aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
 097aa62912e92ece7f8da0ac0fccb124579a88f2 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
 43535a7b50fea51049f3142837736e6a99a3a80f 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
 7e02874b4be6a7109108809b1f404fe971b7b8e2 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
 501898bca261db2daf937a8d803d12a59616192b 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
 b4100278392986c161625a366212c6fef66ec0a9 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
 edde886a7646539499149f2d86758979436567bd 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
 83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
 4cd00e6672730773c74b9840247d1f4d5f7bdfe4 


Diff: https://reviews.apache.org/r/64955/diff/3/

Changes: https://reviews.apache.org/r/64955/diff/2-3/


Testing
---

Made sure that tests pass. There are three test failures which need code change 
done for SENTRY-2113.


Thanks,

kalyan kumar kalvagadda



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-08 Thread kalyan kumar kalvagadda via Review Board


> On Jan. 4, 2018, 10:40 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Line 205 (original), 206 (patched)
> > 
> >
> > I thought we've agreed that at this point we shall pass to 
> > notificationFetcher.fetchNotification() notificationId-delta, where delta 
> > is configurable and probably set to some significant number, like 10. 
> > Otherwise we'll be loosing lots of older notifications with smaller IDs 
> > that got persisted since the previous fetch, due to the HMS issue.
> 
> kalyan kumar kalvagadda wrote:
> we can handle that as seperate issue. we need this change to avaoid 
> notificated getting missed. Scope of this patch is just to avoid full 
> snapshots un-necessarily.
> 
> Vadim Spector wrote:
> Hmm, I'd like other peoples' take on this. These two issues are not 
> independent imo. Avoiding full snapshots is good, but full snapshots, even if 
> done for a wrong reason, would offset for missing updates, at least to some 
> extent. Now we may start having quite a few of them. 
> 
> It's a few line fix, really, just define some "delta" config parameter, 
> set its default to some value, could be fairly small, like 2, and pass 
> "notificationId-delta" instead; I'd rather just take care of it now. Based on 
> what we know of HMS ID generation problem, I'd feel really uncomfortable 
> releasing one fix w/o another.

Sounds reasonable. Will update the patch with this behavior.


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review194788
---


On Jan. 5, 2018, 12:10 a.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 5, 2018, 12:10 a.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62912e92ece7f8da0ac0fccb124579a88f2 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  43535a7b50fea51049f3142837736e6a99a3a80f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  501898bca261db2daf937a8d803d12a59616192b 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7646539499149f2d86758979436567bd 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
>  c8eef09ed61e11583838a3dfd3117a9478df23ba 
> 
> 
> Diff: https://reviews.apache.org/r/64955/diff/2/
> 
> 
> Testing
> ---
> 
> Made sure that tests pass. There are three test failures which ne

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-08 Thread Arjun Mishra via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review194953
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 357 (patched)


Can we make this WARN?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 359 (patched)


Can we make this WARN?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 83 (patched)


Nit: have method declaration in one line with return type



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 98 (patched)


Should foundLastProcessedNotification=false be initialized before the if 
block



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 109 (patched)


The log message should say "Event-Id - 1 of the last event processed by 
sentry..."



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 137 (original), 164 (patched)


Why do we add hash, if it already contains it?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 144 (original), 171 (patched)


Can we replace e.getMessage() with simply e?


- Arjun Mishra


On Jan. 5, 2018, 12:10 a.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 5, 2018, 12:10 a.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62912e92ece7f8da0ac0fccb124579a88f2 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  43535a7b50fea51049f3142837736e6a99a3a80f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  501898bca261db2daf937a8d803d12a59616192b 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7646539499149f2d86758979436567bd 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
>  c8eef09ed61e11583838a3dfd3117a9478df23ba 
> 
> 
> Diff: https://reviews.apache.org/r/6

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-04 Thread Vadim Spector via Review Board


> On Jan. 4, 2018, 10:40 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Line 205 (original), 206 (patched)
> > 
> >
> > I thought we've agreed that at this point we shall pass to 
> > notificationFetcher.fetchNotification() notificationId-delta, where delta 
> > is configurable and probably set to some significant number, like 10. 
> > Otherwise we'll be loosing lots of older notifications with smaller IDs 
> > that got persisted since the previous fetch, due to the HMS issue.
> 
> kalyan kumar kalvagadda wrote:
> we can handle that as seperate issue. we need this change to avaoid 
> notificated getting missed. Scope of this patch is just to avoid full 
> snapshots un-necessarily.

Hmm, I'd like other peoples' take on this. These two issues are not independent 
imo. Avoiding full snapshots is good, but full snapshots, even if done for a 
wrong reason, would offset for missing updates, at least to some extent. Now we 
may start having quite a few of them. 

It's a few line fix, really, just define some "delta" config parameter, set its 
default to some value, could be fairly small, like 2, and pass 
"notificationId-delta" instead; I'd rather just take care of it now. Based on 
what we know of HMS ID generation problem, I'd feel really uncomfortable 
releasing one fix w/o another.


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review194788
---


On Jan. 5, 2018, 12:10 a.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 5, 2018, 12:10 a.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62912e92ece7f8da0ac0fccb124579a88f2 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  43535a7b50fea51049f3142837736e6a99a3a80f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  501898bca261db2daf937a8d803d12a59616192b 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7646539499149f2d86758979436567bd 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
>  c8eef09ed61e11583838a3dfd3117a9478df23ba 
> 
> 
> Diff: https://reviews.apache.org/r/64955/diff/2/
> 
> 
> Testing
> ---
> 
> Made sure that tests pass. There are three test failures which need code 
> change done for SENTRY-2113.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-04 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/
---

(Updated Jan. 5, 2018, 12:10 a.m.)


Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
Pena, and Vadim Spector.


Changes
---

addressed review comments from vadim


Bugs: SENTRY-2109
https://issues.apache.org/jira/browse/SENTRY-2109


Repository: sentry


Description
---

This patch does couple of things
1. Avoid triggering full snapshots when gaps are observed while fetching new 
notifications. While fetching new notifications HMSFollower would would fetch 
notifications with last event-id as well. When it gets the notifications and if 
it doesn't get the notifications with event-id, full snpshot is triggered.
2. Added looging to report duplicate events and out of order events for debug 
purpose.
3. Added new e2e tests to verfy this behavior.


Diffs (updated)
-

  
sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
 PRE-CREATION 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
 6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
 aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
 097aa62912e92ece7f8da0ac0fccb124579a88f2 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
 43535a7b50fea51049f3142837736e6a99a3a80f 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
 501898bca261db2daf937a8d803d12a59616192b 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
 b4100278392986c161625a366212c6fef66ec0a9 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
 edde886a7646539499149f2d86758979436567bd 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
 c8eef09ed61e11583838a3dfd3117a9478df23ba 


Diff: https://reviews.apache.org/r/64955/diff/2/

Changes: https://reviews.apache.org/r/64955/diff/1-2/


Testing
---

Made sure that tests pass. There are three test failures which need code change 
done for SENTRY-2113.


Thanks,

kalyan kumar kalvagadda



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-04 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review194788
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
Line 2807 (original), 2807 (patched)


This in fact ends up calling getMaxPersistedIDCore(), so would it be a good 
opportunity to rename it now, so it would be clear we are talking about the max 
ID?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 148 (original), 148 (patched)






sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 149 (original), 149 (patched)


lets make it clear that in fact this method returns the max ID in the 
store. shall we change method name to make it clear?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 205 (original), 206 (patched)


I thought we've agreed that at this point we shall pass to 
notificationFetcher.fetchNotification() notificationId-delta, where delta is 
configurable and probably set to some significant number, like 10. Otherwise 
we'll be loosing lots of older notifications with smaller IDs that got 
persisted since the previous fetch, due to the HMS issue.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 392 (original), 345 (patched)


add @param javadoc for notificationId



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 403 (original)


could you explain why you removed LOGGER.debug()?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 48 (patched)


please, add javadoc for what this value stands for, since it obviously 
plays important role in the new logic


- Vadim Spector


On Jan. 4, 2018, 8:22 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 4, 2018, 8:22 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631f 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a3 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  43535a7 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  501898b 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/T

Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-04 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/
---

Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
Pena, and Vadim Spector.


Bugs: SENTRY-2109
https://issues.apache.org/jira/browse/SENTRY-2109


Repository: sentry


Description
---

This patch does couple of things
1. Avoid triggering full snapshots when gaps are observed while fetching new 
notifications. While fetching new notifications HMSFollower would would fetch 
notifications with last event-id as well. When it gets the notifications and if 
it doesn't get the notifications with event-id, full snpshot is triggered.
2. Added looging to report duplicate events and out of order events for debug 
purpose.
3. Added new e2e tests to verfy this behavior.


Diffs
-

  
sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
 PRE-CREATION 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
 6c4631f 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
 aa1b6a3 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
 097aa62 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
 43535a7 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
 501898b 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
 edde886 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
 PRE-CREATION 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
 c8eef09 


Diff: https://reviews.apache.org/r/64955/diff/1/


Testing
---

Made sure that tests pass. There are three test failures which need code change 
done for SENTRY-2113.


Thanks,

kalyan kumar kalvagadda