Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-02-01 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review196651
---


Ship it!




Ship It!

- Vadim Spector


On Jan. 30, 2018, 1:08 a.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 30, 2018, 1:08 a.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
>  79030780c35e5bda432e3ec3f01328e627cb50a6 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  91c90f9d302f4feb3a8b3d06541f43541c87bf0f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
>  83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcherCache.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
>  4cd00e6672730773c74b9840247d1f4d5f7bdfe4 
> 
> 
> Diff: https://reviews.apache.org/r/64955/diff/8/
> 
> 
> Testing
> ---
> 
> Made sure that tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-26 Thread Vadim Spector via Review Board


> On Jan. 25, 2018, 11:42 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Line 235 (original), 270 (patched)
> > 
> >
> > It seems better to produce two dufferent logs, e.g.
> > 
> > if(isFoundInCache(notificationEvent.getEventId(), hash) == true) {
> >   LOGGER.debug("Ignoring HMS notification already processed: ID = {}", 
> > notificationEvent.getEventId());
> >   return false;
> > } else if (sentryStore.isNotificationProcessed(hash)
> > LOGGER.debug("Ignoring HMS notification already processed: ID = {} - 
> > cache miss", notificationEvent.getEventId());
> >   return false;
> > }
> > 
> > Note "cache miss" in the second log.
> > 
> > Cache miss is what we try to avoid for better performance. It may 
> > happen sometimes, but if there are too many "cache miss" messages, it's 
> > useful to know.

As discussed, also suggesting adding unit test to make sure if incoming IDs 
have no out-of-order or duplicate elements (though can still have gaps), then 
in theory there should be no cache misses in LRUMap, i.e. 
sentryStore.isNotificationProcessed(hash) is not invoked. It's to insure that 
if HMS folks fix the problems on their side with the IDs, this code will 
perform no unnecessary DB lookups.


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review196297
---


On Jan. 25, 2018, 5 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 25, 2018, 5 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
>  79030780c35e5bda432e3ec3f01328e627cb50a6 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  91c90f9d302f4feb3a8b3d06541f43541c87bf0f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
>  83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
>   
> sentry-tests/sentry-tests

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-25 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review196297
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 235 (original), 270 (patched)


It seems better to produce two dufferent logs, e.g.

if(isFoundInCache(notificationEvent.getEventId(), hash) == true) {
  LOGGER.debug("Ignoring HMS notification already processed: ID = {}", 
notificationEvent.getEventId());
  return false;
} else if (sentryStore.isNotificationProcessed(hash)
LOGGER.debug("Ignoring HMS notification already processed: ID = {} - cache 
miss", notificationEvent.getEventId());
  return false;
}

Note "cache miss" in the second log.

Cache miss is what we try to avoid for better performance. It may happen 
sometimes, but if there are too many "cache miss" messages, it's useful to know.


- Vadim Spector


On Jan. 25, 2018, 5 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 25, 2018, 5 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  93cc34f34c044dceb11d27e41ecbd1a14f64bed9 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  96c6810baa4d554db2b7d3739a28e3ff7e8b33a0 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
>  79030780c35e5bda432e3ec3f01328e627cb50a6 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  91c90f9d302f4feb3a8b3d06541f43541c87bf0f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHiveNotificationFetcher.java
>  83a1becd46ac2d69c7d6dd05ed6253d1cdd8800d 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
>  4cd00e6672730773c74b984

Re: Review Request 65268: SENTRY-1904: TransactionManager should limit the max time spent by transaction retry

2018-01-25 Thread Vadim Spector via Review Board


> On Jan. 25, 2018, 4:16 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/TransactionManager.java
> > Line 231 (original), 246 (patched)
> > 
> >
> > If you don't think that you can use time because we do not know how 
> > much time it may take to execute transaction, then the whole point of this 
> > JIRA is moot and we shouldn't fix it in the first place.
> 
> Vadim Spector wrote:
> I am not sure if it's the real issue. The two limita are not redundant or 
> mutually exclusive. The old approach limits the number of re-tries; this 
> change also adds limits the total re-try time but keeps the old limitation as 
> well. Are two controls necessarily worse than one? If someone re-configures 
> by mistake retry interval too small (much smaller than the default), so there 
> may be lots of retries (and possibly very unresponsive system) before time 
> limit reached - perhaps we want to prevent it? Can system clock reset at run 
> time messing with the time-only logic? Unlikely, but not impossible.
> 
> kalyan kumar kalvagadda wrote:
> Vadim, Having both limits on retry-count and the retry-time would address 
> the scenario you are taking about.

Kalyan, that was exactly my point, sorry if it was not clear. I support keeping 
both mechanisms in place, as in your patch.


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65268/#review196263
---


On Jan. 23, 2018, 11:40 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65268/
> ---
> 
> (Updated Jan. 23, 2018, 11:40 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, Sergio Pena, and Vadim 
> Spector.
> 
> 
> Bugs: SENTRY-1904
> https://issues.apache.org/jira/browse/SENTRY-1904
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The TransactionManager uses exponential backoff strategy for transaction 
> retries. This may cause some transactions to be delayed by a very long time. 
> We should also have a constraint on the max time for a transaction so that we 
> do not retry for too long. 
> 
> New patch that is attached adds upper bounds on below
> 
> 1.Interval between the retry attempts which increases exponentially.
> 2.Total time a transaction could spend in retries.
>  
> 
> With out these limits we would not have a control on how long a transaction 
> could be be active.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/TransactionManager.java
>  f4ff962a67f8a5c23cc5c8daa7bcb861d2e6b6a5 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
> 
> 
> Diff: https://reviews.apache.org/r/65268/diff/3/
> 
> 
> Testing
> ---
> 
> Made sure all the tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 65268: SENTRY-1904: TransactionManager should limit the max time spent by transaction retry

2018-01-25 Thread Vadim Spector via Review Board


> On Jan. 25, 2018, 4:16 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/TransactionManager.java
> > Line 231 (original), 246 (patched)
> > 
> >
> > If you don't think that you can use time because we do not know how 
> > much time it may take to execute transaction, then the whole point of this 
> > JIRA is moot and we shouldn't fix it in the first place.

I am not sure if it's the real issue. The two limita are not redundant or 
mutually exclusive. The old approach limits the number of re-tries; this change 
also adds limits the total re-try time but keeps the old limitation as well. 
Are two controls necessarily worse than one? If someone re-configures by 
mistake retry interval too small (much smaller than the default), so there may 
be lots of retries (and possibly very unresponsive system) before time limit 
reached - perhaps we want to prevent it? Can system clock reset at run time 
messing with the time-only logic? Unlikely, but not impossible.


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65268/#review196263
---


On Jan. 23, 2018, 11:40 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65268/
> ---
> 
> (Updated Jan. 23, 2018, 11:40 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, Sergio Pena, and Vadim 
> Spector.
> 
> 
> Bugs: SENTRY-1904
> https://issues.apache.org/jira/browse/SENTRY-1904
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The TransactionManager uses exponential backoff strategy for transaction 
> retries. This may cause some transactions to be delayed by a very long time. 
> We should also have a constraint on the max time for a transaction so that we 
> do not retry for too long. 
> 
> New patch that is attached adds upper bounds on below
> 
> 1.Interval between the retry attempts which increases exponentially.
> 2.Total time a transaction could spend in retries.
>  
> 
> With out these limits we would not have a control on how long a transaction 
> could be be active.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/TransactionManager.java
>  f4ff962a67f8a5c23cc5c8daa7bcb861d2e6b6a5 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
> 
> 
> Diff: https://reviews.apache.org/r/65268/diff/3/
> 
> 
> Testing
> ---
> 
> Made sure all the tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 65268: SENTRY-1904: TransactionManager should limit the max time spent by transaction retry

2018-01-23 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65268/#review196083
---


Ship it!




Ship It!

- Vadim Spector


On Jan. 23, 2018, 11:40 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65268/
> ---
> 
> (Updated Jan. 23, 2018, 11:40 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, Sergio Pena, and Vadim 
> Spector.
> 
> 
> Bugs: SENTRY-1904
> https://issues.apache.org/jira/browse/SENTRY-1904
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The TransactionManager uses exponential backoff strategy for transaction 
> retries. This may cause some transactions to be delayed by a very long time. 
> We should also have a constraint on the max time for a transaction so that we 
> do not retry for too long. 
> 
> New patch that is attached adds upper bounds on below
> 
> 1.Interval between the retry attempts which increases exponentially.
> 2.Total time a transaction could spend in retries.
>  
> 
> With out these limits we would not have a control on how long a transaction 
> could be be active.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/TransactionManager.java
>  f4ff962a67f8a5c23cc5c8daa7bcb861d2e6b6a5 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
> 
> 
> Diff: https://reviews.apache.org/r/65268/diff/3/
> 
> 
> Testing
> ---
> 
> Made sure all the tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 65268: SENTRY-1904: TransactionManager should limit the max time spent by transaction retry

2018-01-23 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65268/#review196078
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
Lines 421 (patched)


Since this code is inside catch {} and serves the purpose of clarifying the 
possible exception cause, would it make sense to put it inside its own 
try-catch so if the exception is thrown, it would be logged, not re-thrown from 
the parent catch {} case?


- Vadim Spector


On Jan. 23, 2018, 6:43 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65268/
> ---
> 
> (Updated Jan. 23, 2018, 6:43 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, Sergio Pena, and Vadim 
> Spector.
> 
> 
> Bugs: SENTRY-1904
> https://issues.apache.org/jira/browse/SENTRY-1904
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The TransactionManager uses exponential backoff strategy for transaction 
> retries. This may cause some transactions to be delayed by a very long time. 
> We should also have a constraint on the max time for a transaction so that we 
> do not retry for too long. 
> 
> New patch that is attached adds upper bounds on below
> 
> 1.Interval between the retry attempts which increases exponentially.
> 2.Total time a transaction could spend in retries.
>  
> 
> With out these limits we would not have a control on how long a transaction 
> could be be active.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/TransactionManager.java
>  f4ff962a67f8a5c23cc5c8daa7bcb861d2e6b6a5 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
> 
> 
> Diff: https://reviews.apache.org/r/65268/diff/2/
> 
> 
> Testing
> ---
> 
> Made sure all the tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-23 Thread Vadim Spector via Review Board


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 67 (patched)
> > 
> >
> > I wonder about the choice of LRUMap for cache.
> > 
> > The ideal data structure would one only store the caches for the events 
> > in the ID range that we want, which is [MAX(event_id) - refetch_count, 
> > MAX(event_id)].
> > 
> > LRUMap does not explicitly address this requirement. It assumes that 
> > the oldest (least accessed) caches will tend to be for the events with the 
> > oldest IDs. We know this is often not true, which is one of the problems 
> > this JIRA tries to work around.
> > 
> > Given that when we do not find the entry in cache we check SentryStore 
> > anyway, everything will still work. But we may end up with a few extra DB 
> > calls to SentryStore and it's hard to predict the overhead.
> > 
> > The cache structure solving this shortcoming could be 
> > HashSet[notificationFetchCount +1] array.
> > 
> > It's the circular buffer structure where the cache[ID % cache.length] 
> > element contains the HashSet of all caches for the events with the given 
> > event ID. We can see just as fast if the event has been processed.
> > 
> > Before each fetch we can calculate which elements in this cache array 
> > correspond to the IDs outside the current fetching range and nullify them 
> > (for curcular buffer some care is needed for modulo arithmetic).
> > 
> > This way we know that
> > a) cache stores the caches of the events only in the range we care 
> > about, though now cleanup is explicit.
> > b) we do not necessarily need to put the upper limit on the number of 
> > events with the given ID (the sizes of HashSet's).
> > 
> > It may save some expensive DB calls.
> > 
> > However, I do not feel strongly one way or the other until we test it 
> > and see the overhead. Perhaps the fudge factor x3 in LRUMap size is enough 
> > to keep enough entries, old or new; so I'm leaving it up to you.
> 
> Na Li wrote:
> Vadim, I prefer to use circular buffer as well. Mentioned this to Kalyan 
> before. The memory usage and execution overhead should be much smaller than 
> LRUMap.
> 
> Vadim Spector wrote:
> I like that if the entry is not found in the cache, there is no need to 
> go to the database to double-check. Can't beat that.
> 
> kalyan kumar kalvagadda wrote:
> Help we understand your comment.
> "LRUMap does not explicitly address this requirement. It assumes that the 
> oldest (least accessed) caches will tend to be for the events with the oldest 
> IDs. We know this is often not true, which is one of the problems this JIRA 
> tries to work around."
> 
> Can you give me one example where you think it would not be found in the 
> cache?

For simplicity of the example, suppose we look back by 2 IDs. It means that we 
fetch with starting number (MaxID - 2) to include 3 IDs: (MaxID-2, MaxID-1, 
MaxID). Therefore, the LRUMap size is 3 * 3 = 9.
Suppose, starting point is LRUMap is empty (just to keep the explanation 
simple), then we perform several fetches, in the exact order specified (first 
listed ID arrives first):

fetch #1 (start_id = 0): 3, 3, 3, 2, 2, 2, 1, 1, 1
 Out of order, can happen, the entries for ID = 3 are the oldest ones. 
LRUMap reaches max size 9. maxID = 3.

fetch #2 (start_id = 3 - 2 = 1):  3, 3, 3, 2, 2, 2, 1, 1, 1 (re-fetched from 
fetch #1), 3, 4 (new events)
 The LRUMap automatically purges the first 2 entries with the ID = 3 to 
accomodate new events 3 and 4. maxID = 4.

fetch #3 (start_id = 4 - 2 = 2): 3, 3, 3, 2, 2, 2, (refetched from fetch #1, 
excluding ID = 1, as it is < 2), 3, 4 (refetched from fetch #2), 5 (new)

Let's stop here. In fetch #3 we still get the first two events with ID = 3, 
which were originally in fetch #1 and fetch #2. However, they are no longer in 
LRUMap, purged after fetch #2, so we end up with DB lookup.

I just provided the simplest example. It is probably extreme that the event 
with the biggest ID in the fetch arrives first in the fetch, as in fetch #1. 
However, it can arrive somewhere in the middle of the fetch too, and some time 
later LRU will still purge it ahead of the events with the smaller ID that 
should've been purged first.

As I was working through this example, I realized that things could be better 
if after each fetch the events were first sorted by their IDs before adding to 
LRUMap. I do not think it will entirely solve the problem of prematurely purged 
entries. The example where all IDs get sorted prior to putting them into LRUMap:

fetch #1 (start_id = 0): 1, 1, 1, 2, 2, 2, 3, 3, 3
 All in order. MaxID = 3

fetch #2 (start_id = 3 - 2 = 1): 1, 1, 1, 2, 2, 2, 3, 3, 3 (from fetch #1), 1, 
1, 1, 1, 2, 2, 2, (new), 

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-22 Thread Vadim Spector via Review Board


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 67 (patched)
> > 
> >
> > I wonder about the choice of LRUMap for cache.
> > 
> > The ideal data structure would one only store the caches for the events 
> > in the ID range that we want, which is [MAX(event_id) - refetch_count, 
> > MAX(event_id)].
> > 
> > LRUMap does not explicitly address this requirement. It assumes that 
> > the oldest (least accessed) caches will tend to be for the events with the 
> > oldest IDs. We know this is often not true, which is one of the problems 
> > this JIRA tries to work around.
> > 
> > Given that when we do not find the entry in cache we check SentryStore 
> > anyway, everything will still work. But we may end up with a few extra DB 
> > calls to SentryStore and it's hard to predict the overhead.
> > 
> > The cache structure solving this shortcoming could be 
> > HashSet[notificationFetchCount +1] array.
> > 
> > It's the circular buffer structure where the cache[ID % cache.length] 
> > element contains the HashSet of all caches for the events with the given 
> > event ID. We can see just as fast if the event has been processed.
> > 
> > Before each fetch we can calculate which elements in this cache array 
> > correspond to the IDs outside the current fetching range and nullify them 
> > (for curcular buffer some care is needed for modulo arithmetic).
> > 
> > This way we know that
> > a) cache stores the caches of the events only in the range we care 
> > about, though now cleanup is explicit.
> > b) we do not necessarily need to put the upper limit on the number of 
> > events with the given ID (the sizes of HashSet's).
> > 
> > It may save some expensive DB calls.
> > 
> > However, I do not feel strongly one way or the other until we test it 
> > and see the overhead. Perhaps the fudge factor x3 in LRUMap size is enough 
> > to keep enough entries, old or new; so I'm leaving it up to you.
> 
> Na Li wrote:
> Vadim, I prefer to use circular buffer as well. Mentioned this to Kalyan 
> before. The memory usage and execution overhead should be much smaller than 
> LRUMap.

I like that if the entry is not found in the cache, there is no need to go to 
the database to double-check. Can't beat that.


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195917
---


On Jan. 22, 2018, 5:40 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 22, 2018, 5:40 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62912e92ece7f8d

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-22 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195917
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 60 (original), 61 (patched)


This using probably unintentional use of Bool instead of primitive boolean. 
There seems to be no reason for using autoboxed types.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 70 (patched)


So, the assumption is that the average number of duplicates per each ID 
cannot exceed 3, correct?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 67 (patched)


I wonder about the choice of LRUMap for cache.

The ideal data structure would one only store the caches for the events in 
the ID range that we want, which is [MAX(event_id) - refetch_count, 
MAX(event_id)].

LRUMap does not explicitly address this requirement. It assumes that the 
oldest (least accessed) caches will tend to be for the events with the oldest 
IDs. We know this is often not true, which is one of the problems this JIRA 
tries to work around.

Given that when we do not find the entry in cache we check SentryStore 
anyway, everything will still work. But we may end up with a few extra DB calls 
to SentryStore and it's hard to predict the overhead.

The cache structure solving this shortcoming could be 
HashSet[notificationFetchCount +1] array.

It's the circular buffer structure where the cache[ID % cache.length] 
element contains the HashSet of all caches for the events with the given event 
ID. We can see just as fast if the event has been processed.

Before each fetch we can calculate which elements in this cache array 
correspond to the IDs outside the current fetching range and nullify them (for 
curcular buffer some care is needed for modulo arithmetic).

This way we know that
a) cache stores the caches of the events only in the range we care about, 
though now cleanup is explicit.
b) we do not necessarily need to put the upper limit on the number of 
events with the given ID (the sizes of HashSet's).

It may save some expensive DB calls.

However, I do not feel strongly one way or the other until we test it and 
see the overhead. Perhaps the fudge factor x3 in LRUMap size is enough to keep 
enough entries, old or new; so I'm leaving it up to you.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 119 (patched)


any reason for returning Boolean vs bool?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 231 (patched)


optimization - no need for this DB call if the ID > max(storedId)


- Vadim Spector


On Jan. 22, 2018, 5:40 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 22, 2018, 5:40 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-c

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-18 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195758
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 66 (patched)


why not "private final LRUMap cache;"? The code never resets the reference.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 75 (patched)


Please, add javadoc here: what does x3 factor mean? What's the assumption 
here?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 112 (patched)


Nit: just "return cache.get(hash) != null;"



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 89 (original), 147 (patched)


You create filter for the minFetchId which you immediately update on the 
next line. One-line comment on what's going on here would not hurt.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 92 (original), 154 (patched)


Let's log both lastEventId and minFetchEventId. So we'd be clear on what's 
the biggest ID we received and what's the starting ID we are fetching now. 
Something like:

LOGGER.debug("Requesting HMS notifications since ID = {}, last ID {}", 
minFetchId, lastEventId);

This way when we try to make sense out of the logs, we don't need to do 
math each time.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 96 (original), 158 (patched)


It's a brand-new logic here, critical for valdiating this whole new design. 
Can you add javadoc summarizing: why does HiveNotificationFetcher decide here 
that it is out of sync with HMS?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 205 (patched)


this section of code suggests that foundLastProcessedNotification can only 
change from false to true. Can it change from true to false? If yes, where can 
it happen? If not, why? Please, add comment on that.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Line 206 (original)


why did you delete the cache.clear() line?


- Vadim Spector


On Jan. 17, 2018, 11:56 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 17, 2018, 11:56 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Functinalty to address gaps and out-of-sequence notificaitons by 
> re-fetching addtional notifications that were already fetched. This solution 
> is not fool proof. It does a best effort to reduse the chance of loosing 
> notifications by re-fetching the notifications.This approach will introduce 
> an over head of fetching addtional notifications that were already fetched. 
> Overhead of DB look-up is addressed by using a cache. This reduces additional 
> DB lookups needed to check if the notification was already processed.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/s

Re: Review Request 64820: SENTRY-2106: Remove sentry dependency on HMS table NOTIFICATION_SEQUENCE when checking if Sentry is out-of-sync with HMS

2018-01-18 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64820/#review195750
---


Ship it!




Ship It!

- Vadim Spector


On Jan. 16, 2018, 3:50 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64820/
> ---
> 
> (Updated Jan. 16, 2018, 3:50 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Brian Towles, kalyan kumar 
> kalvagadda, Na Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> After steady state, a full snapshot is triggered by mainly 2 cases
> 
> 1. Sentry is "ahead"
> 2. Sentry is behind
> Case 1 has a dependency on NOTIFICATION_SEQUENCE table. This is not reliable 
> as it was observed that sometimes NOTIFICATION_SEQUENCE and NOTIFICATION_LOG 
> are not in sync. As a result of this unnecessary full snapshots can be 
> triggered.
> The fix is to remove this dependency.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7 
> 
> 
> Diff: https://reviews.apache.org/r/64820/diff/6/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-04 Thread Vadim Spector via Review Board


> On Jan. 4, 2018, 10:40 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Line 205 (original), 206 (patched)
> > 
> >
> > I thought we've agreed that at this point we shall pass to 
> > notificationFetcher.fetchNotification() notificationId-delta, where delta 
> > is configurable and probably set to some significant number, like 10. 
> > Otherwise we'll be loosing lots of older notifications with smaller IDs 
> > that got persisted since the previous fetch, due to the HMS issue.
> 
> kalyan kumar kalvagadda wrote:
> we can handle that as seperate issue. we need this change to avaoid 
> notificated getting missed. Scope of this patch is just to avoid full 
> snapshots un-necessarily.

Hmm, I'd like other peoples' take on this. These two issues are not independent 
imo. Avoiding full snapshots is good, but full snapshots, even if done for a 
wrong reason, would offset for missing updates, at least to some extent. Now we 
may start having quite a few of them. 

It's a few line fix, really, just define some "delta" config parameter, set its 
default to some value, could be fairly small, like 2, and pass 
"notificationId-delta" instead; I'd rather just take care of it now. Based on 
what we know of HMS ID generation problem, I'd feel really uncomfortable 
releasing one fix w/o another.


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review194788
---


On Jan. 5, 2018, 12:10 a.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 5, 2018, 12:10 a.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631fa74760d8721b5740dd3dffb2c1d8e72e6 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c28f35af86952c213d5e20a8c9bb3490 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62912e92ece7f8da0ac0fccb124579a88f2 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  43535a7b50fea51049f3142837736e6a99a3a80f 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  501898bca261db2daf937a8d803d12a59616192b 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7646539499149f2d86758979436567bd 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegrationBase.java
>  c8eef09ed61e11583838a3dfd3117a9478df23ba 
> 
> 
> Diff: https://reviews.apache.org/r/64955/diff/2/
> 
> 
> Testing
> ---
> 
> Made sure that tests pass. There are three test failures which need code 
> change done for SENTRY-2113.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-04 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review194788
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
Line 2807 (original), 2807 (patched)


This in fact ends up calling getMaxPersistedIDCore(), so would it be a good 
opportunity to rename it now, so it would be clear we are talking about the max 
ID?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 148 (original), 148 (patched)






sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 149 (original), 149 (patched)


lets make it clear that in fact this method returns the max ID in the 
store. shall we change method name to make it clear?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 205 (original), 206 (patched)


I thought we've agreed that at this point we shall pass to 
notificationFetcher.fetchNotification() notificationId-delta, where delta is 
configurable and probably set to some significant number, like 10. Otherwise 
we'll be loosing lots of older notifications with smaller IDs that got 
persisted since the previous fetch, due to the HMS issue.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 392 (original), 345 (patched)


add @param javadoc for notificationId



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 403 (original)


could you explain why you removed LOGGER.debug()?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
Lines 48 (patched)


please, add javadoc for what this value stands for, since it obviously 
plays important role in the new logic


- Vadim Spector


On Jan. 4, 2018, 8:22 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64955/
> ---
> 
> (Updated Jan. 4, 2018, 8:22 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Na Li, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2109
> https://issues.apache.org/jira/browse/SENTRY-2109
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> This patch does couple of things
> 1. Avoid triggering full snapshots when gaps are observed while fetching new 
> notifications. While fetching new notifications HMSFollower would would fetch 
> notifications with last event-id as well. When it gets the notifications and 
> if it doesn't get the notifications with event-id, full snpshot is triggered.
> 2. Added looging to report duplicate events and out of order events for debug 
> purpose.
> 3. Added new e2e tests to verfy this behavior.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/exception/SentryOutOfSyncException.java
>  PRE-CREATION 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631f 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a3 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
>  097aa62 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java
>  43535a7 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollowerSentryStoreIntegration.java
>  501898b 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreation.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotCreationWithShorterHMSEventTtl.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestSnapshotWithLongerHMSFollowerLongerInterval.java
>  PRE-CREATION 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/T

Re: Review Request 64820: SENTRY-2106: Sentry ahead full snapshot retry logic

2017-12-25 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64820/#review194500
---


Ship it!




Ship It!

- Vadim Spector


On Dec. 25, 2017, 10:49 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64820/
> ---
> 
> (Updated Dec. 25, 2017, 10:49 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Brian Towles, kalyan kumar 
> kalvagadda, Na Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Once SentryHMSNotification table is not empty, there are two cases when a 
> full snapshot is triggered.
> 
> If first event in list of notifications received from HMS greater than latest 
> sentry hms notification Id
> If latest sentry notification Id is greater than current hms notification Id
> The later is a unexpected case where for some reason sentry gets ahead of 
> HMS. We should add a logic to trigger a full snapshot in case 2 only after a 
> configurable number of retries. This will avoid unnecessary full snapshot 
> triggers as they are resource intensive
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java
>  a9afb151a 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7 
> 
> 
> Diff: https://reviews.apache.org/r/64820/diff/3/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 64820: SENTRY-2106: Sentry ahead full snapshot retry logic

2017-12-24 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64820/#review194489
---



Seems like I have no problems with the code once my comments in the previous 
review are addressed. Would like to add more checks to the tests, including, 
say, a couple of loop iterations for the state after "sentry-ahead" state has 
been cleared, just to make sure HMSFollower is in the normal flow mode (no 
Sentry-ahead condition, gets partial updates normally, does not attempt full 
updates for no good reason, hmsFollower.getNotificationSyncRetryCount() keeps 
returning 0).


sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 263 (original), 275 (patched)


typo "cuurent"



sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
Lines 1108 (patched)


Would it make sense to add verify() on sentryHmsClient.getFullSnapshot() 
invokactions? Can't hurt it seems

Also, would it make sense to continue the loop for sometime after count == 
interruptCount instead of break? To check the normal flow after "sentry-ahead" 
recovery? And add default "else" clause with verify()'s for all mock objects? 
Post-sentry-ahead state is an important part to test too.



sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
Lines 1179 (patched)


Would it make sense to add verify() on sentryHmsClient.getFullSnapshot() 
invocations? Can't hurt it seems.

Also, would it make sense to continue the loop for sometime after count == 
interruptCount instead of break? To check the normal flow after "sentry-ahead" 
recovery? And add default "else" clause with verify()'s for all mock objects? 
Post-sentry-ahead state is an important part to test too.


- Vadim Spector


On Dec. 24, 2017, 1:50 a.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64820/
> ---
> 
> (Updated Dec. 24, 2017, 1:50 a.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Brian Towles, kalyan kumar 
> kalvagadda, Na Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Once SentryHMSNotification table is not empty, there are two cases when a 
> full snapshot is triggered.
> 
> If first event in list of notifications received from HMS greater than latest 
> sentry hms notification Id
> If latest sentry notification Id is greater than current hms notification Id
> The later is a unexpected case where for some reason sentry gets ahead of 
> HMS. We should add a logic to trigger a full snapshot in case 2 only after a 
> configurable number of retries. This will avoid unnecessary full snapshot 
> triggers as they are resource intensive
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java
>  a9afb151a 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7 
> 
> 
> Diff: https://reviews.apache.org/r/64820/diff/2/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestHMSFollower
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 64820: SENTRY-2106: Sentry ahead full snapshot retry logic

2017-12-23 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64820/#review194479
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 239 (original), 253 (patched)


Add another item: "The current notification Id on the HMS is less than the 
latest processed by Sentry after the configurable number of retries, as 
detected by isSentryAhead() method"



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 329 (patched)


This "else" clause is also engaged as part of the normal flow. Because 
calling isSentryAhead() method is part of the normal flow and  normally "if 
(latestSentryNotificationId > currentHmsNotificationId)" would be FALSE, so 
this "else" clause would be engaged.

Therefore, we should print this message only if we know we are in the state 
of trying to recover from "Sentry-ahead" situation. Which is when 
notificationSyncRetryCount != 0.

Therefore, this log should be inside the "if (notificationSyncRetryCount != 
0)" clause, it seems.


- Vadim Spector


On Dec. 24, 2017, 1:50 a.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64820/
> ---
> 
> (Updated Dec. 24, 2017, 1:50 a.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Brian Towles, kalyan kumar 
> kalvagadda, Na Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Once SentryHMSNotification table is not empty, there are two cases when a 
> full snapshot is triggered.
> 
> If first event in list of notifications received from HMS greater than latest 
> sentry hms notification Id
> If latest sentry notification Id is greater than current hms notification Id
> The later is a unexpected case where for some reason sentry gets ahead of 
> HMS. We should add a logic to trigger a full snapshot in case 2 only after a 
> configurable number of retries. This will avoid unnecessary full snapshot 
> triggers as they are resource intensive
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java
>  a9afb151a 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7 
> 
> 
> Diff: https://reviews.apache.org/r/64820/diff/2/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestHMSFollower
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 64820: SENTRY-2106: Sentry ahead full snapshot retry logic

2017-12-23 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64820/#review194476
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 542 (patched)


this method is used from TestHMSFollower, which is the same package, so it 
can be declared package-private instead of public


- Vadim Spector


On Dec. 22, 2017, 11:34 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64820/
> ---
> 
> (Updated Dec. 22, 2017, 11:34 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Brian Towles, kalyan kumar 
> kalvagadda, Na Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Once SentryHMSNotification table is not empty, there are two cases when a 
> full snapshot is triggered.
> 
> If first event in list of notifications received from HMS greater than latest 
> sentry hms notification Id
> If latest sentry notification Id is greater than current hms notification Id
> The later is a unexpected case where for some reason sentry gets ahead of 
> HMS. We should add a logic to trigger a full snapshot in case 2 only after a 
> configurable number of retries. This will avoid unnecessary full snapshot 
> triggers as they are resource intensive
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java
>  a9afb151a 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7 
> 
> 
> Diff: https://reviews.apache.org/r/64820/diff/1/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestHMSFollower
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 64820: SENTRY-2106: Sentry ahead full snapshot retry logic

2017-12-23 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64820/#review194475
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 273 (original), 284 (patched)


because you invoke isSentryAhead() outside this method, the 
fullUpdateHMS.compareAndSet(true, false) code is not invoked when 
Sentry-is-ahead state is detected. Another reason to call isSentryAhead() from 
this method.


- Vadim Spector


On Dec. 22, 2017, 11:34 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64820/
> ---
> 
> (Updated Dec. 22, 2017, 11:34 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Brian Towles, kalyan kumar 
> kalvagadda, Na Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Once SentryHMSNotification table is not empty, there are two cases when a 
> full snapshot is triggered.
> 
> If first event in list of notifications received from HMS greater than latest 
> sentry hms notification Id
> If latest sentry notification Id is greater than current hms notification Id
> The later is a unexpected case where for some reason sentry gets ahead of 
> HMS. We should add a logic to trigger a full snapshot in case 2 only after a 
> configurable number of retries. This will avoid unnecessary full snapshot 
> triggers as they are resource intensive
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java
>  a9afb151a 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7 
> 
> 
> Diff: https://reviews.apache.org/r/64820/diff/1/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestHMSFollower
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 64820: SENTRY-2106: Sentry ahead full snapshot retry logic

2017-12-23 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64820/#review194474
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 219 (patched)


isSentryAhead() is part of the logic to decide if full snapshot is 
required, so it would make sense to invoke it inside the 
isFullSnapshotRequired(), instead of adding this block of code here. After all, 
isSentryAhead() is an improved version of the code that used to be exactly in 
isFullSnapshotRequired(), which you've removed.

And the javadoc section for isFullSnapshotRequired() that you removed would 
stay, just with some minor modifications. It will result in fewer diffs and 
will be easier to undrestand what's exactly changed.


- Vadim Spector


On Dec. 22, 2017, 11:34 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64820/
> ---
> 
> (Updated Dec. 22, 2017, 11:34 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Brian Towles, kalyan kumar 
> kalvagadda, Na Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Once SentryHMSNotification table is not empty, there are two cases when a 
> full snapshot is triggered.
> 
> If first event in list of notifications received from HMS greater than latest 
> sentry hms notification Id
> If latest sentry notification Id is greater than current hms notification Id
> The later is a unexpected case where for some reason sentry gets ahead of 
> HMS. We should add a logic to trigger a full snapshot in case 2 only after a 
> configurable number of retries. This will avoid unnecessary full snapshot 
> triggers as they are resource intensive
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java
>  a9afb151a 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7 
> 
> 
> Diff: https://reviews.apache.org/r/64820/diff/1/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestHMSFollower
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 64820: SENTRY-2106: Sentry ahead full snapshot retry logic

2017-12-23 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64820/#review194472
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 71 (patched)


A couple of naming suggestions:

a) fullUdateRetryCount is a bit misleading; count is something that's 
incremented, not a constant. fullUpdateAttemptMax would be more intuitive, 
makes it clearly related to fullUpdateAttemptCount. Or you can use "Retry" in 
both names - fullUpdateRetryCount and fullUpdateRetryMax.

b) the names are probably misleading altogether; it is not an attempt to do 
full update, but on contrary an attempt to avoid it; essentially, an attempt to 
establish that Sentry and HMS are in-sync, so no full update is needed. I'd 
change it to something like notificationSyncRetryCount and 
notificationSyncRetryMax



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 228 (patched)


nit: extra line


- Vadim Spector


On Dec. 22, 2017, 11:34 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64820/
> ---
> 
> (Updated Dec. 22, 2017, 11:34 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Brian Towles, kalyan kumar 
> kalvagadda, Na Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Once SentryHMSNotification table is not empty, there are two cases when a 
> full snapshot is triggered.
> 
> If first event in list of notifications received from HMS greater than latest 
> sentry hms notification Id
> If latest sentry notification Id is greater than current hms notification Id
> The later is a unexpected case where for some reason sentry gets ahead of 
> HMS. We should add a logic to trigger a full snapshot in case 2 only after a 
> configurable number of retries. This will avoid unnecessary full snapshot 
> triggers as they are resource intensive
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java
>  a9afb151a 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  aa1b6a31c 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
>  edde886a7 
> 
> 
> Diff: https://reviews.apache.org/r/64820/diff/1/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestHMSFollower
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 64661: SENTRY-1944: Optimize DelegateSentryStore.getGroupsByRoles()

2017-12-19 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64661/#review194174
---


Ship it!




Ship It!

- Vadim Spector


On Dec. 19, 2017, 8:04 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64661/
> ---
> 
> (Updated Dec. 19, 2017, 8:04 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Brian Towles, kalyan kumar 
> kalvagadda, Na Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> When Solr is using Sentry server for authorization, it issues a lot of calls 
> to {{getGroupsByRoles()}} function in {{DelegateSentryStore}}.
> 
> This function isn't very efficient - it walks over each role in the set, 
> obtains role by name, get groups for each role, and collects all group names 
> into a set.
> 
> It may be possible to optimize it.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/generic/service/persistent/DelegateSentryStore.java
>  4cb46abc5 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/generic/service/persistent/TestDelegateSentryStore.java
>  69d16238f 
> 
> 
> Diff: https://reviews.apache.org/r/64661/diff/4/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 64661: SENTRY-1944: Optimize DelegateSentryStore.getGroupsByRoles()

2017-12-18 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64661/#review194079
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/generic/service/persistent/DelegateSentryStore.java
Line 252 (original), 254 (patched)


I recommend a better test for this method.

It is tested in TestDelegateSentryStore.testGetGroupsByRoleNames(), which 
uses two roles with identical groups for each role.

I'd define roles role1 and role2 with partially iverlapping groups for 
each, i.e. a couple of groups defined only for role1, another couple only for 
role2, and another couple for both role1 and role2.


- Vadim Spector


On Dec. 18, 2017, 4:36 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64661/
> ---
> 
> (Updated Dec. 18, 2017, 4:36 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Brian Towles, kalyan kumar 
> kalvagadda, Na Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> When Solr is using Sentry server for authorization, it issues a lot of calls 
> to {{getGroupsByRoles()}} function in {{DelegateSentryStore}}.
> 
> This function isn't very efficient - it walks over each role in the set, 
> obtains role by name, get groups for each role, and collects all group names 
> into a set.
> 
> It may be possible to optimize it.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/generic/service/persistent/DelegateSentryStore.java
>  4cb46abc5 
> 
> 
> Diff: https://reviews.apache.org/r/64661/diff/3/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 64452: SENTRY-2091: User-based Privilege is broken by SENTRY-769

2017-12-15 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64452/#review193992
---




sentry-tests/sentry-tests-hive-v2/src/test/java/org/apache/sentry/tests/e2e/hive/TestUserManagement.java
Line 372 (original), 372 (patched)


Are we sure here that CREATE fails for the right reason, i.e. user1 not 
having privileges, not because it already exists?

If we want to make sure that user1 can neither CREATE nor SELECT, would it 
make sense to try CREATE on a different table, not the one that already exists, 
followed by SELECT on already existing table?



sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hive/TestUserManagement.java
Line 370 (original), 368 (patched)


Ditto: do we know CREATE fails for the right reason (user1 lacking 
privilege), not because the table already exists?


- Vadim Spector


On Dec. 14, 2017, 8:54 p.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64452/
> ---
> 
> (Updated Dec. 14, 2017, 8:54 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, kalyan kumar kalvagadda, Sergio 
> Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Update the code to make sure user without group but with role has proper 
> access. 
> test case is added for the scenario that user has no group but with desired 
> role. 
> test case is added for the scenario that user has no group and no privilege 
> to allow the access
> 
> 
> Diffs
> -
> 
>   
> sentry-binding/sentry-binding-hive-v2/src/main/java/org/apache/sentry/binding/hive/v2/HiveAuthzBindingHookBaseV2.java
>  5a21dd3 
>   
> sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBindingHookBase.java
>  9c60c22 
>   
> sentry-binding/sentry-binding-hive/src/test/java/org/apache/sentry/binding/hive/TestHiveAuthzBindings.java
>  a41d1bd 
>   
> sentry-binding/sentry-binding-solr/src/test/java/org/apache/sentry/binding/solr/TestSolrAuthzBinding.java
>  f060b82 
>   
> sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java
>  005724f 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  6c4631f 
>   
> sentry-tests/sentry-tests-hive-v2/src/test/java/org/apache/sentry/tests/e2e/hive/TestUserManagement.java
>  02ac514 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestGrantUserToRole.java
>  5364937 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hive/TestUserManagement.java
>  02ac514 
> 
> 
> Diff: https://reviews.apache.org/r/64452/diff/2/
> 
> 
> Testing
> ---
> 
> unit tests
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 64545: SENTRY-2078: Have sentry server print an obvious INFO level log message when it becomes the writer

2017-12-13 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64545/#review193707
---


Ship it!




Ship It!

- Vadim Spector


On Dec. 13, 2017, 5:45 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64545/
> ---
> 
> (Updated Dec. 13, 2017, 5:45 p.m.)
> 
> 
> Review request for sentry, Brian Towles, kalyan kumar kalvagadda, Na Li, 
> Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Currently sentry's log messaging when it becomes a 'writer' (the sentry 
> server that reads from HMS) is not obvious, and is really mostly discernable 
> at the debug level. Add a log message at the INFO level, such as 'This sentry 
> server has just become the HMS reader/writer' that is printed once when the 
> sentry server first becomes the HMS reader/writer.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/LeaderStatusMonitor.java
>  360c5a530 
> 
> 
> Diff: https://reviews.apache.org/r/64545/diff/6/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 64545: SENTRY-2078: Have sentry server print an obvious INFO level log message when it becomes the writer

2017-12-12 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64545/#review193591
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/LeaderStatusMonitor.java
Line 133 (original), 132 (patched)


my 5 cents: I'd personally implement toString() method to print all useful 
info about this object, something like

return String.format("%s{incarnationId=%s, leaderCount=%d, isLeader=%b, 
isSingleNodeMode=%b}", getClass().getSimpleName(), );

then, outside the constructor, you can simply log as

LOGGER.info("{}: {}", this, your-message);

which is shorter. Some people may argue whether we need all these fields 
logged, but in reality we rarely complain that there is too much information in 
the logs; usually the opposite.

But whichever way you decide, it's already a good improvement.


- Vadim Spector


On Dec. 12, 2017, 5:10 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64545/
> ---
> 
> (Updated Dec. 12, 2017, 5:10 p.m.)
> 
> 
> Review request for sentry, Brian Towles, kalyan kumar kalvagadda, Na Li, 
> Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Currently sentry's log messaging when it becomes a 'writer' (the sentry 
> server that reads from HMS) is not obvious, and is really mostly discernable 
> at the debug level. Add a log message at the INFO level, such as 'This sentry 
> server has just become the HMS reader/writer' that is printed once when the 
> sentry server first becomes the HMS reader/writer.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/LeaderStatusMonitor.java
>  360c5a530 
> 
> 
> Diff: https://reviews.apache.org/r/64545/diff/3/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63881: SENTRY-2040: When getting Snapshots from HMS we need more logging around cases when a snapshot is not being received

2017-12-06 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63881/#review193045
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/CounterWait.java
Line 164 (original), 165 (patched)


Question for this and similar parts of code: will it be common for old and 
new values to have gaps, and if not, would it make sense to warn() about it?


- Vadim Spector


On Dec. 6, 2017, 9:03 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63881/
> ---
> 
> (Updated Dec. 6, 2017, 9:03 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, Sergio Pena, and 
> Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Currently we only seem to log when all conditions that lead to getting a 
> snapshot are validated, and eventually a full snapshot is received. However, 
> if for some reason a particular condition was invalid we don't log as to why. 
> This leaves no room for knowing definitely what might have been the reason as 
> to why a snapshot was not received from HMS.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  f32a745ed 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/CounterWait.java
>  2268ce740 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  c1471d118 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
>  eccb40fb6 
> 
> 
> Diff: https://reviews.apache.org/r/63881/diff/11/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63881: SENTRY-2040: When getting Snapshots from HMS we need more logging around cases when a snapshot is not being received

2017-12-05 Thread Vadim Spector via Review Board


> On Dec. 5, 2017, 11:25 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Line 420 (original), 430 (patched)
> > 
> >
> > Use `ID = {}", event.getEventId`.
> > Also, do we want to show the actual failure here or it will be printed 
> > elsewhere?
> 
> Vadim Spector wrote:
> For some reason processNotifications() is declared public, yet it is only 
> used inside HMSFollower.syncupWithHms(), inside try { } catch (Throwable e) { 
> } , which does LOG.error("Exception in HMSFollower! Caused by: " + 
> t.getMessage(), t);
> So, we should be fine, but I'm wondering why not to declare this method 
> private

nevermind, it is used in JUnit tests


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63881/#review192945
---


On Dec. 5, 2017, 9:02 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63881/
> ---
> 
> (Updated Dec. 5, 2017, 9:02 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, Sergio Pena, and 
> Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Currently we only seem to log when all conditions that lead to getting a 
> snapshot are validated, and eventually a full snapshot is received. However, 
> if for some reason a particular condition was invalid we don't log as to why. 
> This leaves no room for knowing definitely what might have been the reason as 
> to why a snapshot was not received from HMS.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  f32a745ed 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  c1471d118 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
>  eccb40fb6 
> 
> 
> Diff: https://reviews.apache.org/r/63881/diff/8/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63881: SENTRY-2040: When getting Snapshots from HMS we need more logging around cases when a snapshot is not being received

2017-12-05 Thread Vadim Spector via Review Board


> On Dec. 5, 2017, 11:25 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 377 (patched)
> > 
> >
> > Will someone up the call stack print the full stack trace?

createFullSnapshot() is private method called only from syncupWithHms() method, 
inside try { } catch (Throwable e) { } , which does LOG.error("Exception in 
HMSFollower! Caused by: " + t.getMessage(), t);
so, we should be fine as far as I can see


> On Dec. 5, 2017, 11:25 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Line 420 (original), 430 (patched)
> > 
> >
> > Use `ID = {}", event.getEventId`.
> > Also, do we want to show the actual failure here or it will be printed 
> > elsewhere?

For some reason processNotifications() is declared public, yet it is only used 
inside HMSFollower.syncupWithHms(), inside try { } catch (Throwable e) { } , 
which does LOG.error("Exception in HMSFollower! Caused by: " + t.getMessage(), 
t);
So, we should be fine, but I'm wondering why not to declare this method private


> On Dec. 5, 2017, 11:25 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
> > Lines 190 (patched)
> > 
> >
> > Does this work? I am not sure that logger correctly supports more then 
> > two args. You can pass Array with more then 2 elements though.

Per 
https://www.slf4j.org/apidocs/org/slf4j/Logger.html#info(java.lang.String,%20java.lang.Object...)
 it should:
Log a message at the INFO level according to the specified format and arguments.
This form avoids superfluous string concatenation when the logger is disabled 
for the INFO level. However, this variant incurs the hidden (and relatively 
small) cost of creating an Object[] before invoking the method, even if this 
logger is disabled for INFO. The variants taking one and two arguments exist 
solely in order to avoid this hidden cost.
Parameters:
format - the format string
arguments - a list of 3 or more arguments


> On Dec. 5, 2017, 11:25 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
> > Lines 209 (patched)
> > 
> >
> > It is visuall better to rearrange these to show the lower ID first.
> > 
> > It would be nice to see something like `received non-consequitive event 
> > sequence 10, 12`

Agree


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63881/#review192945
---


On Dec. 5, 2017, 9:02 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63881/
> ---
> 
> (Updated Dec. 5, 2017, 9:02 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, Sergio Pena, and 
> Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Currently we only seem to log when all conditions that lead to getting a 
> snapshot are validated, and eventually a full snapshot is received. However, 
> if for some reason a particular condition was invalid we don't log as to why. 
> This leaves no room for knowing definitely what might have been the reason as 
> to why a snapshot was not received from HMS.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  f32a745ed 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  c1471d118 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
>  eccb40fb6 
> 
> 
> Diff: https://reviews.apache.org/r/63881/diff/8/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63881: SENTRY-2040: When getting Snapshots from HMS we need more logging around cases when a snapshot is not being received

2017-12-05 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63881/#review192934
---


Ship it!




Ship It!

- Vadim Spector


On Dec. 5, 2017, 9:02 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63881/
> ---
> 
> (Updated Dec. 5, 2017, 9:02 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, Sergio Pena, and 
> Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Currently we only seem to log when all conditions that lead to getting a 
> snapshot are validated, and eventually a full snapshot is received. However, 
> if for some reason a particular condition was invalid we don't log as to why. 
> This leaves no room for knowing definitely what might have been the reason as 
> to why a snapshot was not received from HMS.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  f32a745ed 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  c1471d118 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
>  eccb40fb6 
> 
> 
> Diff: https://reviews.apache.org/r/63881/diff/8/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63881: SENTRY-2040: When getting Snapshots from HMS we need more logging around cases when a snapshot is not being received

2017-12-05 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63881/#review192914
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
Lines 229 (patched)


I recommend using 
org.apache.commons.lang3.exception.ExceptionUtils.getRootCause() instead, which 
is available. See 
https://commons.apache.org/proper/commons-lang/apidocs/src-html/org/apache/commons/lang3/exception/ExceptionUtils.html

It addresses weird case where exception chain forms a  linked list with the 
loop, which would result in infinite loop. In theory shoud never happen, but 
not impossible.


- Vadim Spector


On Dec. 5, 2017, 5:32 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63881/
> ---
> 
> (Updated Dec. 5, 2017, 5:32 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, Sergio Pena, and 
> Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Currently we only seem to log when all conditions that lead to getting a 
> snapshot are validated, and eventually a full snapshot is received. However, 
> if for some reason a particular condition was invalid we don't log as to why. 
> This leaves no room for knowing definitely what might have been the reason as 
> to why a snapshot was not received from HMS.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  f32a745ed 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  c1471d118 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
>  eccb40fb6 
> 
> 
> Diff: https://reviews.apache.org/r/63881/diff/7/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63881: SENTRY-2040: When getting Snapshots from HMS we need more logging around cases when a snapshot is not being received

2017-11-30 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63881/#review192387
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 346 (patched)


Is getting empty snapshot while creating full snapshot normal? If not, 
perhaps info() or warn() would be better? Also, would not hurt including 
snapshotInfo.getId() into the log.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
Lines 152 (patched)


logging eventIdBefore may be useful at this point



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
Line 202 (original), 203 (patched)


Would it be worth logging normal execution flow inside this loop? It is 
still exceptional in a way that it's inside full update logic, right? Should 
not be too many iterations before currentEventId and eventIdAfter catch up with 
each other, is it right? Something like "{} updates received, currentEventId 
{}, eventIdAfter {}".

Also, in theory returned event IDs should be sequential, but this is HMS, 
so who knows what can happen? The logic breaks out of the loop if the event ID 
reaches or exceeds the desired value, but if sequence numbers are out of order 
we can get gaps with hard to debug HDFS sync problems. The least we can do is 
warn() about sequence numbers being non-sequential, something like

if (event.getEventId() != (currentEventId+1)) {
  LOGGER.warn(...);
}


- Vadim Spector


On Nov. 30, 2017, 10:41 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63881/
> ---
> 
> (Updated Nov. 30, 2017, 10:41 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, Sergio Pena, and 
> Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Currently we only seem to log when all conditions that lead to getting a 
> snapshot are validated, and eventually a full snapshot is received. However, 
> if for some reason a particular condition was invalid we don't log as to why. 
> This leaves no room for knowing definitely what might have been the reason as 
> to why a snapshot was not received from HMS.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  f32a745ed 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
>  c1471d118 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
>  eccb40fb6 
> 
> 
> Diff: https://reviews.apache.org/r/63881/diff/6/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63596: SENTRY-1951 - Old SentryStore.retrieveFullPathsImage() should be removed

2017-11-28 Thread Vadim Spector via Review Board


> On Nov. 28, 2017, 10:16 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
> > Line 2514 (original), 2513 (patched)
> > 
> >
> > So, what are our final assumptions about leading and trailing hashes in 
> > hdfs paths at the point of adding them to Sentry store? Do we assume they 
> > only have leading slashes? Can they be multiple slashes or single-only? 
> > What about trailing slashes - none, single-only, multiple ok?
> > 
> > I start thinking that sentry store should be taking hdfs paths with 
> > "bad" slashes and canoinicalize them internally. I like the idea of a 
> > contract between Sentry components in regards to hdfs paths; we keep 
> > talking about, but the truth is contracts can be inadvertantly broken by 
> > code changes. Canonicalizing paths seems like a cheap operation, so I think 
> > that Sentry store should be fine accepting non-canonicalized paths - in 
> > fact, we should do just that and test the resuts by passing 
> > non-canonicalized paths and make sure canonicalization happened.
> > 
> > Would like others to weigh in.
> 
> Vadim Spector wrote:
> No matter what we decide, should probably document those assuimptions in 
> the test code, if only in couple of sentences.
> 
> Arjun Mishra wrote:
> Vadim, the newer version doesn't have the specific line you have 
> commented against. In the new version all the test cases are unchanged with 
> respect to the inputs that are being fed

Does not really matter, imo. The point is, we switch to another API, and change 
tests accordingly. It may be a good time to clearly state our assumptions (and 
probably even change them), to avoid future problems. We know how hard it is to 
debug issues involving incorrect hdfs path parsing; cheaper to harden our code 
right now, imo.


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63596/#review192068
---


On Nov. 27, 2017, 6:03 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63596/
> ---
> 
> (Updated Nov. 27, 2017, 6:03 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, Na 
> Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The old retrieveFullPathsImage() method in SentryStore is no longer used by 
> actual code (retrieveFullPathsImageUpdate(final String[] prefixes) is used 
> instead. It was preserved because it is used by test which now doesn't make 
> much sense.
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestImageRetriever.java
>  b355630e7 
>   
> sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestSentryHDFSServiceProcessor.java
>  f09d1b228 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  f32a745ed 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  24b11f724 
> 
> 
> Diff: https://reviews.apache.org/r/63596/diff/2/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestSentryStore
> mvn -f sentry-hdfs/sentry-hdfs-service/pom.xml test 
> -Dtest=TestSentryHDFSServiceProcessor
> mvn -f sentry-hdfs/sentry-hdfs-service/pom.xml test 
> -Dtest=TestImageRetriever.java
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63596: SENTRY-1951 - Old SentryStore.retrieveFullPathsImage() should be removed

2017-11-28 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63596/#review192068
---




sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
Line 2514 (original), 2513 (patched)


So, what are our final assumptions about leading and trailing hashes in 
hdfs paths at the point of adding them to Sentry store? Do we assume they only 
have leading slashes? Can they be multiple slashes or single-only? What about 
trailing slashes - none, single-only, multiple ok?

I start thinking that sentry store should be taking hdfs paths with "bad" 
slashes and canoinicalize them internally. I like the idea of a contract 
between Sentry components in regards to hdfs paths; we keep talking about, but 
the truth is contracts can be inadvertantly broken by code changes. 
Canonicalizing paths seems like a cheap operation, so I think that Sentry store 
should be fine accepting non-canonicalized paths - in fact, we should do just 
that and test the resuts by passing non-canonicalized paths and make sure 
canonicalization happened.

Would like others to weigh in.


- Vadim Spector


On Nov. 27, 2017, 6:03 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63596/
> ---
> 
> (Updated Nov. 27, 2017, 6:03 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, Na 
> Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The old retrieveFullPathsImage() method in SentryStore is no longer used by 
> actual code (retrieveFullPathsImageUpdate(final String[] prefixes) is used 
> instead. It was preserved because it is used by test which now doesn't make 
> much sense.
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestImageRetriever.java
>  b355630e7 
>   
> sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestSentryHDFSServiceProcessor.java
>  f09d1b228 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  f32a745ed 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  24b11f724 
> 
> 
> Diff: https://reviews.apache.org/r/63596/diff/2/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestSentryStore
> mvn -f sentry-hdfs/sentry-hdfs-service/pom.xml test 
> -Dtest=TestSentryHDFSServiceProcessor
> mvn -f sentry-hdfs/sentry-hdfs-service/pom.xml test 
> -Dtest=TestImageRetriever.java
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63596: SENTRY-1951 - Old SentryStore.retrieveFullPathsImage() should be removed

2017-11-28 Thread Vadim Spector via Review Board


> On Nov. 28, 2017, 10:16 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
> > Line 2514 (original), 2513 (patched)
> > 
> >
> > So, what are our final assumptions about leading and trailing hashes in 
> > hdfs paths at the point of adding them to Sentry store? Do we assume they 
> > only have leading slashes? Can they be multiple slashes or single-only? 
> > What about trailing slashes - none, single-only, multiple ok?
> > 
> > I start thinking that sentry store should be taking hdfs paths with 
> > "bad" slashes and canoinicalize them internally. I like the idea of a 
> > contract between Sentry components in regards to hdfs paths; we keep 
> > talking about, but the truth is contracts can be inadvertantly broken by 
> > code changes. Canonicalizing paths seems like a cheap operation, so I think 
> > that Sentry store should be fine accepting non-canonicalized paths - in 
> > fact, we should do just that and test the resuts by passing 
> > non-canonicalized paths and make sure canonicalization happened.
> > 
> > Would like others to weigh in.

No matter what we decide, should probably document those assuimptions in the 
test code, if only in couple of sentences.


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63596/#review192068
---


On Nov. 27, 2017, 6:03 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63596/
> ---
> 
> (Updated Nov. 27, 2017, 6:03 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, Na 
> Li, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The old retrieveFullPathsImage() method in SentryStore is no longer used by 
> actual code (retrieveFullPathsImageUpdate(final String[] prefixes) is used 
> instead. It was preserved because it is used by test which now doesn't make 
> much sense.
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestImageRetriever.java
>  b355630e7 
>   
> sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestSentryHDFSServiceProcessor.java
>  f09d1b228 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  f32a745ed 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  24b11f724 
> 
> 
> Diff: https://reviews.apache.org/r/63596/diff/2/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestSentryStore
> mvn -f sentry-hdfs/sentry-hdfs-service/pom.xml test 
> -Dtest=TestSentryHDFSServiceProcessor
> mvn -f sentry-hdfs/sentry-hdfs-service/pom.xml test 
> -Dtest=TestImageRetriever.java
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63975: SENTRY-2066: DB name is not set for AlterTable

2017-11-20 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63975/#review191564
---




sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestDbSentryOnFailureHookLoading.java
Line 265 (original), 265 (patched)


Lina, Assert.assertTrue() produces very unhelpful message, telling nothing 
about what we expect vs what we get; it only says that the two values are 
different. 
I suggest using Assert.assertEquals() instead, both for tableName and 
dbName. Specifically:

instead of 
Assert.assertNotNull(msg, y);
Assert.assertTrue(x.equalsIgnoreCase(y.getName()));

do this:
Assert.assertNotNull(msg, y);
Assert.assertNotNull(msg_1, y.getName());
Assert.assertEquals(x.toLowerCase(), y.getName().toLowerCase());

where the last line will throw an exception telling exactly the expected 
value and the actual value, if they are different. This is far more helpful.


- Vadim Spector


On Nov. 21, 2017, 3:32 a.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63975/
> ---
> 
> (Updated Nov. 21, 2017, 3:32 a.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Brian Towles, kalyan kumar 
> kalvagadda, and Sergio Pena.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> if db name is not set in "case HiveParser.TOK_ALTERTABLE" at 
> HiveAuthzBindingHook.preAnalyze, make sure it is set
> 
> 
> Diffs
> -
> 
>   
> sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java
>  e4620ea 
>   
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestDbSentryOnFailureHookLoading.java
>  aa99595 
> 
> 
> Diff: https://reviews.apache.org/r/63975/diff/1/
> 
> 
> Testing
> ---
> 
> unit test
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 63886: SENTRY-2047: isTableEmptyCore method in SentryStore has references to MAuthzPathsMapping when it should be generic

2017-11-16 Thread Vadim Spector via Review Board


> On Nov. 16, 2017, 8:14 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
> > Line 3251 (original), 3251 (patched)
> > 
> >
> > So, it means this code only worked for MAuthzPathsMapping type.
> > 
> > Which implies we've never tested it for any other type - shouldn't we 
> > add such a test now?
> 
> Arjun Mishra wrote:
> It is acutally usd by isHmsNotificationEmpt() method, which passes in 
> MSentryHmsNotification.class. I am not sure why an exception was never thrown 
> because this is the first method we execute before taking a full snapshot for 
> the first time. 
> 
> Test cases were included in TestHMSFollower and they seem to pass. Did 
> you still want me to add tests?

No, actually, nevermind, the way Java generics work, the old casting works even 
if the type is wrong, because it is a collection. The fix is ok.


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63886/#review191238
---


On Nov. 16, 2017, 7:29 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63886/
> ---
> 
> (Updated Nov. 16, 2017, 7:29 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, Sergio Pena, and 
> Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Current isTableEmpty implementation is below
> private boolean isTableEmptyCore(PersistenceManager pm, Class clazz) {
> Query query = pm.newQuery(clazz);
> query.addExtension(LOAD_RESULTS_AT_COMMIT, "false");
> // setRange is implemented efficiently for MySQL, Postgresql (using the 
> LIMIT SQL keyword)
> // and Oracle (using the ROWNUM keyword), with the query only finding the 
> objects required
> // by the user directly in the datastore. For other RDBMS the query will 
> retrieve all
> // objects up to the "to" record, and will not pass any unnecessary 
> objects that are before
> // the "from" record.
> query.setRange(0, 1);
> return ((List) query.execute()).isEmpty();
>   }
> We seem to be casting query.execute to a List when there 
> is no need for it
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  7217dea7a 
> 
> 
> Diff: https://reviews.apache.org/r/63886/diff/1/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-tests/sentry-tests-hive/pom.xml test 
> -Dtest=TestPrivilegesAtTableScopePart1
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63886: SENTRY-2047: isTableEmptyCore method in SentryStore has references to MAuthzPathsMapping when it should be generic

2017-11-16 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63886/#review191251
---


Ship it!




Ship It!

- Vadim Spector


On Nov. 16, 2017, 7:29 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63886/
> ---
> 
> (Updated Nov. 16, 2017, 7:29 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, Sergio Pena, and 
> Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Current isTableEmpty implementation is below
> private boolean isTableEmptyCore(PersistenceManager pm, Class clazz) {
> Query query = pm.newQuery(clazz);
> query.addExtension(LOAD_RESULTS_AT_COMMIT, "false");
> // setRange is implemented efficiently for MySQL, Postgresql (using the 
> LIMIT SQL keyword)
> // and Oracle (using the ROWNUM keyword), with the query only finding the 
> objects required
> // by the user directly in the datastore. For other RDBMS the query will 
> retrieve all
> // objects up to the "to" record, and will not pass any unnecessary 
> objects that are before
> // the "from" record.
> query.setRange(0, 1);
> return ((List) query.execute()).isEmpty();
>   }
> We seem to be casting query.execute to a List when there 
> is no need for it
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  7217dea7a 
> 
> 
> Diff: https://reviews.apache.org/r/63886/diff/1/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-tests/sentry-tests-hive/pom.xml test 
> -Dtest=TestPrivilegesAtTableScopePart1
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63645: SENTRY-2032: Leading Slashes need to removed when creating HMS path entries

2017-11-16 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63645/#review191247
---


Ship it!




Ship It!

- Vadim Spector


On Nov. 16, 2017, 8:38 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63645/
> ---
> 
> (Updated Nov. 16, 2017, 8:38 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> When retrieving full path image update, we split a path by "/" and create HMS 
> Path entries. However, the leading "/" presence will cause issues because on 
> splitting the value at index 0 will be empty. This will affect the creation 
> of HMS path entries.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/utils/PathUtils.java
>  cef8bd734 
>   
> sentry-core/sentry-core-common/src/test/java/org/apache/sentry/core/common/utils/TestPathUtils.java
>  8419b9d5e 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  cf83e7796 
> 
> 
> Diff: https://reviews.apache.org/r/63645/diff/3/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestSentryStore
> mvn -f sentry-core/sentry-core-common/pom.xml test -Dtest=TestPathUtils
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63886: SENTRY-2047: isTableEmptyCore method in SentryStore has references to MAuthzPathsMapping when it should be generic

2017-11-16 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63886/#review191238
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
Line 3251 (original), 3251 (patched)


So, it means this code only worked for MAuthzPathsMapping type.

Which implies we've never tested it for any other type - shouldn't we add 
such a test now?


- Vadim Spector


On Nov. 16, 2017, 7:29 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63886/
> ---
> 
> (Updated Nov. 16, 2017, 7:29 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, Sergio Pena, and 
> Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Current isTableEmpty implementation is below
> private boolean isTableEmptyCore(PersistenceManager pm, Class clazz) {
> Query query = pm.newQuery(clazz);
> query.addExtension(LOAD_RESULTS_AT_COMMIT, "false");
> // setRange is implemented efficiently for MySQL, Postgresql (using the 
> LIMIT SQL keyword)
> // and Oracle (using the ROWNUM keyword), with the query only finding the 
> objects required
> // by the user directly in the datastore. For other RDBMS the query will 
> retrieve all
> // objects up to the "to" record, and will not pass any unnecessary 
> objects that are before
> // the "from" record.
> query.setRange(0, 1);
> return ((List) query.execute()).isEmpty();
>   }
> We seem to be casting query.execute to a List when there 
> is no need for it
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  7217dea7a 
> 
> 
> Diff: https://reviews.apache.org/r/63886/diff/1/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-tests/sentry-tests-hive/pom.xml test 
> -Dtest=TestPrivilegesAtTableScopePart1
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63645: SENTRY-2032: Leading Slashes need to removed when creating HMS path entries

2017-11-16 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63645/#review191214
---




sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/utils/PathUtils.java
Line 219 (original), 223 (patched)


Not sure why to change the function name. If it were a generic 
splitString() name, then, of course, then we'd need to disambiguate it via more 
descriptive method name, to say what we understand by "splitting".

But splitting a UNIX path (which is what it is) into path elements, is 
exactly what this method does, including treatment of leading, trailing, and 
duplicate slashes. Am I missing something?



sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/utils/PathUtils.java
Line 220 (original), 225 (patched)


at this point too late to check path != null. If the input path was 
originally null, then the first line will throw NullPointerException. If path 
is not null, then String.replaceAll() will never return null.

Should, probably, just do
if (path != null) {
  do stuff
} else {
  throw new IllegalArgumentException("Null path");
}

to put the most common case first. This function may be used a lot inside 
the loops, so it better be efficiant.

As to making non-null path part of the contract, we have no way to ensure 
it, and when it's broken, it's betetr to have clear message.



sentry-core/sentry-core-common/src/test/java/org/apache/sentry/core/common/utils/TestPathUtils.java
Lines 152 (patched)


I'd recommend more informative test. First, create List expected = 
Arrays.asList("a", "b", "c");

then for each test case do just this:

assertEquals(expected, Arrays.asList(PathUtils.strip...());

this automatically checks the size, but it also checks exact equality of 
two values, while the existing tests will miss something like ["a", "b", "c"] 
vs [ "", "a", "b"].

The above style also allows you to define upfront all possible paths as an 
array of strings, and then do all assertions inside the loop. More compact and 
easy to add new test cases.


- Vadim Spector


On Nov. 8, 2017, 3:03 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63645/
> ---
> 
> (Updated Nov. 8, 2017, 3:03 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> When retrieving full path image update, we split a path by "/" and create HMS 
> Path entries. However, the leading "/" presence will cause issues because on 
> splitting the value at index 0 will be empty. This will affect the creation 
> of HMS path entries.
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/core/common/utils/PathUtils.java
>  cef8bd734 
>   
> sentry-core/sentry-core-common/src/test/java/org/apache/sentry/core/common/utils/TestPathUtils.java
>  8419b9d5e 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  7217dea7a 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/NotificationProcessor.java
>  d92f23e63 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  cf83e7796 
> 
> 
> Diff: https://reviews.apache.org/r/63645/diff/2/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestSentryStore
> mvn -f sentry-core/sentry-core-common/pom.xml test -Dtest=TestPathUtils
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63646: SENTRY-2035: Metrics should move to destination atomically

2017-11-07 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63646/#review190414
---


Ship it!




Ship It!

- Vadim Spector


On Nov. 7, 2017, 9:47 p.m., Alexander Kolbasov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63646/
> ---
> 
> (Updated Nov. 7, 2017, 9:47 p.m.)
> 
> 
> Review request for sentry, Brian Towles, kalyan kumar kalvagadda, Na Li, 
> Sergio Pena, and Vadim Spector.
> 
> 
> Bugs: ENTRY-2035
> https://issues.apache.org/jira/browse/ENTRY-2035
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> SENTRY-2035: Metrics should move to destination atomically
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryMetrics.java
>  86cae64fd4bbab162a09a1ca27a6d3887bfc2dc1 
> 
> 
> Diff: https://reviews.apache.org/r/63646/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alexander Kolbasov
> 
>



Re: Review Request 63596: SENTRY-1951 - Old SentryStore.retrieveFullPathsImage() should be removed

2017-11-06 Thread Vadim Spector via Review Board


> On Nov. 6, 2017, 10:11 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
> > Lines 2453 (patched)
> > 
> >
> > In the original test code all paths were absolute, i.e starting with 
> > "/". After your changes, all table paths start with "prefix", which does 
> > _not_ begin with "/". Why is that? What has changed? If it's correct, it's 
> > worth at least one sentence of javadoc.
> 
> Arjun Mishra wrote:
> Yes so based don my testing it is required that the front slash be left 
> out when you call persist full paths image. Please see 
> SentryStore#retrieveFullPathsImageCore, look at line "String[] pathComponents 
> = PathUtils.splitPath(path);". If the starting "/" is included, one of the 
> path elements would be empty and this would affect the tree structure. 
> 
> That's why all "/" have been removed from paths, that are being passed in 
> as arguments into persisting paths image.
> 
> Arjun Mishra wrote:
> All leading "/" have been removed

So, again, what has changed? Did the tests passed before? Why did they, if they 
had leading slashes?


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63596/#review190231
---


On Nov. 6, 2017, 9:52 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63596/
> ---
> 
> (Updated Nov. 6, 2017, 9:52 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The old retrieveFullPathsImage() method in SentryStore is no longer used by 
> actual code (retrieveFullPathsImageUpdate(final String[] prefixes) is used 
> instead. It was preserved because it is used by test which now doesn't make 
> much sense.
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestImageRetriever.java
>  b355630e7 
>   
> sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestSentryHDFSServiceProcessor.java
>  f09d1b228 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  0cd6e48aa 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  cf83e7796 
> 
> 
> Diff: https://reviews.apache.org/r/63596/diff/1/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestSentryStore
> mvn -f sentry-hdfs/sentry-hdfs-service/pom.xml test 
> -Dtest=TestSentryHDFSServiceProcessor
> mvn -f sentry-hdfs/sentry-hdfs-service/pom.xml test 
> -Dtest=TestImageRetriever.java
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63596: SENTRY-1951 - Old SentryStore.retrieveFullPathsImage() should be removed

2017-11-06 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63596/#review190231
---




sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
Lines 2453 (patched)


In the original test code all paths were absolute, i.e starting with "/". 
After your changes, all table paths start with "prefix", which does _not_ begin 
with "/". Why is that? What has changed? If it's correct, it's worth at least 
one sentence of javadoc.


- Vadim Spector


On Nov. 6, 2017, 9:52 p.m., Arjun Mishra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63596/
> ---
> 
> (Updated Nov. 6, 2017, 9:52 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Sergio Pena, and Vadim Spector.
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The old retrieveFullPathsImage() method in SentryStore is no longer used by 
> actual code (retrieveFullPathsImageUpdate(final String[] prefixes) is used 
> instead. It was preserved because it is used by test which now doesn't make 
> much sense.
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestImageRetriever.java
>  b355630e7 
>   
> sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestSentryHDFSServiceProcessor.java
>  f09d1b228 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  0cd6e48aa 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  cf83e7796 
> 
> 
> Diff: https://reviews.apache.org/r/63596/diff/1/
> 
> 
> Testing
> ---
> 
> mvn -f sentry-provider/sentry-provider-db/pom.xml test -Dtest=TestSentryStore
> mvn -f sentry-hdfs/sentry-hdfs-service/pom.xml test 
> -Dtest=TestSentryHDFSServiceProcessor
> mvn -f sentry-hdfs/sentry-hdfs-service/pom.xml test 
> -Dtest=TestImageRetriever.java
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>



Re: Review Request 63086: SENTRY-1993: StringIndexOutOfBoundsException in HMSPathsDumper.java

2017-10-20 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63086/#review188866
---


Ship it!




New patch added call to hmsPaths.addPathsToAuthzObject(), which allows passing 
paths with empty elements. This is a public API, so it should be used for other 
tests too, but this will be addressed by another JIRA. For the scope of this 
JIRA, it would cause StringIndexOutOfBoundsException without Misha's fix, so 
now we directly test the fix.

- Vadim Spector


On Oct. 20, 2017, 7:14 p.m., Misha Dmitriev wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63086/
> ---
> 
> (Updated Oct. 20, 2017, 7:14 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov and Vadim Spector.
> 
> 
> Bugs: SENTRY-1993
> https://issues.apache.org/jira/browse/SENTRY-1993
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> SENTRY-1993: StringIndexOutOfBoundsException in HMSPathsDumper.java
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPathsDumper.java
>  1267093dbbdeb291ec01ecdc87253a90e8ab98ac 
>   
> sentry-hdfs/sentry-hdfs-common/src/test/java/org/apache/sentry/hdfs/TestHMSPathsFullDump.java
>  6a4e32f6ad180bc1c2ecb6d0be7cdae6c586505d 
> 
> 
> Diff: https://reviews.apache.org/r/63086/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Misha Dmitriev
> 
>



Re: Review Request 63086: SENTRY-1993: StringIndexOutOfBoundsException in HMSPathsDumper.java

2017-10-18 Thread Vadim Spector via Review Board


> On Oct. 18, 2017, 2:44 p.m., Vadim Spector wrote:
> > Ship It!

I'll proceed with commit today


- Vadim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63086/#review188492
---


On Oct. 17, 2017, 7:12 p.m., Misha Dmitriev wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63086/
> ---
> 
> (Updated Oct. 17, 2017, 7:12 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov and Vadim Spector.
> 
> 
> Bugs: SENTRY-1993
> https://issues.apache.org/jira/browse/SENTRY-1993
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> SENTRY-1993: StringIndexOutOfBoundsException in HMSPathsDumper.java
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPathsDumper.java
>  1267093dbbdeb291ec01ecdc87253a90e8ab98ac 
>   
> sentry-hdfs/sentry-hdfs-common/src/test/java/org/apache/sentry/hdfs/TestHMSPathsFullDump.java
>  6a4e32f6ad180bc1c2ecb6d0be7cdae6c586505d 
> 
> 
> Diff: https://reviews.apache.org/r/63086/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Misha Dmitriev
> 
>



Re: Review Request 63086: SENTRY-1993: StringIndexOutOfBoundsException in HMSPathsDumper.java

2017-10-18 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63086/#review188492
---


Ship it!




Ship It!

- Vadim Spector


On Oct. 17, 2017, 7:12 p.m., Misha Dmitriev wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63086/
> ---
> 
> (Updated Oct. 17, 2017, 7:12 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov and Vadim Spector.
> 
> 
> Bugs: SENTRY-1993
> https://issues.apache.org/jira/browse/SENTRY-1993
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> SENTRY-1993: StringIndexOutOfBoundsException in HMSPathsDumper.java
> 
> 
> Diffs
> -
> 
>   
> sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPathsDumper.java
>  1267093dbbdeb291ec01ecdc87253a90e8ab98ac 
>   
> sentry-hdfs/sentry-hdfs-common/src/test/java/org/apache/sentry/hdfs/TestHMSPathsFullDump.java
>  6a4e32f6ad180bc1c2ecb6d0be7cdae6c586505d 
> 
> 
> Diff: https://reviews.apache.org/r/63086/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Misha Dmitriev
> 
>