Re: Review Request 65244: SENTRY-2074: Fix maven dependencies to have all directly used libraries defined

2018-01-23 Thread Colm O hEigeartaigh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65244/#review196040
---



Thanks for the explanation - I see the dependency list is greatly reduced in 
the distribution, nice! A few comments/questions:

a) Please remove the whitespace changes:

warning: squelched 48 whitespace errors
warning: 53 lines add whitespace errors.

b) commons-httpclinet.version in the root pom should be 
"commons-httpclient.version"

c) Are the folllowing jars really required?

zookeeper-3.4.6-tests.jar
curator-test-2.11.1.jar

d) Do we need both Stax API jars?

stax-api-1.0.1.jar
stax-api-1.0-2.jar

e) Can we use a consistent Jackson version?

jackson-annotations-2.2.2.jar
jackson-core-2.2.2.jar
jackson-core-asl-1.8.8.jar
jackson-databind-2.2.2.jar
jackson-jaxrs-1.8.3.jar
jackson-mapper-asl-1.8.8.jar
jackson-xc-1.8.3.jar

- Colm O hEigeartaigh


On Jan. 22, 2018, 3:12 p.m., Brian Towles wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65244/
> ---
> 
> (Updated Jan. 22, 2018, 3:12 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, Arjun Mishra, Colm O 
> hEigeartaigh, kalyan kumar kalvagadda, Na Li, Sergio Pena, and Vadim Spector.
> 
> 
> Bugs: SENTRY-2074
> https://issues.apache.org/jira/browse/SENTRY-2074
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> Resubmitted from https://reviews.apache.org/r/65192/ in order to fix changed 
> paths file from master.
> 
> Created a patch that has all of the direct dependency libraries explicitly 
> defined.  There is also some minor cleanup on ThreadSafe annotations to make 
> it from a consistent package (this is only used by IDEs for checks anyways).  
> As well I added a check in the poms to fail a build if it discovers any 
> direct use dependencies that are not defined in the pom.
> 
> The one of the primary motivations for this patch is to help on its way to 
> cleaning up the distribution.  Currently the dist module reads all 
> dependencies no mater what scope they are and drops them into the distributed 
> libs.  This causes things like junit and ant to be pushed into the libs that 
> are being distributed.  With the changes to have direct dependencies always 
> defined it allows us to take compile and runtime scopes only into account 
> when dropping the libs needed.  
> 
> As well this identifies which libraries are provided already by environments 
> where the plugins/bindings are going into.  For example in the hive bindings, 
> the hive and hadoop libraries need only be defined with "provided" scope, 
> since with those application we want to use the hadoop and hive libraries 
> that the applications already provide.
> 
> This makes it a lot easier for shading and package shifting of the binding 
> and plugins for libraries and versions of those libraries that are needed by 
> the binding and might conflict with versions already in the application which 
> the binding or plugin is going into.  Guava is a major issue with this.  
> Doing this short of shading based on the cleanup would allow us to rev Guava 
> and use newer Guava features while not conflicting with the Guava version the 
> application is using.  By having the directly used dependency defined it 
> gives us control over the exact version we are using and not be dependent on 
> and having conflicts with the transitive dependencies of the application 
> being embedded in.
> 
> This patch will not really make the development process harder since the 
> analysis of the dependencies needed automatically runs as part of the build 
> and a failure occurs telling you which "used but undefined" and which 
> "defined but unused" libraries are missing or in the pom.  There is even an 
> xml dump of the dependencies part need to put right into the pom. No 
> additional runs or dependency analysis needs to take place.
> 
> 
> Diffs
> -
> 
>   pom.xml 6f9856e45b72ef9e0c43a222eddc8452b64f1a71 
>   sentry-binding/pom.xml 17b0f1afa658203327668b33990e5da53dd75ad0 
>   sentry-binding/sentry-binding-hbase-indexer/pom.xml 
> d50acfec954acf8e8d9682425122039e6f5a2724 
>   sentry-binding/sentry-binding-hive-common/pom.xml 
> e39004bd1681f1424a463b59d05359206b94777b 
>   sentry-binding/sentry-binding-hive-conf/pom.xml 
> 3e7e70a321733a97914f13f9cc5e39557ab1c9fb 
>   sentry-binding/sentry-binding-hive-follower/pom.xml 
> 5f8a5afb479b783d027a005ea8f72590445b8386 
>   sentry-binding/sentry-binding-hive/pom.xml 
> ccfa9cfeab3cafcccd275a5b395ca6eb3f7aa609 
>   sentry-binding/sentry-binding-kafka/pom.xml 
> 239eeba5fe28962e6481dff8f5b5bd1303e8666a 
>   sentry-binding/sentry-binding-solr/pom.xml 
> f08669994aa6d058fc9c92e6a8b576063602cc95 
>   sentry-binding/sentry-binding-sqoop/pom.xml

Re: Review Request 65268: SENTRY-1904: TransactionManager should limit the max time spent by transaction retry

2018-01-23 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65268/
---

(Updated Jan. 23, 2018, 6:43 p.m.)


Review request for sentry, Arjun Mishra, Na Li, Sergio Pena, and Vadim Spector.


Changes
---

Changed the solution approach.


Bugs: SENTRY-1904
https://issues.apache.org/jira/browse/SENTRY-1904


Repository: sentry


Description (updated)
---

The TransactionManager uses exponential backoff strategy for transaction 
retries. This may cause some transactions to be delayed by a very long time. We 
should also have a constraint on the max time for a transaction so that we do 
not retry for too long. 

New patch that is attached adds upper bounds on below

1.Interval between the retry attempts which increases exponentially.
2.Total time a transaction could spend in retries.
 

With out these limits we would not have a control on how long a transaction 
could be be active.


Diffs (updated)
-

  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
 2f2b98412e7dfdcc847ffe7975a70f452554e747 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
 edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/TransactionManager.java
 f4ff962a67f8a5c23cc5c8daa7bcb861d2e6b6a5 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
 7e02874b4be6a7109108809b1f404fe971b7b8e2 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
 b4100278392986c161625a366212c6fef66ec0a9 


Diff: https://reviews.apache.org/r/65268/diff/2/

Changes: https://reviews.apache.org/r/65268/diff/1-2/


Testing
---

Made sure all the tests pass.


Thanks,

kalyan kumar kalvagadda



Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-23 Thread kalyan kumar kalvagadda via Review Board


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Line 60 (original), 61 (patched)
> > 
> >
> > This using probably unintentional use of Bool instead of primitive 
> > boolean. There seems to be no reason for using autoboxed types.

it was un-intentional. I will update it.


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 70 (patched)
> > 
> >
> > So, the assumption is that the average number of duplicates per each ID 
> > cannot exceed 3, correct?

Re-tried notifications will be found in the cache even if all of them have 3 
duplicates.


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 67 (patched)
> > 
> >
> > I wonder about the choice of LRUMap for cache.
> > 
> > The ideal data structure would one only store the caches for the events 
> > in the ID range that we want, which is [MAX(event_id) - refetch_count, 
> > MAX(event_id)].
> > 
> > LRUMap does not explicitly address this requirement. It assumes that 
> > the oldest (least accessed) caches will tend to be for the events with the 
> > oldest IDs. We know this is often not true, which is one of the problems 
> > this JIRA tries to work around.
> > 
> > Given that when we do not find the entry in cache we check SentryStore 
> > anyway, everything will still work. But we may end up with a few extra DB 
> > calls to SentryStore and it's hard to predict the overhead.
> > 
> > The cache structure solving this shortcoming could be 
> > HashSet[notificationFetchCount +1] array.
> > 
> > It's the circular buffer structure where the cache[ID % cache.length] 
> > element contains the HashSet of all caches for the events with the given 
> > event ID. We can see just as fast if the event has been processed.
> > 
> > Before each fetch we can calculate which elements in this cache array 
> > correspond to the IDs outside the current fetching range and nullify them 
> > (for curcular buffer some care is needed for modulo arithmetic).
> > 
> > This way we know that
> > a) cache stores the caches of the events only in the range we care 
> > about, though now cleanup is explicit.
> > b) we do not necessarily need to put the upper limit on the number of 
> > events with the given ID (the sizes of HashSet's).
> > 
> > It may save some expensive DB calls.
> > 
> > However, I do not feel strongly one way or the other until we test it 
> > and see the overhead. Perhaps the fudge factor x3 in LRUMap size is enough 
> > to keep enough entries, old or new; so I'm leaving it up to you.
> 
> Na Li wrote:
> Vadim, I prefer to use circular buffer as well. Mentioned this to Kalyan 
> before. The memory usage and execution overhead should be much smaller than 
> LRUMap.
> 
> Vadim Spector wrote:
> I like that if the entry is not found in the cache, there is no need to 
> go to the database to double-check. Can't beat that.

Help we understand your comment.
"LRUMap does not explicitly address this requirement. It assumes that the 
oldest (least accessed) caches will tend to be for the events with the oldest 
IDs. We know this is often not true, which is one of the problems this JIRA 
tries to work around."

Can you give me one example where you think it would not be found in the cache?


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 119 (patched)
> > 
> >
> > any reason for returning Boolean vs bool?

it was un-intentional. I will update it.


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 231 (patched)
> > 
> >
> > optimization - no need for this DB call if the ID > max(storedId)

will consider it in the new patch.


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64955/#review195917
---


On Jan. 22, 2018, 5:40 p.m., kalyan kumar kalvagadda wrote:
> 
> 

Re: Review Request 64955: SENTRY-2109: Fix the logic of identifying HMS out of Sync

2018-01-23 Thread Vadim Spector via Review Board


> On Jan. 22, 2018, 8:49 p.m., Vadim Spector wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HiveNotificationFetcher.java
> > Lines 67 (patched)
> > 
> >
> > I wonder about the choice of LRUMap for cache.
> > 
> > The ideal data structure would one only store the caches for the events 
> > in the ID range that we want, which is [MAX(event_id) - refetch_count, 
> > MAX(event_id)].
> > 
> > LRUMap does not explicitly address this requirement. It assumes that 
> > the oldest (least accessed) caches will tend to be for the events with the 
> > oldest IDs. We know this is often not true, which is one of the problems 
> > this JIRA tries to work around.
> > 
> > Given that when we do not find the entry in cache we check SentryStore 
> > anyway, everything will still work. But we may end up with a few extra DB 
> > calls to SentryStore and it's hard to predict the overhead.
> > 
> > The cache structure solving this shortcoming could be 
> > HashSet[notificationFetchCount +1] array.
> > 
> > It's the circular buffer structure where the cache[ID % cache.length] 
> > element contains the HashSet of all caches for the events with the given 
> > event ID. We can see just as fast if the event has been processed.
> > 
> > Before each fetch we can calculate which elements in this cache array 
> > correspond to the IDs outside the current fetching range and nullify them 
> > (for curcular buffer some care is needed for modulo arithmetic).
> > 
> > This way we know that
> > a) cache stores the caches of the events only in the range we care 
> > about, though now cleanup is explicit.
> > b) we do not necessarily need to put the upper limit on the number of 
> > events with the given ID (the sizes of HashSet's).
> > 
> > It may save some expensive DB calls.
> > 
> > However, I do not feel strongly one way or the other until we test it 
> > and see the overhead. Perhaps the fudge factor x3 in LRUMap size is enough 
> > to keep enough entries, old or new; so I'm leaving it up to you.
> 
> Na Li wrote:
> Vadim, I prefer to use circular buffer as well. Mentioned this to Kalyan 
> before. The memory usage and execution overhead should be much smaller than 
> LRUMap.
> 
> Vadim Spector wrote:
> I like that if the entry is not found in the cache, there is no need to 
> go to the database to double-check. Can't beat that.
> 
> kalyan kumar kalvagadda wrote:
> Help we understand your comment.
> "LRUMap does not explicitly address this requirement. It assumes that the 
> oldest (least accessed) caches will tend to be for the events with the oldest 
> IDs. We know this is often not true, which is one of the problems this JIRA 
> tries to work around."
> 
> Can you give me one example where you think it would not be found in the 
> cache?

For simplicity of the example, suppose we look back by 2 IDs. It means that we 
fetch with starting number (MaxID - 2) to include 3 IDs: (MaxID-2, MaxID-1, 
MaxID). Therefore, the LRUMap size is 3 * 3 = 9.
Suppose, starting point is LRUMap is empty (just to keep the explanation 
simple), then we perform several fetches, in the exact order specified (first 
listed ID arrives first):

fetch #1 (start_id = 0): 3, 3, 3, 2, 2, 2, 1, 1, 1
 Out of order, can happen, the entries for ID = 3 are the oldest ones. 
LRUMap reaches max size 9. maxID = 3.

fetch #2 (start_id = 3 - 2 = 1):  3, 3, 3, 2, 2, 2, 1, 1, 1 (re-fetched from 
fetch #1), 3, 4 (new events)
 The LRUMap automatically purges the first 2 entries with the ID = 3 to 
accomodate new events 3 and 4. maxID = 4.

fetch #3 (start_id = 4 - 2 = 2): 3, 3, 3, 2, 2, 2, (refetched from fetch #1, 
excluding ID = 1, as it is < 2), 3, 4 (refetched from fetch #2), 5 (new)

Let's stop here. In fetch #3 we still get the first two events with ID = 3, 
which were originally in fetch #1 and fetch #2. However, they are no longer in 
LRUMap, purged after fetch #2, so we end up with DB lookup.

I just provided the simplest example. It is probably extreme that the event 
with the biggest ID in the fetch arrives first in the fetch, as in fetch #1. 
However, it can arrive somewhere in the middle of the fetch too, and some time 
later LRU will still purge it ahead of the events with the smaller ID that 
should've been purged first.

As I was working through this example, I realized that things could be better 
if after each fetch the events were first sorted by their IDs before adding to 
LRUMap. I do not think it will entirely solve the problem of prematurely purged 
entries. The example where all IDs get sorted prior to putting them into LRUMap:

fetch #1 (start_id = 0): 1, 1, 1, 2, 2, 2, 3, 3, 3
 All in order. MaxID = 3

fetch #2 (start_id = 3 - 2 = 1): 1, 1, 1, 2, 2, 2, 3, 3, 3 (from fetch #1), 1, 
1, 1, 1, 2, 2, 2, (new), 

Re: Review Request 65268: SENTRY-1904: TransactionManager should limit the max time spent by transaction retry

2018-01-23 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65268/#review196076
---


Ship it!




Ship It!

- Na Li


On Jan. 23, 2018, 6:43 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65268/
> ---
> 
> (Updated Jan. 23, 2018, 6:43 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, Sergio Pena, and Vadim 
> Spector.
> 
> 
> Bugs: SENTRY-1904
> https://issues.apache.org/jira/browse/SENTRY-1904
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The TransactionManager uses exponential backoff strategy for transaction 
> retries. This may cause some transactions to be delayed by a very long time. 
> We should also have a constraint on the max time for a transaction so that we 
> do not retry for too long. 
> 
> New patch that is attached adds upper bounds on below
> 
> 1.Interval between the retry attempts which increases exponentially.
> 2.Total time a transaction could spend in retries.
>  
> 
> With out these limits we would not have a control on how long a transaction 
> could be be active.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/TransactionManager.java
>  f4ff962a67f8a5c23cc5c8daa7bcb861d2e6b6a5 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
> 
> 
> Diff: https://reviews.apache.org/r/65268/diff/2/
> 
> 
> Testing
> ---
> 
> Made sure all the tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 65268: SENTRY-1904: TransactionManager should limit the max time spent by transaction retry

2018-01-23 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65268/#review196078
---




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
Lines 421 (patched)


Since this code is inside catch {} and serves the purpose of clarifying the 
possible exception cause, would it make sense to put it inside its own 
try-catch so if the exception is thrown, it would be logged, not re-thrown from 
the parent catch {} case?


- Vadim Spector


On Jan. 23, 2018, 6:43 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65268/
> ---
> 
> (Updated Jan. 23, 2018, 6:43 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, Sergio Pena, and Vadim 
> Spector.
> 
> 
> Bugs: SENTRY-1904
> https://issues.apache.org/jira/browse/SENTRY-1904
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The TransactionManager uses exponential backoff strategy for transaction 
> retries. This may cause some transactions to be delayed by a very long time. 
> We should also have a constraint on the max time for a transaction so that we 
> do not retry for too long. 
> 
> New patch that is attached adds upper bounds on below
> 
> 1.Interval between the retry attempts which increases exponentially.
> 2.Total time a transaction could spend in retries.
>  
> 
> With out these limits we would not have a control on how long a transaction 
> could be be active.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/TransactionManager.java
>  f4ff962a67f8a5c23cc5c8daa7bcb861d2e6b6a5 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
> 
> 
> Diff: https://reviews.apache.org/r/65268/diff/2/
> 
> 
> Testing
> ---
> 
> Made sure all the tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 65268: SENTRY-1904: TransactionManager should limit the max time spent by transaction retry

2018-01-23 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65268/
---

(Updated Jan. 23, 2018, 11:40 p.m.)


Review request for sentry, Arjun Mishra, Na Li, Sergio Pena, and Vadim Spector.


Changes
---

Addressed review comment from vadim


Bugs: SENTRY-1904
https://issues.apache.org/jira/browse/SENTRY-1904


Repository: sentry


Description
---

The TransactionManager uses exponential backoff strategy for transaction 
retries. This may cause some transactions to be delayed by a very long time. We 
should also have a constraint on the max time for a transaction so that we do 
not retry for too long. 

New patch that is attached adds upper bounds on below

1.Interval between the retry attempts which increases exponentially.
2.Total time a transaction could spend in retries.
 

With out these limits we would not have a control on how long a transaction 
could be be active.


Diffs (updated)
-

  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
 2f2b98412e7dfdcc847ffe7975a70f452554e747 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
 edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/TransactionManager.java
 f4ff962a67f8a5c23cc5c8daa7bcb861d2e6b6a5 
  
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
 7e02874b4be6a7109108809b1f404fe971b7b8e2 
  
sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
 b4100278392986c161625a366212c6fef66ec0a9 


Diff: https://reviews.apache.org/r/65268/diff/3/

Changes: https://reviews.apache.org/r/65268/diff/2-3/


Testing
---

Made sure all the tests pass.


Thanks,

kalyan kumar kalvagadda



Re: Review Request 65268: SENTRY-1904: TransactionManager should limit the max time spent by transaction retry

2018-01-23 Thread Vadim Spector via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65268/#review196083
---


Ship it!




Ship It!

- Vadim Spector


On Jan. 23, 2018, 11:40 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65268/
> ---
> 
> (Updated Jan. 23, 2018, 11:40 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, Sergio Pena, and Vadim 
> Spector.
> 
> 
> Bugs: SENTRY-1904
> https://issues.apache.org/jira/browse/SENTRY-1904
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> The TransactionManager uses exponential backoff strategy for transaction 
> retries. This may cause some transactions to be delayed by a very long time. 
> We should also have a constraint on the max time for a transaction so that we 
> do not retry for too long. 
> 
> New patch that is attached adds upper bounds on below
> 
> 1.Interval between the retry attempts which increases exponentially.
> 2.Total time a transaction could spend in retries.
>  
> 
> With out these limits we would not have a control on how long a transaction 
> could be be active.
> 
> 
> Diffs
> -
> 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java
>  2f2b98412e7dfdcc847ffe7975a70f452554e747 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  edea5b64d8f98c93aafc1fe43fa97e00c2ce2948 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/TransactionManager.java
>  f4ff962a67f8a5c23cc5c8daa7bcb861d2e6b6a5 
>   
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java
>  7e02874b4be6a7109108809b1f404fe971b7b8e2 
>   
> sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  b4100278392986c161625a366212c6fef66ec0a9 
> 
> 
> Diff: https://reviews.apache.org/r/65268/diff/3/
> 
> 
> Testing
> ---
> 
> Made sure all the tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>