----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68973/#review209408 -----------------------------------------------------------
sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java Lines 3404 (patched) <https://reviews.apache.org/r/68973/#comment293802> This can be an attribute of the class and initialized in the constructor sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java Lines 3411 (patched) <https://reviews.apache.org/r/68973/#comment293803> Why is this declared outside when the scope is only restricted to the for loop? sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java Lines 3420 (patched) <https://reviews.apache.org/r/68973/#comment293804> Can you explain why we need the thread.sleep ? sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java Lines 3424 (patched) <https://reviews.apache.org/r/68973/#comment293805> I think if we know there is an exception we should handle it right away instead of comparing the size with what is persisted. According to the Executors.newFixedThreadPool API comments "If any thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks." Looks like Executors.newFixedThreadPool doesn't throw an exception if one of the tasks failed. If that is true I am not very comfortable with using this approach. sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java Lines 3430 (patched) <https://reviews.apache.org/r/68973/#comment293806> If one snapshot persistance failed, how do we know the subsequent retries will pass? Waiting for part of the snapshot persistence to complete and then checking is not very efficient in my opinion - Arjun Mishra On Oct. 10, 2018, 1:45 p.m., kalyan kumar kalvagadda wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68973/ > ----------------------------------------------------------- > > (Updated Oct. 10, 2018, 1:45 p.m.) > > > Review request for sentry, Arjun Mishra, Na Li, and Sergio Pena. > > > Bugs: SENTRY-2305 > https://issues.apache.org/jira/browse/SENTRY-2305 > > > Repository: sentry > > > Description > ------- > > I have considered multiple options. Persisting in batches is not an option > with out changing the schema as the data nucleus does not persist row in > batches for tables which have foreign key on other tables. > > I see that best option is to persist the paths in parallel. It gave good > results. > > Solution Approach: > > I have used a thread pool to persist the snapshot. Size of this thread pool > is configurable. Paths for each object database/table are submitted to this > thread pool. If for reason some of the paths are not pesisted, snapshot is > removed and exception is throw back. > > This patch along with SENTRY-2423 gave was 5 times faster when tested with > below. > > > > Object Type Count > Databases 209 > Tables 2100 > Partitions 200004 > > > Diffs > ----- > > > sentry-core/sentry-core-common/src/main/java/org/apache/sentry/service/common/ServiceConstants.java > 092060c450c6a906850630cb10454737157af5fe > > sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java > 7a736ca9604eb0bb182a159b5a2aed274275c16e > > sentry-service/sentry-service-server/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java > 4a9afe303672baff39be01d4f190034b2bfb75fe > > > Diff: https://reviews.apache.org/r/68973/diff/1/ > > > Testing > ------- > > Added new test and also made sure that existing tests passed. > > > Thanks, > > kalyan kumar kalvagadda > >