> On June 15, 2017, 5 a.m., Alexander Kolbasov wrote: > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/model/MHiveMetastoreImage.java > > Lines 28 (patched) > > <https://reviews.apache.org/r/59720/diff/8/?file=1751335#file1751335line28> > > > > Naming - although the info comes from HMS, it isn't any image of > > HiveMetastore. It is essentially the generation. > > Sergio Pena wrote: > I couldn't find a better name before, but now I did. > What about MAuthzPathsSnapshotId? This reflects the intention of the > table that stores only snapshot identifiers for authz-paths.
I think this is better. > On June 15, 2017, 5 a.m., Alexander Kolbasov wrote: > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java > > Lines 2582 (patched) > > <https://reviews.apache.org/r/59720/diff/8/?file=1751337#file1751337line2587> > > > > Do we really need to do it within the same transaction? It could be Ok, > > but if this is really expensive it might be better to do it asynchronously. > > We can use this approach and see whether it causes problems. > > Sergio Pena wrote: > This is tricky. There are some scenarios that I think it may cause > conflicts. > > 1. Deleting all in one transaction, then persisting in another > transaction may cause that the HMSFollower detects that MAuthzPathsMapping is > empty and attempts to persist a new image while one is in progress. > > 2. If a HMSFollower detected that MAuthzPathsMapping is empty and it > triggers a new transaction to persist a snapshot, but just milliseconds > before, the old thread finishes persisting, then this new thread will persist > a new image (with a new ID) but containg duplicated content on MPaths. There > are some methods that get information about MPaths without joining to > MAuthzPathsMapping, so this will get incorrect results. > > So I thought having this delete in the same transaction would help me > avoiding the above issues. I meant a bit different thing. It is safe to keep old snapshots lying around - we can delete them later at our leisure. We can always persist a new snapshot and your code makes sure that we only pick up elements from the most up-to-date snapshot. Then it is quite safe. What do you think? You don't have to do it now though - we may do it later if we still consider it useful. > On June 15, 2017, 5 a.m., Alexander Kolbasov wrote: > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java > > Line 2583 (original), 2613 (patched) > > <https://reviews.apache.org/r/59720/diff/8/?file=1751337#file1751337line2618> > > > > Can this ever happen? > > Sergio Pena wrote: > This should not happen ever; but the backend DB does not have any > constraint that a user modifies the table manually. > Should we at least display an error in case this is found for some weird > reason? My major concern is that we are doing a lot of extra work for each record which we don't need to do. This will cause slowdown when we enable Sentry and make the initial snapshot way longer then it should be. My suggestion is to remove the whole fetching of old value and comparing whether it is present or not. - Alexander ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/59720/#review177967 ----------------------------------------------------------- On June 14, 2017, 6:25 p.m., Sergio Pena wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/59720/ > ----------------------------------------------------------- > > (Updated June 14, 2017, 6:25 p.m.) > > > Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and > Vamsee Yarlagadda. > > > Bugs: SENTRY-1781 > https://issues.apache.org/jira/browse/SENTRY-1781 > > > Repository: sentry > > > Description > ------- > > This patch only adds an extra column to the MAuthzPathsMapping table that > contains an image identifier that will be used to detect full snapshots. This > patch does not increment the identifier yet, it only uses 0 for now. > > TIPS for reviewers: > > - package.jdo adds a new column name AUTHZ_IMG_ID to the AUTHZ_MAPS_MAPPING > table used to identify the paths that are part of a new hive snapshot. > > - MAuthzPathsMapping.java adds the the new column and API to handle JDO > requests. > > - SentryStore.java modifies all methods that interact with the snapshots. It > makes sure that > when persisting a new snapshot, then the authzImgID is incremented and the > old one is > deleted. Also it makes sure that retrieve/add/delete/rename/update uses the > current > authzImgID whenver they're called. > > - TestSentryStore.java adds unit tests to verify that all SentryStore API > work correctly > when a new hive snapshot is created. > > > Diffs > ----- > > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/model/MAuthzPathsMapping.java > f51894bb7e109c37997e7134e07a82f46c0a3c44 > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/model/MHiveMetastoreImage.java > PRE-CREATION > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/model/package.jdo > 459939b7600e05d57c7e11eefe20c2591cac0b34 > > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java > 8b19c88e52400ce096cf8f38fe2ca0313f7e46e1 > > sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java > 29878c5f1213062cb9d0a1f9666a19004b67dc52 > > > Diff: https://reviews.apache.org/r/59720/diff/8/ > > > Testing > ------- > > All unit tests run locally, but there are some flaky tests on Jenkins no > related to this patch. > > > Thanks, > > Sergio Pena > >