> On May 15, 2017, 4:22 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 328 (patched)
> > <https://reviews.apache.org/r/59274/diff/1/?file=1718347#file1718347line329>
> >
> > "(which indicates a gap and some notifications are removed from meta
> > store and not received by Sentry)"
> >
> > What do you mean by that? What scenario would cause this to happen?
> >
> > Sergio - what is the spec for notifications - does it guarantee that
> > it returns *all* events since the specified value or *some number of
> > consecutive events*?

Scenarios where this gap might happen are:

- Sentry is down for more than 24h. HMS clears all entries that have been in the DB for more than 24h. If HMS clears them before Sentry starts, then there is a gap.
- The same applies when HDFS sync is disabled. If HDFS sync is disabled at some point after Sentry already took a snapshot, and it stays disabled for more than 24h, then we have a gap too.

Regarding the number of notifications: the HMS will return all notifications from the specified value up to the maximum specified, although it might return fewer if we specify a filter (Sentry does not use a filter). See
http://github.mtv.cloudera.com/CDH/hive/blob/cdh5-1.1.0/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L7629

There are no comments in that code that guarantee this behavior, though. Since this is an external project, there is always the risk that the API implementation changes. We could try to check whether the oldest notification ID received is the next sequential ID after the last one Sentry processed.

IMPROVEMENT: We could write code to improve this and request notifications in batches (on the Sentry side) on every HMSFollower run.


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59274/#review174967
-----------------------------------------------------------


On May 15, 2017, 2:58 p.m., Sergio Pena wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59274/
> -----------------------------------------------------------
>
> (Updated May 15, 2017, 2:58 p.m.)
>
>
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.
>
>
> Bugs: SENTRY-1760
>     https://issues.apache.org/jira/browse/SENTRY-1760
>
>
> Repository: sentry
>
>
> Description
> -------
>
> The patch will set 'needHiveSnapshot' to TRUE whenever one of the following
> cases is found:
>
> - The list of notifications received is different from what was expected.
>   This may happen when Sentry has been down or HDFS sync was disabled for a
>   while (more than 24h), and the HMS cleared old notifications (older than
>   24h) not yet processed by Sentry, causing a gap when retrieving
>   notifications.
>
> - The latest notification ID processed by Sentry is bigger than the current
>   HMS notification ID.
>   This may happen when the HMS DB data was reset or restored from an old
>   snapshot, causing sync issues with Sentry.
>
> When needHiveSnapshot is set to TRUE, the HMSFollower will CLEAR any
> hive snapshot stored in the Sentry store and recreate a new hive snapshot
> from scratch to keep Sentry in sync.
>
>
> Diffs
> -----
>
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java ef6786537e9c5f7730bc86d44e8b4a168c20677e
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 5e6b906587f6422d9bf1466ab83815722bd51fb0
>
>
> Diff: https://reviews.apache.org/r/59274/diff/1/
>
>
> Testing
> -------
>
> HadoopQA is GREEN.
>
>
> Thanks,
>
> Sergio Pena
>
>
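
For reference, the two conditions from the description, plus the "oldest ID is the next sequential ID" check suggested in the reply, could look roughly like the sketch below. This is only an illustration, not the code in the patch: the class name NotificationGapCheck, the method needsFullSnapshot and its parameters are all hypothetical, and it assumes that HMS notification IDs increase by exactly 1 and that the fetched IDs arrive oldest first.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Sentry code base.
final class NotificationGapCheck {

    private NotificationGapCheck() {}

    /**
     * Decides whether a full Hive snapshot looks necessary.
     *
     * @param lastProcessedId last notification ID Sentry has processed
     * @param currentHmsId    current notification ID reported by HMS
     * @param fetchedIds      IDs of the fetched notifications, oldest first
     */
    static boolean needsFullSnapshot(long lastProcessedId, long currentHmsId,
                                     List<Long> fetchedIds) {
        // Case 1: Sentry is ahead of HMS, e.g. the HMS DB was reset or
        // restored from an old snapshot.
        if (lastProcessedId > currentHmsId) {
            return true;
        }
        // No notifications were fetched, so there is nothing to compare.
        if (fetchedIds.isEmpty()) {
            return false;
        }
        // Case 2: the oldest fetched ID is not the next sequential ID, so
        // some notifications were probably purged before Sentry read them.
        // Assumption: IDs increase by exactly 1 with no holes.
        long oldest = fetchedIds.get(0);
        return oldest != lastProcessedId + 1;
    }

    public static void main(String[] args) {
        // Contiguous IDs after 10: no snapshot needed.
        System.out.println(needsFullSnapshot(10, 13, Arrays.asList(11L, 12L, 13L)));
        // IDs 11..14 are missing: gap, snapshot needed.
        System.out.println(needsFullSnapshot(10, 20, Arrays.asList(15L, 16L)));
        // HMS is behind Sentry: snapshot needed.
        System.out.println(needsFullSnapshot(10, 4, Collections.<Long>emptyList()));
    }
}
```

Checking only the oldest ID matches the suggestion in the reply; a stricter variant could walk the whole list and verify that every ID follows its predecessor, at the cost of depending even more on HMS behavior that is not documented as guaranteed.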