steveloughran edited a comment on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-528858428

Working on the CLI, but it is overreporting errors, especially on versioning.

Scan of a dir did work, but it overreacts to
* no etag on a directory entry
* version Id mismatch

Reports no etag on directory entries where we don't expect one:

```
2019-09-06 13:46:33,571 [main] ERROR s3guard.S3GuardFsckViolationHandler (S3GuardFsckViolationHandler.java:handle(79)) - On path: s3a://hwdev-steve-ireland-new/etc/hadoop No etag.
```

On a scan of a tree it reports version ID mismatches where s3 == null:

```
2019-09-06 13:46:33,572 [main] ERROR s3guard.S3GuardFsckViolationHandler (S3GuardFsckViolationHandler.java:handle(79)) - On path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml getModificationTime mismatch - s3: 1567773093000, ms: 1567773092204 getVersionId mismatch - s3: null, ms: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t
```

The DDB table has the version ID, but I'm assuming that the scan doesn't get them from S3 because we'd need to use HEAD over LIST.

When I give the full path `s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml` it says there's a mismatch but now prints the same value on both sides. This is not a mismatch and should not appear.

```
~/P/R/fsck bin/hadoop s3guard fsck -check s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml
2019-09-06 13:59:52,857 [main] INFO s3guard.S3GuardTool (S3GuardTool.java:initMetadataStore(322)) - Metadata store DynamoDBMetadataStore{region=eu-west-1, tableName=hwdev-steve-ireland-new, tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/hwdev-steve-ireland-new} is initialized.
2019-09-06 13:59:53,057 [main] INFO s3guard.S3GuardFsck (S3GuardFsck.java:compareFileStatusToPathMetadata(217)) - Path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml - Length S3: 8260, MS: 8260 - Etag S3: 2887e7740b821abd405e6a5c70d2081e, MS: 2887e7740b821abd405e6a5c70d2081e
2019-09-06 13:59:53,115 [main] INFO s3guard.S3GuardFsck (S3GuardFsck.java:compareFileStatusToPathMetadata(217)) - Path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml - Length S3: 8260, MS: 8260 - Etag S3: 2887e7740b821abd405e6a5c70d2081e, MS: 2887e7740b821abd405e6a5c70d2081e
2019-09-06 13:59:53,142 [main] ERROR s3guard.S3GuardFsckViolationHandler (S3GuardFsckViolationHandler.java:handle(79)) - On path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml getModificationTime mismatch - s3: 1567773093000, ms: 1567773092204 getVersionId mismatch - s3: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t, ms: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t
2019-09-06 13:59:53,142 [main] ERROR s3guard.S3GuardFsckViolationHandler (S3GuardFsckViolationHandler.java:handle(79)) - On path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml getModificationTime mismatch - s3: 1567773093000, ms: 1567773092204 getVersionId mismatch - s3: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t, ms: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t
2019-09-06 13:59:53,142 [main] INFO s3guard.S3GuardFsck (S3GuardFsck.java:compareS3ToMs(144)) - Total scan time: 0s
2019-09-06 13:59:53,142 [main] INFO s3guard.S3GuardFsck (S3GuardFsck.java:compareS3ToMs(145)) - Scanned entries: 2
```

Note also that the file gets scanned twice. This hints at the scanning playing up when the supplied path is a file, not a dir.

Now I open the file with `hadoop fs -cat s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml`; there's a PUT to the DDB table as the modtime is updated; the next scan doesn't report modtime issues, but it does still mistakenly report the version IDs are different.
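An aside before the output of that second scan: here is a rough sketch of the comparison rules I'd expect, not checked against the patch. Every name in it (`FsckCompareSketch`, `Entry`, `compare`) is made up for illustration. It assumes directories have no etag to check, that a version ID missing on either side means "unknown" rather than "different", and that IDs are compared by value. I have not confirmed that any of these is the actual cause of the false mismatches above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical sketch: none of these names come from the patch. */
final class FsckCompareSketch {

  /** Stand-in for the attributes fsck compares on each side. */
  static final class Entry {
    final boolean isDirectory;
    final String etag;       // null when not known
    final String versionId;  // null when not known, e.g. from a LIST

    Entry(boolean isDirectory, String etag, String versionId) {
      this.isDirectory = isDirectory;
      this.etag = etag;
      this.versionId = versionId;
    }
  }

  /** Returns the violations between the S3 view and the metadata store view. */
  static List<String> compare(Entry s3, Entry ms) {
    List<String> violations = new ArrayList<>();
    // Assumption: directory entries carry no etag, so only check files.
    if (!ms.isDirectory) {
      if (ms.etag == null) {
        violations.add("No etag");
      } else if (s3.etag != null && !s3.etag.equals(ms.etag)) {
        violations.add("etag mismatch");
      }
    }
    // Assumption: a null version ID means "unknown", so skip the check
    // unless both sides supplied one. Compare by value, never with !=.
    if (s3.versionId != null && ms.versionId != null
        && !Objects.equals(s3.versionId, ms.versionId)) {
      violations.add("getVersionId mismatch");
    }
    return violations;
  }

  public static void main(String[] args) {
    Entry dir = new Entry(true, null, null);
    Entry listed = new Entry(false, "2887e7740b821abd405e6a5c70d2081e", null);
    Entry stored = new Entry(false, "2887e7740b821abd405e6a5c70d2081e",
        "AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t");
    System.out.println("directory: " + compare(dir, dir));
    System.out.println("LIST vs store: " + compare(listed, stored));
  }
}
```

With rules like these, the directory entry and the LIST-sourced file status would both come back clean, and two equal version ID strings could never be flagged.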
```
bin/hadoop s3guard fsck -check s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml
2019-09-06 14:33:59,582 [main] INFO s3guard.S3GuardTool (S3GuardTool.java:initMetadataStore(322)) - Metadata store DynamoDBMetadataStore{region=eu-west-1, tableName=hwdev-steve-ireland-new, tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/hwdev-steve-ireland-new} is initialized.
2019-09-06 14:33:59,773 [main] INFO s3guard.S3GuardFsck (S3GuardFsck.java:compareFileStatusToPathMetadata(217)) - Path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml - Length S3: 8260, MS: 8260 - Etag S3: 2887e7740b821abd405e6a5c70d2081e, MS: 2887e7740b821abd405e6a5c70d2081e
2019-09-06 14:33:59,828 [main] INFO s3guard.S3GuardFsck (S3GuardFsck.java:compareFileStatusToPathMetadata(217)) - Path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml - Length S3: 8260, MS: 8260 - Etag S3: 2887e7740b821abd405e6a5c70d2081e, MS: 2887e7740b821abd405e6a5c70d2081e
2019-09-06 14:33:59,856 [main] ERROR s3guard.S3GuardFsckViolationHandler (S3GuardFsckViolationHandler.java:handle(79)) - On path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml getVersionId mismatch - s3: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t, ms: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t
2019-09-06 14:33:59,856 [main] ERROR s3guard.S3GuardFsckViolationHandler (S3GuardFsckViolationHandler.java:handle(79)) - On path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml getVersionId mismatch - s3: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t, ms: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t
2019-09-06 14:33:59,856 [main] INFO s3guard.S3GuardFsck (S3GuardFsck.java:compareS3ToMs(144)) - Total scan time: 0s
2019-09-06 14:33:59,856 [main] INFO s3guard.S3GuardFsck (S3GuardFsck.java:compareS3ToMs(145)) - Scanned entries: 2
```

Side issue: what to do when the path supplied is for a file which has a tombstone in DDB and no file?
Currently it's FNFE

```
bin/hadoop s3guard fsck -check s3a://hwdev-steve-ireland-new/etc/hadoop/httpfs-site.xml._COPYING_
2019-09-06 13:33:14,788 [main] INFO s3guard.S3GuardTool (S3GuardTool.java:initMetadataStore(322)) - Metadata store DynamoDBMetadataStore{region=eu-west-1, tableName=hwdev-steve-ireland-new, tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/hwdev-steve-ireland-new} is initialized.
java.io.FileNotFoundException: No such file or directory: s3a://hwdev-steve-ireland-new/etc/hadoop/httpfs-site.xml._COPYING_
2019-09-06 13:33:14,890 [main] INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 44: java.io.FileNotFoundException: No such file or directory: s3a://hwdev-steve-ireland-new/etc/hadoop/httpfs-site.xml._COPYING_
```

Are we confident that this command will do a check if there is a file in S3 but tombstoned in MS?
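To make the tombstone question concrete, here is a sketch of the cases I think the check has to tell apart. The enum and method names are invented for illustration and are not from the patch. It assumes the S3 probe goes to raw S3, bypassing the metadata store, since otherwise the tombstone hides the object before fsck can look at it. Treating "tombstone and nothing in S3" as consistent is also my assumption.

```java
/** Hypothetical sketch: the enum and method names are illustrative only. */
final class TombstoneCheckSketch {

  /** What the metadata store holds for the path. */
  enum MsState { ABSENT, PRESENT, TOMBSTONE }

  /** What fsck could report for the path. */
  enum Outcome { CONSISTENT, NOT_FOUND, NOT_IN_MS, NOT_IN_S3, TOMBSTONED_BUT_IN_S3 }

  /**
   * Classifies a path from two independent probes.
   * Assumption: existsInRawS3 comes from a probe that does not
   * consult the metadata store.
   */
  static Outcome classify(boolean existsInRawS3, MsState ms) {
    switch (ms) {
      case TOMBSTONE:
        // The case asked about: an object in S3 that the store says was deleted.
        return existsInRawS3 ? Outcome.TOMBSTONED_BUT_IN_S3 : Outcome.CONSISTENT;
      case PRESENT:
        return existsInRawS3 ? Outcome.CONSISTENT : Outcome.NOT_IN_S3;
      default:
        // No entry at all in the store.
        return existsInRawS3 ? Outcome.NOT_IN_MS : Outcome.NOT_FOUND;
    }
  }

  public static void main(String[] args) {
    for (MsState ms : MsState.values()) {
      System.out.println(ms + ": in S3 -> " + classify(true, ms)
          + ", not in S3 -> " + classify(false, ms));
    }
  }
}
```

Under this split, only the "absent everywhere" case would need to surface as a FileNotFoundException; a tombstoned path would get an explicit result either way.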