[jira] [Comment Edited] (HADOOP-16484) S3A to warn or fail if S3Guard is disabled

2019-11-08 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970466#comment-16970466
 ] 

Steve Loughran edited comment on HADOOP-16484 at 11/8/19 5:44 PM:
--

OK, this is good and I am already pleased to see it in my logs.

But I realise we've missed something: in the s3guard tool we explicitly disable 
S3Guard when instantiating the FS, so we get warning messages which are not in 
fact correct.
{code}
2019-11-08 17:38:35,656 [main] DEBUG s3guard.S3Guard 
(S3Guard.java:getMetadataStoreClass(136)) - Metastore option source 
[fs.s3a.bucket.hwdev-steve-ireland-new.metadatastore.impl via [S3AUtils]]
2019-11-08 17:38:35,657 [main] DEBUG s3guard.S3Guard 
(S3Guard.java:getMetadataStore(108)) - Using NullMetadataStore metadata store 
for s3a filesystem
2019-11-08 17:38:35,659 [main] INFO  s3a.S3AFileSystem 
(S3Guard.java:logS3GuardDisabled(849)) - S3Guard is disabled on this bucket: 
hwdev-steve-ireland-new
2019-11-08 17:38:35,659 [main] DEBUG s3a.S3AUtils 
(S3AUtils.java:longOption(1001)) - Value of fs.s3a.multipart.purge.age is 
360
2019-11-08 17:38:35,665 [main] DEBUG s3a.MultipartUtils 
(MultipartUtils.java:requestNextBatch(158)) - [1], Requesting next 5000 uploads 
prefix , next key null, next upload id null
2019-11-08 17:38:35,667 [main] DEBUG s3a.Invoker (DurationInfo.java:<init>(74)) 
- Starting: listMultipartUploads
2019-11-08 17:38:36,004 [main] DEBUG s3a.Invoker (DurationInfo.java:close(89)) 
- listMultipartUploads: duration 0:00.338s
2019-11-08 17:38:36,005 [main] DEBUG s3a.MultipartUtils 
(MultipartUtils.java:requestNextBatch(165)) - New listing state: Upload 
iterator: prefix ; list count 2; isTruncated=false
Total 0 uploads found.
2019-11-08 17:38:36,008 [shutdown-hook-0] DEBUG s3a.S3AFileSystem 
(S3AFileSystem.java:close(3117)) - Filesystem s3a://hwdev-steve-ireland-new is 
closed
{code}


Proposed: just as we force in the null metastore there, we will need to drop 
that "S3Guard is disabled" message to debug level.
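A minimal sketch of that idea in plain Java, with all names hypothetical (the real change would live in the S3Guard logging path, not in a class like this): when the s3guard CLI itself has forced the null metastore, the "disabled" message drops to debug; otherwise the configured level applies.

```java
// Hypothetical sketch only: illustrative names, not the actual Hadoop API.
class S3GuardDisabledMessage {
  // levels mirroring the proposal in this issue
  enum Level { SILENT, DEBUG, INFORM, WARN, FAIL }

  /**
   * @param configured level chosen by the user's configuration
   * @param forcedOffByTool true when the s3guard CLI itself disabled the
   *        metastore while instantiating the filesystem
   */
  static Level effectiveLevel(Level configured, boolean forcedOffByTool) {
    // the tool knowingly turned S3Guard off, so the warning is expected noise
    return forcedOffByTool ? Level.DEBUG : configured;
  }
}
```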

I'm just going to reopen this as a follow-up. 

[~gabor.bota]: do you want to do this or shall I do the code and you do the 
review?


was (Author: ste...@apache.org):
S3A auth mode can cause confusion in deployments, because people expect there 
never to be any HTTP requests to S3 in a path marked as authoritative.

This is *not* the case when S3Guard doesn't have an entry for the path in the 
table, which is the state it is in when the directory was populated using 
different tools (e.g. the AWS s3 command).

Proposed

1. HADOOP-16684 to give more diagnostics about the bucket

2. add an audit command to take a path and verify that it is marked in DynamoDB 
as authoritative *all the way down*

This command is designed to be executed from the commandline and will return 
different error codes based on different situations

* path isn't guarded
* path is not authoritative in s3a settings (dir, path)
* path not known in table: use the 404/44 response
* path contains 1+ dir entry which is non-auth
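The error-code mapping above could be sketched as follows; only 44 comes from the comment's "404/44" convention, and all the other numeric values (and every name) are assumptions for illustration, not what the command would actually use.

```java
// Hypothetical sketch of the proposed audit command's exit codes.
// Only 44 is taken from the "404/44" convention in the comment above.
class S3GuardAuditExit {
  static final int OK = 0;
  static final int UNGUARDED = 46;           // assumed: path isn't guarded
  static final int PATH_NOT_AUTH = 47;       // assumed: not auth in s3a settings
  static final int NOT_FOUND = 44;           // the 404/44 "not in table" case
  static final int NON_AUTH_DIR_ENTRY = 48;  // assumed: 1+ dir entry non-auth

  enum Finding {
    ALL_AUTHORITATIVE, UNGUARDED_PATH, NOT_AUTH_IN_SETTINGS,
    NOT_IN_TABLE, NON_AUTH_DIR
  }

  /** Map an audit finding to the process exit code. */
  static int exitCode(Finding f) {
    switch (f) {
      case UNGUARDED_PATH:       return UNGUARDED;
      case NOT_AUTH_IN_SETTINGS: return PATH_NOT_AUTH;
      case NOT_IN_TABLE:         return NOT_FOUND;
      case NON_AUTH_DIR:         return NON_AUTH_DIR_ENTRY;
      default:                   return OK;
    }
  }
}
```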

3. Use this audit after some of the bulk rename, delete, import, commit (soon: 
upload, copy) operations to verify that, where appropriate, we do update the 
directories; particularly for incremental rename(), where I have long suspected 
we may have to do more.

4. Review documentation and make it clear what is needed (import) after 
uploading/generating data through other tools.

I'm going to pull in the open JIRAs on this topic as they are all related


There shouldn't be anything wrong with using the AWS S3 command to create the 
test table; we just need to tell S3Guard to scan it afterwards, which "s3guard 
import" does. The audit command will make sure that everything is set up in 
DynamoDB before the next stage in the test suite. Then, if we still see IO 
against S3 during list operations, we can start worrying about whether or 
not there is actually a bug in the s3a code. (We could also use it after things 
like DDB and Spark & Hive queries to validate that the output is being tagged 
as auth.)

+add some tests of listLocatedStatus, listFiles, listStatus to verify they 
don't go near S3 on parts they consider authoritative

The core of the audit would examine the path metadata and declare whether it 
should be queued for recursive scanning, throwing an ExitUtil exit exception 
on failure.
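That per-path check could look something like this; a pure-Java sketch with hypothetical names, and a plain IllegalStateException standing in for the ExitUtil exit exception.

```java
// Hypothetical sketch of the audit's per-path decision; all names invented.
class AuthAuditSketch {
  /** Minimal stand-in for the fields the audit would read from path metadata. */
  static class PathMetadata {
    final boolean isDirectory;
    final boolean isAuthoritative;
    PathMetadata(boolean dir, boolean auth) {
      isDirectory = dir;
      isAuthoritative = auth;
    }
  }

  /**
   * Examine the path metadata and declare whether it should be queued for
   * recursive scanning; a non-authoritative directory fails the audit.
   */
  static boolean shouldQueueForScan(PathMetadata md) {
    if (md.isDirectory && !md.isAuthoritative) {
      // stand-in for raising an ExitUtil exit exception with an error code
      throw new IllegalStateException("directory is not authoritative");
    }
    // only directories need a recursive scan; files are leaves
    return md.isDirectory;
  }
}
```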

[jira] [Comment Edited] (HADOOP-16484) S3A to warn or fail if S3Guard is disabled

2019-10-16 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952938#comment-16952938
 ] 

Gabor Bota edited comment on HADOOP-16484 at 10/16/19 3:29 PM:
---

I removed status (merged it with inform), so at the inform level it will log 
with LOG.info


was (Author: gabor.bota):
I removed status (merged with inform) so at inform level it will log with 

> S3A to warn or fail if S3Guard is disabled
> --
>
> Key: HADOOP-16484
> URL: https://issues.apache.org/jira/browse/HADOOP-16484
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> A seemingly recurrent problem with s3guard is "people who think S3Guard is 
> turned on but really it isn't"
> It's not immediately obvious this is the case, and the fact S3Guard is off 
> tends to surface after some intermittent failure has actually been detected.
> Propose: add a configuration parameter which chooses what to do when an S3A 
> FS is instantiated without S3Guard
> * silent : today; do nothing.
> * status: give s3guard on/off status
> * inform: log FS is instantiated without s3guard
> * warn: Warn that data may be at risk in workflows
> * fail
> deployments could then choose which level of reaction they want. I'd make the 
> default "inform" for now; any on-prem object store deployment should switch 
> to silent, and if you really want strictness, fail is the ultimate option
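For the quoted proposal, such a setting might look like this in core-site.xml; the property name is an assumption based on the proposal's wording, not something confirmed in this thread.

```xml
<!-- Hypothetical property name; values follow the levels proposed above. -->
<property>
  <name>fs.s3a.s3guard.disabled.warn.level</name>
  <!-- one of: silent, inform, warn, fail -->
  <value>inform</value>
</property>
```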



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org