steveloughran commented on pull request #3109: URL: https://github.com/apache/hadoop/pull/3109#issuecomment-867108247
Thanks for the details. I agree, these are all unrelated. Some of them we've seen before, and I'd say "you are distant from your S3 bucket / slow network / overloaded laptop". There are a couple of new ones, though, both with hints of security/permissions.

> org.apache.hadoop.tools.contract.AbstractContractDistCpTest#testDistCpWithIterator
> org.junit.runners.model.TestTimedOutException: test timed out after 1800000 milliseconds

Probably a variant of [HADOOP-17628](https://issues.apache.org/jira/browse/HADOOP-17628): we need to make the test directory tree smaller. That would make the test faster for everyone, too. Patches welcome :)

> org.apache.hadoop.fs.contract.AbstractContractUnbufferTest#testUnbufferOnClosedFile
> java.lang.AssertionError: failed to read expected number of bytes from stream. This may be transient
> Expected :1024
> Actual :605

You aren't alone here; it's read() returning fewer bytes than were asked for, so the buffer isn't full. We can't switch to readFully() as the test really wants to call read(). Ignore it. It happens for me when I use many threads in parallel runs. (There's a short sketch of the read() contract at the end of this comment.)

> org.apache.hadoop.fs.contract.s3a.ITestS3AContractUnbuffer
> java.lang.AssertionError: failed to read expected number of bytes from stream. This may be transient
> Expected :1024
> Actual :605

Same transient; ignore.

> org.apache.hadoop.fs.s3a.ITestS3AInconsistency#testGetFileStatus
> java.lang.AssertionError: getFileStatus should fail due to delayed visibility.

Looks like you are seeing https://issues.apache.org/jira/browse/HADOOP-17457. Given S3 is now consistent, I'd fix this by removing the entire test suite :)

```
org.apache.hadoop.fs.s3a.tools.ITestMarkerTool
java.nio.file.AccessDeniedException: : listObjects: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied;
```

This is new. Can you file a JIRA with the stack trace, just so we have a history of it? MarkerTool should just be trying to call listObjects under a path in the test dir (see the listObjects sketch below).

```
org.apache.hadoop.fs.s3a.auth.delegation.ITestDelegatedMRJob
java.nio.file.AccessDeniedException: s3a://osm-pds/planet/planet-latest.orc#_partition.lst: getFileStatus on s3a://osm-pds/planet/planet-latest.orc#_partition.lst: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: A1Y4D90WW452Q8A9; S3 Extended Request ID: b/IV48OeMEgTaxikC9raP+IiHVPve3rIeoVkCymMc5opNp/70Iyc0tY2WZ0zpixFl0w7WT3bBCQ=; Proxy: null), S3 Extended Request ID: b/IV48OeMEgTaxikC9raP+IiHVPve3rIeoVkCymMc5opNp/70Iyc0tY2WZ0zpixFl0w7WT3bBCQ=:403 Forbidden
```

This is *very* new, which makes it interesting: if you are seeing it, it may surface in the wild. I suspect it's because you've got an IAM permission set up that blocks access to this (public) dataset; the standalone probe at the end of this comment can help confirm that. Can you file a JIRA with this too? I'll probably give you some tasks to find out more about the cause, but at least there'll be an indexed reference to the issue.
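Since the short-read behaviour trips people up regularly, here is a minimal sketch (mine, not the Hadoop test source) of the `java.io.InputStream` contract involved: a single `read()` only promises at least one byte, so filling a buffer needs a loop, which is what `readFully()` does for you, and which the unbuffer test deliberately avoids because it is exercising `read()` itself.

```java
import java.io.IOException;
import java.io.InputStream;

public final class ReadContract {
  // read() may return fewer bytes than requested; looping until the buffer
  // is full (or EOF hits) is essentially what readFully() does. The unbuffer
  // test issues a single read(), so getting 605 of 1024 bytes is legal.
  static int readAsMuchAsPossible(InputStream in, byte[] buf) throws IOException {
    int total = 0;
    while (total < buf.length) {
      int n = in.read(buf, total, buf.length - total);
      if (n < 0) {
        break; // EOF before the buffer filled
      }
      total += n;
    }
    return total;
  }
}
```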
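For the MarkerTool 403, a hedged sketch of the kind of AWS SDK v1 list call the tool issues; the bucket name and prefix here are placeholders, not the tool's actual values. If your credentials lack `s3:ListBucket` on the test path, this call fails in exactly the way the stack trace shows.

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ListObjectsRequest;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.S3ObjectSummary;

public final class ListUnderTestDir {
  public static void main(String[] args) {
    AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
    // Placeholder bucket/prefix: substitute your own test bucket and path.
    ListObjectsRequest req = new ListObjectsRequest()
        .withBucketName("my-test-bucket")
        .withPrefix("test/marker-tool/");
    // A 403 here surfaces as the AccessDeniedException quoted above.
    ObjectListing listing = s3.listObjects(req);
    for (S3ObjectSummary summary : listing.getObjectSummaries()) {
      System.out.println(summary.getKey());
    }
  }
}
```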
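And for the ITestDelegatedMRJob failure, a standalone probe (again a sketch of mine, using only the public Hadoop FileSystem API) to check whether your credentials can read the public osm-pds object outside the delegation-token codepath. If this also returns 403, the block is coming from your IAM policy rather than from the test.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class OsmPdsProbe {
  public static void main(String[] args) throws Exception {
    // Probe the same public object the test reads; a 403 here means the
    // credentials/IAM policy block it independently of delegation tokens.
    Path p = new Path("s3a://osm-pds/planet/planet-latest.orc");
    Configuration conf = new Configuration(); // picks up your core-site.xml
    try (FileSystem fs = p.getFileSystem(conf)) {
      FileStatus st = fs.getFileStatus(p);
      System.out.println("Readable, length=" + st.getLen());
    }
  }
}
```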