[ https://issues.apache.org/jira/browse/HADOOP-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644973#comment-16644973 ]
Gabor Bota edited comment on HADOOP-15819 at 10/10/18 2:19 PM: --------------------------------------------------------------- I have some bad news: I get these issues for tests where {{FS_S3A_IMPL_DISABLE_CACHE}} set to true, so the caching _should be_ disabled. What I did is I modified the {{org.apache.hadoop.fs.s3a.AbstractS3ATestBase#teardown}} to {code:java} @Override public void teardown() throws Exception { super.teardown(); boolean fsCacheDisabled = getConfiguration() .getBoolean(FS_S3A_IMPL_DISABLE_CACHE, false); if(fsCacheDisabled){ describe("closing file system"); LOG.warn("Closing fs. FS_S3A_IMPL_DISABLE_CACHE: " + fsCacheDisabled); IOUtils.closeStream(getFileSystem()); } } {code} And there were still issues after this. was (Author: gabor.bota): I have some bad news: I get these issues for tests where {{FS_S3A_IMPL_DISABLE_CACHE}} set to true, so the caching _should be_ disabled. > S3A integration test failures: FileSystem is closed! - without parallel test > run > -------------------------------------------------------------------------------- > > Key: HADOOP-15819 > URL: https://issues.apache.org/jira/browse/HADOOP-15819 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 > Affects Versions: 3.1.1 > Reporter: Gabor Bota > Assignee: Gabor Bota > Priority: Critical > Attachments: S3ACloseEnforcedFileSystem.java, > closed_fs_closers_example_5klines.log.zip > > > Running the integration tests for hadoop-aws {{mvn -Dscale verify}} against > Amazon AWS S3 (eu-west-1, us-west-1, with no s3guard) we see a lot of these > failures: > {noformat} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.408 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob) > Time elapsed: 0.027 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 4.345 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob > [ERROR] > testStagingDirectory(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.021 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.022 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.489 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest) > Time elapsed: 0.023 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.695 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob > [ERROR] testMRJob(org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob) > Time elapsed: 0.039 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.015 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory > [ERROR] > testEverything(org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory) > Time elapsed: 0.014 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > {noformat} > The big issue is that the tests are running in a serial manner - no test is > running on top of the other - so we should not see that the tests are failing > like this. The issue could be in how we handle > org.apache.hadoop.fs.FileSystem#CACHE - the tests should use the same > S3AFileSystem so if A test uses a FileSystem and closes it in teardown then B > test will get the same FileSystem object from the cache and try to use it, > but it is closed. > We see this a lot in our downstream testing too. It's not possible to tell > that the failed regression test result is an implementation issue in the > runtime code or a test implementation problem. > I've checked when and what closes the S3AFileSystem with a sightly modified > version of S3AFileSystem which logs the closers of the fs if an error should > occur. I'll attach this modified java file for reference. See the next > example of the result when it's running: > {noformat} > 2018-10-04 00:52:25,596 [Thread-4201] ERROR s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:checkIfClosed(74)) - Use after close(): > java.lang.RuntimeException: Using closed FS!. > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.checkIfClosed(S3ACloseEnforcedFileSystem.java:73) > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.mkdirs(S3ACloseEnforcedFileSystem.java:474) > at > org.apache.hadoop.fs.contract.AbstractFSContractTestBase.mkdirs(AbstractFSContractTestBase.java:338) > at > org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:193) > at > org.apache.hadoop.fs.s3a.ITestS3AClosedFS.setup(ITestS3AClosedFS.java:40) > at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > 2018-10-04 00:52:25,596 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:logClosers(61)) - FS was closed 6 times by: > 2018-10-04 00:52:25,596 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:logClosers(64)) - ----------------- close() > #1 --------------- > 2018-10-04 00:52:25,596 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:logClosers(65)) - > java.lang.RuntimeException: close() called here > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.close(S3ACloseEnforcedFileSystem.java:83) > at org.apache.hadoop.io.IOUtils.cleanupWithLogger(IOUtils.java:280) > at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:298) > at > org.apache.hadoop.fs.s3a.AbstractS3ATestBase.teardown(AbstractS3ATestBase.java:55) > at > org.apache.hadoop.fs.s3a.commit.AbstractCommitITest.teardown(AbstractCommitITest.java:206) > at > org.apache.hadoop.fs.s3a.auth.ITestAssumedRoleCommitOperations.teardown(ITestAssumedRoleCommitOperations.java:90) > at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > 2018-10-04 00:52:25,596 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:logClosers(64)) - ----------------- close() > #2 --------------- > 2018-10-04 00:52:25,596 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:logClosers(65)) - > java.lang.RuntimeException: close() called here > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.close(S3ACloseEnforcedFileSystem.java:83) > at org.apache.hadoop.io.IOUtils.cleanupWithLogger(IOUtils.java:280) > at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:298) > at > org.apache.hadoop.fs.s3a.AbstractS3ATestBase.teardown(AbstractS3ATestBase.java:55) > at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > 2018-10-04 00:52:25,597 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:logClosers(64)) - ----------------- close() > #3 --------------- > 2018-10-04 00:52:25,598 [Thread-4201] WARN s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:logClosers(65)) - > java.lang.RuntimeException: close() called here > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.close(S3ACloseEnforcedFileSystem.java:83) > at org.apache.hadoop.io.IOUtils.cleanupWithLogger(IOUtils.java:280) > at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:298) > at > org.apache.hadoop.fs.s3a.AbstractS3ATestBase.teardown(AbstractS3ATestBase.java:55) > at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > (...) > {noformat} > From the logs it seems that the closers are the teardown methods of the > abstract test classes, so the same FileSystem object gets reused between > tests. > To run a test with the modified fs the following should be in the config: > {noformat} > <property> > <name>fs.s3a.impl</name> > <value>org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem</value> > <description>The implementation class of the S3A Filesystem</description> > </property> > {noformat} > I file this jira to solve this issue, and start a conversation if needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org