[
https://issues.apache.org/jira/browse/HADOOP-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324214#comment-14324214
]
Steve Loughran commented on HADOOP-11584:
-----------------------------------------
Tested patch -003 and all hadoop-aws tests against S3 EU multiple times;
all passed.
One test failed *once*; assuming a transient network glitch, as it never
re-occurred.
{code}
Running org.apache.hadoop.fs.s3a.TestS3AFileSystemContract
Tests run: 43, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 356.841 sec <<< FAILURE! - in org.apache.hadoop.fs.s3a.TestS3AFileSystemContract
testRenameRootDirForbidden(org.apache.hadoop.fs.s3a.TestS3AFileSystemContract)  Time elapsed: 2.492 sec  <<< FAILURE!
junit.framework.AssertionFailedError: Source exists expected:<true> but was:<false>
	at junit.framework.Assert.fail(Assert.java:57)
	at junit.framework.Assert.failNotEquals(Assert.java:329)
	at junit.framework.Assert.assertEquals(Assert.java:78)
	at junit.framework.Assert.assertEquals(Assert.java:174)
	at junit.framework.TestCase.assertEquals(TestCase.java:333)
	at org.apache.hadoop.fs.FileSystemContractBaseTest.rename(FileSystemContractBaseTest.java:490)
	at org.apache.hadoop.fs.FileSystemContractBaseTest.testRenameRootDirForbidden(FileSystemContractBaseTest.java:598)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at junit.framework.TestCase.runTest(TestCase.java:176)
	at junit.framework.TestCase.runBare(TestCase.java:141)
	at junit.framework.TestResult$1.protect(TestResult.java:122)
	at junit.framework.TestResult.runProtected(TestResult.java:142)
	at junit.framework.TestResult.run(TestResult.java:125)
	at junit.framework.TestCase.run(TestCase.java:129)
	at junit.framework.TestSuite.runTest(TestSuite.java:255)
	at junit.framework.TestSuite.run(TestSuite.java:250)
	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
{code}
That assert comes from a {{FileSystem.exists(path)}} call, which delegates to
{{getFileStatus(path)}}, so it exercises the codepath this patch changes. But
{{exists()}} should only return false if {{getFileStatus(path)}} raises a 404,
and the failure never re-occurred. Hence the assumption: *transient*.
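For context, the {{exists()}}/{{getFileStatus()}} relationship is a simple try/catch. The standalone sketch below (a simplified illustration of the pattern, not the Hadoop source itself; the {{transient404}} flag is a hypothetical stand-in for the store's behavior) shows why a transient 404 from the object store surfaces as {{exists() == false}}:

```java
import java.io.FileNotFoundException;

public class ExistsSketch {
    // Stand-in for S3AFileSystem.getFileStatus(): a 404 from the object
    // store (transient or real) surfaces as FileNotFoundException.
    static void getFileStatus(String path, boolean transient404)
            throws FileNotFoundException {
        if (transient404) {
            throw new FileNotFoundException("404 on " + path);
        }
        // otherwise: status found, return normally
    }

    // The FileSystem.exists() pattern: true unless getFileStatus()
    // raises FileNotFoundException.
    static boolean exists(String path, boolean transient404) {
        try {
            getFileStatus(path, transient404);
            return true;
        } catch (FileNotFoundException e) {
            return false;
        }
    }
}
```

So a single spurious 404 during the rename test is enough to flip the "Source exists" assertion, with no error ever reaching the test itself.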
> s3a file block size set to 0 in getFileStatus
> ---------------------------------------------
>
> Key: HADOOP-11584
> URL: https://issues.apache.org/jira/browse/HADOOP-11584
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.6.0
> Reporter: Dan Hecht
> Assignee: Brahma Reddy Battula
> Priority: Blocker
> Attachments: HADOOP-10584-003.patch, HADOOP-111584.patch,
> HADOOP-11584-002.patch
>
>
> The consequence is that MapReduce is probably not splitting s3a files in the
> expected way. This is similar to HADOOP-5861 (which was for s3n, though s3n
> was passing 5G rather than 0 for the block size).
> FileInputFormat.getSplits() relies on the FileStatus block size being set:
> {code}
>     if (isSplitable(job, path)) {
>       long blockSize = file.getBlockSize();
>       long splitSize = computeSplitSize(blockSize, minSize, maxSize);
> {code}
> However, S3AFileSystem does not set the FileStatus block size field. From
> S3AFileStatus.java:
> {code}
>   // Files
>   public S3AFileStatus(long length, long modification_time, Path path) {
>     super(length, false, 1, 0, modification_time, path);
>     isEmptyDirectory = false;
>   }
> {code}
> I think it should use S3AFileSystem.getDefaultBlockSize() for each file's
> block size (where it's currently passing 0).
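To see why a zero block size matters, {{FileInputFormat.computeSplitSize()}} clamps the block size into the {{[minSize, maxSize]}} range via {{Math.max(minSize, Math.min(maxSize, blockSize))}}. A standalone sketch of that formula (the default values in the comments are assumptions for illustration, not read from a running cluster):

```java
public class SplitSizeSketch {
    // Same clamping formula as FileInputFormat.computeSplitSize():
    // splitSize = max(minSize, min(maxSize, blockSize)).
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }
}
```

With a blockSize of 0 and the usual defaults (minSize 1, maxSize Long.MAX_VALUE), the split size collapses to minSize, i.e. pathologically tiny splits, whereas a sane block size (say 32 MB) is used directly. Passing {{getDefaultBlockSize()}} through the S3AFileStatus constructor restores the expected behavior.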
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)