[ https://issues.apache.org/jira/browse/HADOOP-9230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557502#comment-13557502 ]
Karthik Kambatla commented on HADOOP-9230:
------------------------------------------

checkAgainstLegacy() compares the generated splits against those produced by a legacy split generation. I don't quite understand the purpose of this check. Can anyone who knows this better shed some light on why we need the test?

I noticed a conflict in the math between the UniformSizeInputFormat split generation and the legacy generation:

Current:
{code}
long nBytesPerSplit = (long) Math.ceil(totalSizeBytes * 1.0 / numSplits);
{code}

Legacy:
{code}
final long targetsize = totalFileSize / numSplits;
{code}

I would expect the math discrepancy to lead to more than a 10% failure rate, though.

> TestUniformSizeInputFormat fails intermittently
> -----------------------------------------------
>
>                 Key: HADOOP-9230
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9230
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 2.0.2-alpha
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>              Labels: distcp
>
> TestUniformSizeInputFormat fails intermittently. I ran the test 50 times and noticed 5 failures.
> I haven't noticed any particular pattern in which runs fail.
> A sample stack trace is as follows:
> {noformat}
> java.lang.AssertionError: expected:<1944> but was:<1820>
> 	at org.junit.Assert.fail(Assert.java:91)
> 	at org.junit.Assert.failNotEquals(Assert.java:645)
> 	at org.junit.Assert.assertEquals(Assert.java:126)
> 	at org.junit.Assert.assertEquals(Assert.java:470)
> 	at org.junit.Assert.assertEquals(Assert.java:454)
> 	at org.apache.hadoop.tools.mapred.TestUniformSizeInputFormat.checkAgainstLegacy(TestUniformSizeInputFormat.java:244)
> 	at org.apache.hadoop.tools.mapred.TestUniformSizeInputFormat.testGetSplits(TestUniformSizeInputFormat.java:126)
> 	at org.apache.hadoop.tools.mapred.TestUniformSizeInputFormat.testGetSplits(TestUniformSizeInputFormat.java:252)
> {noformat}
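
For readers following the thread, here is a minimal sketch of the rounding difference between the "Current" and "Legacy" expressions quoted in the comment. The class name, method names, and the input values are hypothetical and chosen only for illustration; they are not taken from the Hadoop sources or from the failing run. The sketch also assumes the legacy operands are integral types, so that the division truncates.

```java
public class SplitSizeMath {
    // Mirrors the "Current" expression: floating-point division, rounded up.
    static long currentBytesPerSplit(long totalSizeBytes, int numSplits) {
        return (long) Math.ceil(totalSizeBytes * 1.0 / numSplits);
    }

    // Mirrors the "Legacy" expression, assuming integral operands:
    // integer division, which truncates toward zero.
    static long legacyTargetSize(long totalFileSize, int numSplits) {
        return totalFileSize / numSplits;
    }

    public static void main(String[] args) {
        long total = 10; // hypothetical total size in bytes
        int splits = 3;  // hypothetical number of splits

        // The two values differ whenever total is not a multiple of splits.
        System.out.println("current = " + currentBytesPerSplit(total, splits));
        System.out.println("legacy  = " + legacyTargetSize(total, splits));
    }
}
```

With these inputs the two per-split sizes differ by one byte, whereas they agree when the total is an exact multiple of the split count. Whether this difference alone explains the intermittent assertion failure is not established by the sketch.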