[ https://issues.apache.org/jira/browse/HBASE-9759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803261#comment-13803261 ]
stack commented on HBASE-9759:
------------------------------

You going to commit [~enis]?

> IntegrationTestBulkLoad random number collision
> -----------------------------------------------
>
>                 Key: HBASE-9759
>                 URL: https://issues.apache.org/jira/browse/HBASE-9759
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 0.98.0, 0.96.1
>
>         Attachments: hbase-9759_v1.patch
>
>
> ITBL failed recently in our test harness. Inspecting the failure made me
> believe that the only way that particular failure could have happened is
> a collision in the random longs generated by the test.
> The test creates 50 mappers by default, and each mapper writes 500K random
> rows starting with row = 0. By default there are 5 iterations.
> The check job outputs these counters:
> {code}
> 2013-10-13 07:48:01,134 Map input records=124999751
> 2013-10-13 07:48:01,134 Map output records=124999999
> {code}
> The number of input records seems fine because
> {code}
> 124999751 = 1 + 5 * (0.5M - 1) * 50
> {code}
> 5 = num iterations, 0.5M = num rows, 50 = num mappers, and 1 is for row = 0,
> which every chain writes to.
> Output records should be 125M; however, we see one cell missing. Since the map
> input record count matches the expected number of distinct rows, I suspect
> that row = 0 had a collision.
> In one of the generate jobs, we can see that the reducer output count does
> not match the reducer input count. Given that we are using KVSortReducer,
> this confirms that there is a collision in the KeyValues received by this task.
> {code}
> 2013-10-13 06:48:12,738 Reduce input records=75000000
> 2013-10-13 06:48:12,738 Reduce output records=74999997
> {code}
> The count is off by 3 because we are writing 3 columns per row.
> My only theory for explaining this is that we had a collision in chainIds, or
> one of the chains reused row = 0 as the next row.
> This is similar to HBASE-8700; however, in this case the probability is much,
> much, much lower.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
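
For anyone who wants to re-check the counter arithmetic in the description, a minimal sketch follows. It assumes only the defaults quoted above (50 mappers, 500K rows per mapper, 5 iterations, 3 columns per row). The constant and function names are made up for illustration and are not taken from the IntegrationTestBulkLoad source.

```python
# Defaults as quoted in the issue description. They are assumptions here,
# not values read from the HBase test code.
NUM_MAPPERS = 50
ROWS_PER_MAPPER = 500_000
NUM_ITERATIONS = 5
COLUMNS_PER_ROW = 3


def expected_distinct_rows():
    """Distinct rows the check job should read as map input.

    Every chain writes row = 0, so that row is counted once; all other
    rows are assumed to be distinct random longs.
    """
    return 1 + NUM_ITERATIONS * (ROWS_PER_MAPPER - 1) * NUM_MAPPERS


def expected_output_records():
    """Records the check job should emit if nothing was overwritten."""
    return NUM_ITERATIONS * ROWS_PER_MAPPER * NUM_MAPPERS


def rows_lost_in_reducer(reduce_input, reduce_output):
    """Convert a KeyValue deficit in a generate job into whole rows."""
    deficit = reduce_input - reduce_output
    if deficit % COLUMNS_PER_ROW != 0:
        raise ValueError("deficit is not a whole number of rows")
    return deficit // COLUMNS_PER_ROW


if __name__ == "__main__":
    # Counter values copied from the logs in the description.
    print("expected map input records: ", expected_distinct_rows())
    print("expected map output records:", expected_output_records())
    print("missing output records:     ", expected_output_records() - 124_999_999)
    print("rows lost in the reducer:   ", rows_lost_in_reducer(75_000_000, 74_999_997))
```

Under these assumptions the map input count matches the reported 124999751, the check job is one record short of 125M, and the reducer deficit of 3 KeyValues corresponds to exactly one row, which is consistent with a single collision.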