[ https://issues.apache.org/jira/browse/ACCUMULO-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708538#comment-14708538 ]
Josh Elser commented on ACCUMULO-3967:
--------------------------------------

Seems like the issue is as follows:

For a table whose tablets sort directly after one another (e.g. 1 2 3 4 5), the use of {{Range.followingPrefix}} incorrectly causes the original extent to be converted into the next extent, because the previous endRow gets converted into the same value as the endRow. This then causes {{findOverlappingTablets}} to return the next extent instead of the same one.

{noformat}
@@ -617,10 +628,16 @@ public class BulkImporter {
   public static List<TabletLocation> findOverlappingTablets(ClientContext context, VolumeManager fs, TabletLocator locator, Path file, KeyExtent failed) throws Exception {
     locator.invalidateCache(failed);
     Text start = failed.getPrevEndRow();
-    if (start != null)
-      start = Range.followingPrefix(start);
+    if (start != null) {
+      start = new Text(start);
+      // The first possible value after the previous tablet
+      start.append(new byte[] {0}, 0, 1);
+    }
+    log.info("For '" + failed + "': start='" + start + "', end='" + failed.getEndRow() + "'");
     return findOverlappingTablets(context, fs, locator, file, start, failed.getEndRow());
   }
{noformat}

It looks like this is a subtle bug, present in all versions, that depends on the tablet distribution for a table.

> bulk import loses records when loading pre-split table
> ------------------------------------------------------
>
>                 Key: ACCUMULO-3967
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3967
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client, tserver
>    Affects Versions: 1.4.5, 1.5.3, 1.6.0, 1.6.1, 1.6.2, 1.6.3, 1.7.0
>         Environment: generic hadoop 2.6.0, zookeeper 3.4.6 on redhat 6.7
> 7 node cluster
>            Reporter: Edward Seidl
>            Priority: Blocker
>             Fix For: 1.6.4, 1.7.1, 1.8.0, 1.5.4
>
>
> I just noticed that some records I'm loading via importDirectory go missing.
> After a lot of digging around trying to reproduce the problem, I discovered
> that it occurs most frequently when loading a table that I have just recently
> added splits to. In the tserver logs I'll see messages like
> 20 16:25:36,805 [client.BulkImporter] INFO : Could not assign 1 map files to
> tablet 1xw;18;17 because : Not Serving Tablet . Will retry ...
>
> or
> 20 16:25:44,826 [tserver.TabletServer] INFO : files
> [hdfs://xxxx:54310/accumulo/tables/1xw/b-00jnmxe/I00jnmxq.rf] not imported to
> 1xw;03;02: tablet 1xw;03;02 is closed
> These appear after messages about unloading tablets. It seems that tablets
> are being redistributed at the same time as the bulk import is occurring.
> Steps to reproduce
> 1) I run a mapreduce job that produces random data in rfiles
> 2) copy the rfiles to an import directory
> 3) create table or deleterows -f
> 4) addsplits
> 5) importdirectory
> I have also performed the above completely within the mapreduce job, with
> similar results. The difference with the mapreduce job is that the time
> between adding splits and the import directory is minutes rather than seconds.
> My current test creates 1000000 records, and after the importdirectory
> returns, a count of rows will be anywhere from ~800000 to 1000000.
> With my original workflow, I found that re-importing the same set of rfiles
> three times would eventually get all rows loaded.
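
The start-row problem described in the comment above can be illustrated with a minimal, self-contained sketch. It does not use the Accumulo or Hadoop classes: followingPrefix below is a hypothetical stand-in that mimics, on raw bytes, what Range.followingPrefix is documented to do, and appendZero mimics the patched start.append(new byte[] {0}, 0, 1). The class name, helper names, and the sample row "1a" are assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class StartRowSketch {

  // Hypothetical stand-in for Range.followingPrefix: drop trailing 0xff bytes,
  // then increment the last remaining byte. Returns null if every byte is 0xff.
  static byte[] followingPrefix(byte[] prefix) {
    int last = prefix.length - 1;
    while (last >= 0 && (prefix[last] & 0xff) == 0xff) {
      last--;
    }
    if (last < 0) {
      return null;
    }
    byte[] out = Arrays.copyOf(prefix, last + 1);
    out[last]++;
    return out;
  }

  // Mimics the patched code: the smallest row that sorts strictly after `row`.
  static byte[] appendZero(byte[] row) {
    return Arrays.copyOf(row, row.length + 1); // the new trailing byte is 0
  }

  // Unsigned lexicographic comparison of byte arrays (assumed here to match
  // how row values are ordered).
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) {
        return d;
      }
    }
    return a.length - b.length;
  }

  static byte[] bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Failed extent (1, 2] in a table split at 1 2 3 4 5.
    byte[] prevEndRow = bytes("1");
    byte[] endRow = bytes("2");
    byte[] row = bytes("1a"); // a sample row that belongs to the extent (1, 2]

    byte[] oldStart = followingPrefix(prevEndRow); // behavior before the patch
    byte[] newStart = appendZero(prevEndRow);      // behavior after the patch

    System.out.println("old start equals endRow: " + Arrays.equals(oldStart, endRow));
    System.out.println("row 1a covered by old start: " + (compare(row, oldStart) >= 0));
    System.out.println("row 1a covered by new start: " + (compare(row, newStart) >= 0));
  }
}
```

With the old start, the lookup range for the extent (1, 2] begins at the extent's own endRow, so a row such as "1a" sorts before it. With the zero byte appended, the start is the first possible value after the previous tablet, and "1a" falls inside the range.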