[ 
https://issues.apache.org/jira/browse/ACCUMULO-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708538#comment-14708538
 ] 

Josh Elser commented on ACCUMULO-3967:
--------------------------------------

Seems like the issue is as follows:

For a table whose tablets sort directly after one another (e.g. 1 2 3 4 5), the 
use of {{Range.followingPrefix}} incorrectly converts the original extent into 
the next extent: the previous endRow gets advanced to the same value as the 
endRow, which then causes {{findOverlappingTablets}} to return the next extent 
instead of the original one.

{noformat}
@@ -617,10 +628,16 @@ public class BulkImporter {

   public static List<TabletLocation> findOverlappingTablets(ClientContext context, VolumeManager fs, TabletLocator locator, Path file, KeyExtent failed)
       throws Exception {
     locator.invalidateCache(failed);
     Text start = failed.getPrevEndRow();
-    if (start != null)
-      start = Range.followingPrefix(start);
+    if (start != null) {
+      start = new Text(start);
+      // The first possible value after the previous tablet
+      start.append(new byte[] {0}, 0, 1);
+    }
+    log.info("For '" + failed + "': start='" + start + "', end='" + failed.getEndRow() + "'");
     return findOverlappingTablets(context, fs, locator, file, start, failed.getEndRow());
   }
{noformat}
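
To illustrate the off-by-one, here is a minimal standalone sketch (the class name and the example rows "1" and "2" are made up, not taken from the patch): for a failed extent with prevEndRow "1" and endRow "2", {{Range.followingPrefix}} advances the start row all the way to "2", while appending a zero byte only advances it to the first possible row after "1".

{noformat}
import org.apache.accumulo.core.data.Range;
import org.apache.hadoop.io.Text;

// Hypothetical demo class, not part of the patch
public class FollowingPrefixDemo {
  public static void main(String[] args) {
    // Pretend the failed extent is 1xw;2;1 : prevEndRow = "1", endRow = "2"
    Text prevEndRow = new Text("1");
    Text endRow = new Text("2");

    // Old behavior: followingPrefix increments the last byte, turning "1" into "2",
    // i.e. the start row now equals the extent's endRow, so the lookup begins in
    // the *next* tablet.
    Text oldStart = Range.followingPrefix(new Text(prevEndRow));
    System.out.println("equals endRow? " + oldStart.equals(endRow)); // true

    // Patched behavior: append a zero byte, giving "1\x00", the first possible row
    // strictly after prevEndRow, which still sorts inside the failed extent (1, 2].
    Text newStart = new Text(prevEndRow);
    newStart.append(new byte[] {0}, 0, 1);
    System.out.println("before endRow? " + (newStart.compareTo(endRow) < 0)); // true
  }
}
{noformat}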

It looks like this is a subtle bug, dependent on a table's tablet distribution, 
that has been present in all affected versions.

> bulk import loses records when loading pre-split table
> ------------------------------------------------------
>
>                 Key: ACCUMULO-3967
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3967
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client, tserver
>    Affects Versions: 1.4.5, 1.5.3, 1.6.0, 1.6.1, 1.6.2, 1.6.3, 1.7.0
>         Environment: generic hadoop 2.6.0, zookeeper 3.4.6 on redhat 6.7
> 7 node cluster
>            Reporter: Edward Seidl
>            Priority: Blocker
>             Fix For: 1.6.4, 1.7.1, 1.8.0, 1.5.4
>
>
> I just noticed that some records I'm loading via importDirectory go missing.  
> After a lot of digging around trying to reproduce the problem, I discovered 
> that it occurs most frequently when loading a table that I have just recently 
> added splits to.  In the tserver logs I'll see messages like 
> 20 16:25:36,805 [client.BulkImporter] INFO : Could not assign 1 map files to 
> tablet 1xw;18;17 because : Not Serving Tablet .  Will retry ...
>  
> or
> 20 16:25:44,826 [tserver.TabletServer] INFO : files 
> [hdfs://xxxx:54310/accumulo/tables/1xw/b-00jnmxe/I00jnmxq.rf] not imported to 
> 1xw;03;02: tablet 1xw;03;02 is closed
> these appear after messages about unloading tablets...it seems that tablets 
> are being redistributed at the same time as the bulk import is occurring.
> Steps to reproduce
> 1) I run a mapreduce job that produces random data in rfiles
> 2) copy the rfiles to an import directory
> 3) create table or deleterows -f
> 4) addsplits
> 5) importdirectory
> I have also performed the above completely within the mapreduce job, with 
> similar results.  The difference with the mapreduce job is that the time 
> between adding splits and the import directory is minutes rather than seconds.
> my current test creates 1000000 records, and after the importdirectory returns, a count of rows will be anywhere from ~800000 to 1000000.
> With my original workflow, I found that re-importing the same set of rfiles 
> three times would eventually get all rows loaded.
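
For reference, steps 3–5 of the reproduction quoted above roughly correspond to the following Accumulo client calls (a sketch only; the instance name, credentials, table name, paths, and split points are illustrative, not from the report):

{noformat}
import java.util.TreeSet;

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.hadoop.io.Text;

// Rough equivalent of "createtable / addsplits / importdirectory" from the shell.
// All names below (instance, user, table, paths, splits) are placeholders.
public class BulkImportRepro {
  public static void main(String[] args) throws Exception {
    Connector conn = new ZooKeeperInstance("instance", "zkhost:2181")
        .getConnector("user", new PasswordToken("pass"));

    String table = "bulktest";
    if (!conn.tableOperations().exists(table))
      conn.tableOperations().create(table);

    // Pre-split the table right before importing; this is the window in which
    // tablets are being reassigned while the bulk import runs.
    TreeSet<Text> splits = new TreeSet<Text>();
    for (int i = 1; i < 100; i++)
      splits.add(new Text(String.format("%02d", i)));
    conn.tableOperations().addSplits(table, splits);

    // Bulk import the previously generated rfiles; failures land in the fail dir.
    conn.tableOperations().importDirectory(table, "/tmp/rfiles", "/tmp/fail", false);
  }
}
{noformat}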



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
