dlmarion commented on code in PR #3168:
URL: https://github.com/apache/accumulo/pull/3168#discussion_r1085247845


##########
core/src/main/java/org/apache/accumulo/core/client/impl/TabletLocatorImpl.java:
##########
@@ -330,8 +348,16 @@ else if (!lookupFailed)
         tabletLocations.add(tl);
       }
 
-      for (TabletLocation tl2 : tabletLocations) {
-        TabletLocatorImpl.addRange(binnedRanges, tl2.tablet_location, 
tl2.tablet_extent, range);
+      // Ensure the extents found are non overlapping and have no holes. When 
reading some extents
+      // from the cache and other from the metadata table in the loop above we 
may end up with
+      // non-contiguous extents. This can happen when a subset of exents are 
placed in the cache and
+      // then after that merges and splits happen.
+      if (isContiguous(tabletLocations)) {

Review Comment:
   I'm wondering if this check should be moved to the end of loop `l1`. 
Multiple disjoint ranges could be sent into this method and I'm not sure that's 
taken into account here. I'm assuming that you want to check that you have a 
contiguous list of TabletLocations for each Range.



##########
core/src/test/java/org/apache/accumulo/core/client/impl/TabletLocatorImplTest.java:
##########
@@ -1115,6 +1116,82 @@ public void testBinRanges5() throws Exception {
         nrl(nr("0", "11"), nr("1", "2"), nr("0", "4"), nr("2", "4")));
   }
 
+  @Test
+  public void testBinRangesNonContiguousExtents() throws Exception {
+
+    // This test exercises a bug that was seen in the tablet locator code.
+
+    KeyExtent e1 = nke("foo", "05", null);
+    KeyExtent e2 = nke("foo", "1", "05");
+    KeyExtent e3 = nke("foo", "2", "05");
+
+    Text tableName = new Text("foo");
+
+    TServers tservers = new TServers();
+    TabletLocatorImpl metaCache =
+        createLocators(tservers, "tserver1", "tserver2", "foo", e1, "l1", e2, 
"l1");
+
+    List<Range> ranges = nrl(nr("01", "07"));
+    Map<String,Map<KeyExtent,List<Range>>> expected =
+        createExpectedBinnings("l1", nol(e1, nrl(nr("01", "07")), e2, 
nrl(nr("01", "07"))));
+
+    // The following will result in extents e1 and e2 being placed in the 
cache.
+    runTest(tableName, ranges, metaCache, expected, nrl());
+
+    // Add e3 to the metadata table. Extent e3 could not be added earlier in 
the test because it
+    // overlaps e2. If e2 and e3 are seen in the same metadata read then one 
will be removed from
+    // the cache because the cache can never contain overlapping extents.
+    setLocation(tservers, "tserver2", MTE, e3, "l1");
+
+    // The following test reproduces a bug. Extents e1 and e2 are in the 
cache. Extent e3 overlaps
+    // e2 but is not in the cache. The range used by the test overlaps 
e1,e2,and e3. The bug was
+    // that for this situation the binRanges code in tablet locator used to 
return e1,e2,and e3. The
+    // desired behavior is that the range fails for this situation. This 
tablet locator bug caused
+    // the batch scanner to return duplicate data.
+    ranges = nrl(nr("01", "17"));
+    runTest(tableName, ranges, metaCache, new HashMap<>(), nrl(nr("01", 
"17")));
+
+    // After the above test fails it should cause e3 to be added to the cache. 
Because e3 overlaps
+    // e2, when e3 is added then e2 is removed. Therefore, the following 
binRanges call should
+    // succeed and find the range overlaps e1 and e3.
+    expected = createExpectedBinnings("l1", nol(e1, nrl(nr("01", "17")), e3, 
nrl(nr("01", "17"))));
+    runTest(tableName, ranges, metaCache, expected, nrl());
+  }
+

Review Comment:
   I think we need a test that uses multiple non-contiguous ranges.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to