keith-turner commented on code in PR #3221:
URL: https://github.com/apache/accumulo/pull/3221#discussion_r1123922228


##########
server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java:
##########
@@ -1814,7 +1824,7 @@ List<FileRef> findChopFiles(KeyExtent extent, 
Map<FileRef,Pair<Key,Key>> firstAn
   public synchronized boolean needsSplit() {
     if (isClosing() || isClosed())
       return false;
-    return findSplitRow(getDatafileManager().getFiles()) != null;
+    return isSplitPossible();

Review Comment:
   >  I'm wondering if the call to FileUtil.findMidPoint that is currently in 
DatafileManager.findSplitRow should be moved to DataFileManager in it's own 
method.
   
   I experimented with making some changes in the spirit of this, but with a 
slightly different implementation than described.  Those changes are in [this 
branch](https://github.com/keith-turner/accumulo/tree/FineAndDandy-3217) which 
has [this 
commit](https://github.com/keith-turner/accumulo/commit/5b19850a218ea5149eec4102affa22bd9ca5b23a).
  Was trying to achieve the following.
   
    * Only computes split information when tablets files have changed
    * Does all split related computations on files outside of the tablet lock.  
So hopefully needsSplit should not block scans.
    * Caching computation does not prevent splits from happening when user 
changes config. Like if a tablet was not a candidate for split and then a user 
lowers the split threshold making it a candidate, then the tablet should split.
   
   I quickly made these changes, so I need to take a second look at them.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to