keith-turner commented on code in PR #3221:
URL: https://github.com/apache/accumulo/pull/3221#discussion_r1123922228
##########
server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java:
##########
@@ -1814,7 +1824,7 @@ List<FileRef> findChopFiles(KeyExtent extent,
Map<FileRef,Pair<Key,Key>> firstAn
public synchronized boolean needsSplit() {
if (isClosing() || isClosed())
return false;
- return findSplitRow(getDatafileManager().getFiles()) != null;
+ return isSplitPossible();
Review Comment:
> I'm wondering if the call to FileUtil.findMidPoint that is currently in
DatafileManager.findSplitRow should be moved to DataFileManager in it's own
method.
I experimented with making some changes in the spirit of this, but with a
slightly different implementation than described. Those changes are in [this
branch](https://github.com/keith-turner/accumulo/tree/FineAndDandy-3217) which
has [this
commit](https://github.com/keith-turner/accumulo/commit/5b19850a218ea5149eec4102affa22bd9ca5b23a).
Was trying to achieve the following.
* Only computes split information when tablets files have changed
* Does all split related computations on files outside of the tablet lock.
So hopefully needsSplit should not block scans.
* Caching computation does not prevent splits from happening when user
changes config. Like if a tablet was not a candidate for split and then a user
lowers the split threshold making it a candidate, then the tablet should split.
I quickly made these changes, so I need to take a second look at them.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]