[ https://issues.apache.org/jira/browse/HBASE-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284843#comment-13284843 ]
nkeywal commented on HBASE-6109: -------------------------------- @stack bq. Is this a generic locker? Should it be named for what its locking? Renamed to LockerByString. If you have a better name... bq. NotifiableConcurrentSkipListMap needs class comment. It seems like its for use in a very particular circumstance. It needs explaining. done. bq. Does it need to be public? Only used in master package? Perhaps make it package private then? The issue was: {noformat} public NotifiableConcurrentSkipListMap<String, RegionState> getRegionsInTransition() { return regionsInTransition; } {noformat} But it's used in tests only, so I can actually make both package protected. Done. bq. internalList is a bad name for the internal delegate instance. Is 'delegatee' a better name than internalList? done. bq. We checked rit contains a name but then in a separate statement we do the waitForListUpdate? What if the region we are looking for is removed between the check and the waitForListUpdate invocation? Actually yes, it could happen. I added a timeout, so we will now check every 100ms. bq. Will this log be annoying? Removed. I added them while debugging. This one was already there however. I kept it. {noformat} public void removeClosedRegion(HRegionInfo hri) { if (regionsToReopen.remove(hri.getEncodedName()) != null) { LOG.debug("Removed region from reopening regions because it was closed"); } } {noformat} bq. Is this true / How is it enforced? Oops, it not enforced (I don't know I could do it), but it's also not true: the update will set it as well. But it's not an issue as it's an atomic long. Comment updated. It's btw tempting to: - change the implementation of updateTimestampToNow to use a lazySet - get the timestamp only once before looping on the region set. I didn't do it in my patch, but I think it should be done. bq. needs space after curly parens. Sometimes you do it and sometimes you don't. Done > @ted bq. It would be nice to have a test for NotifiableConcurrentSkipListMap. Will do for final release. bq. Since internalList is actually a Map, name the above method waitForUpdate() ? Done. bq. the above should read 'A utility class to manage a set of locks. Each lock is identified by a String which serves' Done bq. It should be Locker.class Done bq. The constant should be named NB_CONCURRENT_LOCKS. Done bq.The last word should be locked. Done bq. It would be nice to add more about reason. Done. bq. Looking at batchRemove() of http://www.docjar.com/html/api/java/util/ArrayList.java.html around line 669, I don't see synchronization. Meaning, existence check of elements from nodes in regionsInTransition.keySet() may not be deterministic. After looking at the java api code, I don't think there is an issue here. The set we're using is documented as: "The view's iterator is a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.". So we won't have any java error. Then, if an element is added/removed to/from the RIT while we're doing the removeAll, it may be added/removed or not, but we're not less deterministic that we would be by adding a lock around the removeAll: the add/remove could be as well be done just before/after we take the lock, and we would not know it. I'm currently checking how it works with split, then I will update it to the current trunk. > Improve RIT performances during assignment on large clusters > ------------------------------------------------------------ > > Key: HBASE-6109 > URL: https://issues.apache.org/jira/browse/HBASE-6109 > Project: HBase > Issue Type: Improvement > Components: master > Affects Versions: 0.96.0 > Reporter: nkeywal > Assignee: nkeywal > Priority: Minor > Attachments: 6109.v7.patch > > > The main points in this patch are: > - lowering the number of copy of the RIT list > - lowering the number of synchronization > - synchronizing on a region rather than on everything > It also contains: > - some fixes around the RIT notification: the list was sometimes modified > without a corresponding 'notify'. > - some tests flakiness correction, actually unrelated to this patch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira