[
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507621#comment-13507621
]
stack commented on HBASE-7247:
------------------------------
bq. If one region server is opening a lot of regions, we just need one handler
to tickle the opening.
This would be a significant change for I'm-not-sure-what-benefit. The zk
transactions are region level/scoped and their handling is done at this level
in open/close exec handlers. Making it so the RS does the tracking and updates
state for the master to read regarding CLOSING/OPENING would alter a bunch of code.
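For concreteness, the "one handler per region server" idea quoted above might look roughly like the sketch below. This is not HBase code: SharedTickler, register/unregister and the tickle callback are hypothetical names, and the callback only stands in for whatever would actually update the region znode.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch: one background task tickles every region that is still opening. */
final class SharedTickler implements AutoCloseable {
    // Names of regions whose open is still in progress.
    private final Set<String> opening = ConcurrentHashMap.newKeySet();
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor();
    // Stand-in for the znode update; a real one would talk to ZooKeeper.
    private final Consumer<String> tickle;

    SharedTickler(Consumer<String> tickle, long periodMs) {
        this.tickle = tickle;
        timer.scheduleAtFixedRate(this::tickleAll, periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    /** Called by an open handler when it starts; the handler never waits on the update. */
    void register(String region) { opening.add(region); }

    /** Called when the open finishes or fails, so the region is no longer tickled. */
    void unregister(String region) { opening.remove(region); }

    /** One pass over all in-flight opens; also callable directly, e.g. from a test. */
    void tickleAll() {
        for (String region : opening) {
            try {
                tickle.accept(region);
            } catch (RuntimeException e) {
                // Swallow so one failed update does not cancel the periodic task.
            }
        }
    }

    @Override
    public void close() { timer.shutdownNow(); }
}
```

Note this only moves the update off the open handler threads. It is still one update per opening region per pass, since the transitions are region scoped, so it does not by itself reduce the number of zk writes.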
I'm all for a reexamination of base operations. Stuff is this way because we
would have issues where an open would stall for whatever reason... Master would
intercept the open, take over the region and give it to someone else to open.
More often than not, we'd just fail again for same reason on the new location
but the odd time the new re-attempt would succeed.
We could give up on the reattempt and just let everything hinge on whether a
region server has a zk lease or not and let ServerShutdownHandler do it all.
It'd be a pretty radical difference. It would simplify the code but, in an odd
case, it might mean we'd fail to recover a region (I don't have stats on this).
bq. We used to have the 'ownership' issue as Stack mentioned. Now, I think we
are fine since AM should have a consistent view of region states.
This is a similar type of leap-in-the-dark (smile). I love the notion that AM
is now rock solid. It may be, given the work expended (it certainly is a
million times better), but I'd like us to run w/ the new AM a while in a few
production deployments before making this ruling.
Again, if we could undo the OPENING/CLOSING, etc., stuff would be
cleaner/simpler (logs would be way less noisy).
> Assignment performances decreased by 50% because of
> regionserver.OpenRegionHandler#tickleOpening
> ------------------------------------------------------------------------------------------------
>
> Key: HBASE-7247
> URL: https://issues.apache.org/jira/browse/HBASE-7247
> Project: HBase
> Issue Type: Improvement
> Components: master, Region Assignment, regionserver
> Affects Versions: 0.96.0
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Critical
>
> The regionserver.OpenRegionHandler#tickleOpening updates the region znode as
> "Do this so master doesn't timeout this region-in-transition.".
> However, on the usual test, this makes the assignment time of 1500 regions
> go from 70s to 100s, that is, we're 50% slower because of this.
> More generally, ZooKeeper commits every data update to disk, and this takes
> time. Using it to provide a keep-alive seems overkill. At the very least, it
> could be made asynchronous.
> I'm not sure how necessary these updates are (I need to go deeper into
> the internals, feedback welcome), but it seems very important to optimize
> this... The trivial fix would be to make this optional.