[
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098232#comment-14098232
]
Virag Kothari commented on HBASE-11165:
---------------------------------------
bq. Interesting observation! I believe it's radically different from previous
benchmark
They are not comparable.
In the previous setup, we never ran with forceSync=no on the bigger cluster (300
nodes). We only ran with that config on a smaller cluster with 23 RS. With 1M
regions there, each RS had more regions to open, which by itself takes more time
and increases the overall startup time. Also, with forceSync=no the master was
not lagging behind by more than a few minutes after the region servers had
opened all regions. We had observed that opening 3.3M regions on 300 nodes takes
only ~35 minutes on the RS side, but it seems that forceSync=yes was causing the
master to receive late notifications from ZK. So I think that if we had set
forceSync=no during the previous benchmark, it would have taken a bit more than
35 minutes for 3.3M regions, which is consistent with the current result of 10
minutes for 1M.
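As a rough sanity check of that consistency claim, the two figures can be
compared under the assumption that assignment time grows linearly with region
count on comparable hardware. This is only a sketch: the linear model and the
helper name scaled_minutes are assumptions for illustration, not something
measured in either benchmark.

```python
def scaled_minutes(minutes, regions, target_regions):
    """Scale a measured time to another region count.

    Assumption: time is proportional to the number of regions (linear model).
    """
    return minutes * target_regions / regions


# Current benchmark (10 min for 1M regions) projected to 3.3M regions.
projected_large = scaled_minutes(10, 1_000_000, 3_300_000)

# Previous RS-side observation (~35 min for 3.3M regions) projected to 1M.
projected_small = scaled_minutes(35, 3_300_000, 1_000_000)

print(f"10 min @ 1M   -> {projected_large:.1f} min @ 3.3M")
print(f"35 min @ 3.3M -> {projected_small:.1f} min @ 1M")
```

Under this model the two results land within a few minutes of each other, which
is the sense in which they are described as consistent.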
Also, during the previous benchmark we were measuring the entire clean startup
time. In the current benchmark, we only measure the bulk assignment time. The
entire startup time also included the cost of scanning META (3 times).
I will check whether there were any significant differences between the hardware
config of the ZK machines in the two setups.
> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>
> Key: HBASE-11165
> URL: https://issues.apache.org/jira/browse/HBASE-11165
> Project: HBase
> Issue Type: Brainstorming
> Reporter: stack
> Attachments: HBASE-11165.zip, Region Scalability test.pdf,
> zk_less_assignment_comparison.pdf
>
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569"
> and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M
> regions maybe even 50M later. This issue is about discussing how we will do
> that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.