[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111626#comment-15111626 ]

stack commented on HBASE-11165:

[~toffer] Sweet

> Scaling so cluster can host 1M regions and beyond (50M regions?)
>
> Key: HBASE-11165
> URL: https://issues.apache.org/jira/browse/HBASE-11165
> Project: HBase
> Issue Type: Brainstorming
> Reporter: stack
> Attachments: HBASE-11165.zip, Region Scalability test.pdf, ScalableMeta.pdf, zk_less_assignment_comparison_2.pdf
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe even 50M later. This issue is about discussing how we will do that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111617#comment-15111617 ]

Francis Liu commented on HBASE-11165:

Just an update: we've been running a production cluster which has 16 meta regions and around 260k regions for the last 2 months or so. We'll start getting the supporting changes in zkless (if any), HBASE-11290. Then the big patch, though we will try to break it up into smaller pieces if possible.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711736#comment-14711736 ]

Elliott Clark commented on HBASE-11165:

Thanks [~stack] some good notes there.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344512#comment-14344512 ]

Lars Hofhansl commented on HBASE-11165:

Thanks. Sorry for being a pain in the a**.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344491#comment-14344491 ]

stack commented on HBASE-11165:

bq. Just saying we should do what solves the problem with least amount of effort.

Agree. Let me do a better job writing up what we've all spewed across tens of JIRAs and in offline emails and go from there. Thanks [~lhofhansl]
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344384#comment-14344384 ]

Lars Hofhansl commented on HBASE-11165:

Just saying we should do what solves the problem with least amount of effort.

I think Flurry had abnormally small regions (1g), and that's why they have so many (with 20g regions they'd have 5.8pb :))

If having many regions and fixing all the issues from that is easiest we should do that. If the other ways are easier we should do those. There have been other ideas floating around (have regions share a memstore, groups of regions for assignment, etc).

I'm not against this. Exploring what new problems we're exchanging for the old problems:
# splittable META
# scalable assignment manager
# handle many memstores
# multi-master
# potential NN scaling issues

If those are easier to solve than those in the previous comment, we know what we should do :)
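As a rough check on the arithmetic in the comment above, the sketch below multiplies a region count by a region size. The 300,000-region figure is the Flurry number reported elsewhere in this thread; the class and method names are hypothetical, and binary units (1 PB = 1024 * 1024 GB) are an assumption.

```java
/** Back-of-the-envelope sizing; names are illustrative, not HBase APIs. */
class RegionSizing {
    // Assumes binary units: 1 PB = 1024 TB = 1024 * 1024 GB.
    private static final double GB_PER_PB = 1024.0 * 1024.0;

    /** Total data held by `regions` regions of `regionSizeGb` GB each, in GB. */
    public static long totalGb(long regions, long regionSizeGb) {
        return regions * regionSizeGb;
    }

    /** Converts a size in GB to PB under the binary-unit assumption. */
    public static double toPb(long gb) {
        return gb / GB_PER_PB;
    }

    public static void main(String[] args) {
        long regions = 300_000L; // region count reported in this thread
        for (long sizeGb : new long[] {1, 20}) {
            long gb = totalGb(regions, sizeGb);
            System.out.printf("%d regions x %d GB = %d GB (~%.2f PB)%n",
                    regions, sizeGb, gb, toPb(gb));
        }
    }
}
```

At 20 GB per region the same region count works out to roughly 5.7 PB in binary units, in the same ballpark as the 5.8pb quoted above.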
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344250#comment-14344250 ]

stack commented on HBASE-11165:

In your list, IMO, 1. is reason enough to do small regions. 4. Smaller regions mean less of the keyspace is offline when balancing; also more even balance is possible when regions are smaller.

bq. Can we only solve these three with many small regions?

If we just did small regions, without introducing anything new (other than doing what we already do 'better'/'faster'), we could improve on your list without need to add custom compaction policy(-ies) and the recording of interstices at 100MB intervals in metadata (which we'd have to teach clients to read), etc.

Regards a problem statement, you want one on why we should tend down toward small rather than continue our current trajectory of larger and larger regions, or do you want a problem statement for the subject of this JIRA? Regards this JIRA, we have users who are headed toward 1M now (Flurry reported being at 300k, afraid to go up from there, and Francis has 'larger' clusters) so we have to deal.

You thinking we should explore going up from 10/20G toward 100G or 1TB? (With stripe compactions++ and means of apportioning out the 1TB region, etc., to address the 1-4 list above?).
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344011#comment-14344011 ]

Mikhail Antonov commented on HBASE-11165:

I also remember some time ago it was discovered that we're keeping too many versions of cells in meta, and the patch making it configurable was committed. So I suppose [~toffer] or others had a chance to play with it and see if their problem was alleviated and to what extent? Curious to know the results.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343882#comment-14343882 ]

Lars Hofhansl commented on HBASE-11165:

Lemme step back... The fundamental conflict is that for assignment we want to assign in large enough chunks (i.e. a region size), but for other parts of the system smaller chunks would be better (compactions, input splits for M/R, etc.). Right?

The unit of failure is always going to be a region server; whether we have 2000 1GB regions or 20 100GB regions makes no difference from that angle. Assuming a server with 16TB of disk space or so, the granularity of assignment is also not at issue as long as we keep in that ballpark (i.e. 1GB - 1TB or so).

So what are the exact problems with large regions:
# compactions and write amplification
# input split calculation for M/R
# log replay upon recovery (is that an issue, i.e. is it worse replaying 1 large log compared to replaying 100 small ones?)
# (more?)

Can we *only* solve these three with many small regions? (or do stripe compactions, simple width stats for M/R, etc)

I'm trying to get from a statement about an implementation (that might just shift complexity from one part of HBase to another) to an exact problem statement.
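To put numbers on the granularity argument above, this sketch counts how many regions fit on one server for a few region sizes in the 1GB to 1TB range. It assumes the 16TB of disk mentioned above is all available for region data (ignoring HDFS replication and other overhead); the names are hypothetical.

```java
/** Illustrative only: how many regions one server would hold at a given region size. */
class AssignmentGranularity {
    /** Whole regions of regionSizeGb GB that fit in diskGb GB of disk. */
    public static long regionsPerServer(long diskGb, long regionSizeGb) {
        if (regionSizeGb <= 0) {
            throw new IllegalArgumentException("region size must be positive");
        }
        return diskGb / regionSizeGb; // integer division drops any partial region
    }

    public static void main(String[] args) {
        long diskGb = 16L * 1024; // 16 TB per server, binary units (assumption)
        for (long sizeGb : new long[] {1, 10, 100, 1024}) {
            System.out.printf("%4d GB regions -> %d regions per server%n",
                    sizeGb, regionsPerServer(diskGb, sizeGb));
        }
    }
}
```

Under these assumptions the per-server count ranges from 16 regions at 1TB to 16384 at 1GB; the region server stays the unit of failure either way, and only the number of units to assign changes.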
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343705#comment-14343705 ]

Andrew Purtell commented on HBASE-11165:

See above on this issue where [~toffer] wanted to split meta. Since his shop is running 0.98 I offered to maintain a branch of 0.98 that includes that change in ASF Git for their use, figuring that positive results there would inform on when/if we should do that on all branches. Not sure what the current status of this discussion is. No patch yet, for 0.98 or any other branch.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343691#comment-14343691 ]

stack commented on HBASE-11165:

[~octo47] IMO, yeah. Tried getting consensus on that a while back and seemed to have it but didn't nail it down... Will be back. Doc. above needs work too.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342903#comment-14342903 ]

Andrey Stepachev commented on HBASE-11165:

[~stack] so, we are going splitmeta?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342847#comment-14342847 ]

stack commented on HBASE-11165:

Some notes on having a scalable meta here: https://docs.google.com/document/d/1eCuqf7i2dkWHL0PxcE1HE1nLRQ_tCyXI4JsOB6TAk60/edit?usp=sharing Let me attach a PDF version.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149859#comment-14149859 ]

Andrew Purtell commented on HBASE-11165:

Might also be useful to track the ratio when bringing a new table online and scaling it up from 0 to a few billion rows.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149823#comment-14149823 ]

stack commented on HBASE-11165:

[~eclark] asked last night if stripe compactions could help w/ the compaction problem in meta? Regards the 50k reads/1 write, what is the ratio across a restart do you think?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149778#comment-14149778 ]

Elliott Clark commented on HBASE-11165:

Had some discussions last night and people were asking about read vs write numbers for meta. So currently on a stable cluster meta is getting a read to write request ratio of 50,000 reads for every one write request. To me that really suggests that adding read replicas is the next thing that we should look at before splitting meta or master. I think that could get us quite a way.
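A toy model of why the ratio above points at read replicas: if reads could be spread evenly over the replicas of meta while every write still has to reach each replica, per-replica load is dominated by reads and falls almost linearly with the replica count. The even spread, the write fan-out, and all names below are simplifying assumptions for illustration, not a description of how HBase region replicas actually behave.

```java
/** Toy load model for a read-heavy table served by several replicas. */
class MetaLoadModel {
    /**
     * Requests handled by each replica, assuming reads are split evenly
     * and every write is applied on every replica.
     */
    public static double perReplicaLoad(long reads, long writes, int replicas) {
        if (replicas <= 0) {
            throw new IllegalArgumentException("need at least one replica");
        }
        return (double) reads / replicas + writes;
    }

    /** Share of all requests that are writes. */
    public static double writeFraction(long reads, long writes) {
        return (double) writes / (reads + writes);
    }

    public static void main(String[] args) {
        long reads = 50_000L; // ratio quoted in the comment above
        long writes = 1L;
        System.out.printf("writes are %.4f%% of requests%n",
                100.0 * writeFraction(reads, writes));
        for (int replicas : new int[] {1, 2, 3, 5}) {
            System.out.printf("%d replica(s): %.0f requests per replica%n",
                    replicas, perReplicaLoad(reads, writes, replicas));
        }
    }
}
```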
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139101#comment-14139101 ]

Andrey Stepachev commented on HBASE-11165:

[~virag] I agree, 1. is enough for most cases. In HBASE-12016 I made the number of versions configurable.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139093#comment-14139093 ]

Virag Kothari commented on HBASE-11165:

bq. 1. Meta should keep 1 version (with atomic row updates we definitely need no more than one actual version), even better make that configurable.
bq. 2. Meta HLogs should be archived for debugging

For 1., making it configurable sounds good. I don't feel a strong need for 2., and again we would need to have a separate config for cleanup of archived meta WALs.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138802#comment-14138802 ]

Andrey Stepachev commented on HBASE-11165:

[~stack] thanks for comments. I added jira for that: https://issues.apache.org/jira/browse/HBASE-12016
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138478#comment-14138478 ]

stack commented on HBASE-11165:

bq. ...do we have significant objections on that?

Not from me. Even just keeping three versions would be an improvement. A compact representation in master will help (Let me update the attached google doc and fix my misstatements).
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137789#comment-14137789 ]

Mikhail Antonov commented on HBASE-11165:

Thanks for the explanation [~virag]. Yeah, as the comment suggests, it seems there was no special reason to have it set to 10? Making it configurable seems a trivial change?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137783#comment-14137783 ]

Andrey Stepachev commented on HBASE-11165:

Looks like we can do two things to solve that on up to the 1.0 version of hbase (without significant changes):
1. Meta should keep 1 version (with atomic row updates we definitely need no more than one actual version), even better make that configurable.
2. Meta HLogs should be archived for debugging

That will reduce scan overhead (far fewer KVs to scan), reduce memory footprint, and reduce load times for very big metas. [~virag], [~apurtell], do we have significant objections on that?
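The scan and memory argument above can be made concrete with a worst-case cell count: regions times cells per meta row times retained versions. The four cells per row is an illustrative assumption, as are the names; real rows and update rates will differ, and a row only accumulates the maximum number of versions if it has been updated that many times.

```java
/** Worst-case estimate of KeyValues retained in meta; illustrative names and numbers. */
class MetaVersionOverhead {
    /** Upper bound on retained cells if every cell has accumulated maxVersions versions. */
    public static long worstCaseCells(long regions, int cellsPerRow, int maxVersions) {
        return regions * cellsPerRow * maxVersions;
    }

    public static void main(String[] args) {
        long regions = 1_000_000L; // the 1M-region target of this issue
        int cellsPerRow = 4;       // assumption, not measured
        for (int versions : new int[] {1, 3, 10}) {
            System.out.printf("max versions %2d -> up to %d cells%n",
                    versions, worstCaseCells(regions, cellsPerRow, versions));
        }
    }
}
```

In this model the bound grows linearly with the version count, so going from 10 retained versions to 1 cuts the worst case by a factor of ten.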
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137754#comment-14137754 ]

Mikhail Antonov commented on HBASE-11165:

bq.Would you mind filing a new issue for that Mikhail Antonov and say more about this on that tangent?

Thanks for the interest [~apurtell], I sure will file it, but I'd just like to do that in a day or maybe couple of days, when we have run more experiments and also put together actual code to show. Once filed, I'll make sure to link it to this jira.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137715#comment-14137715 ]

Andrey Stepachev commented on HBASE-11165:

[~virag] thanks. :) didn't see that comment somehow.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137662#comment-14137662 ] Virag Kothari commented on HBASE-11165: ---
On why 10 versions are needed: looking at the code for META_TABLEDESC in HTableDescriptor
{code}
// Ten is arbitrary number. Keep versions to help debugging.
.setMaxVersions(10)
{code}
Maybe we need to revisit this?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137632#comment-14137632 ] Andrey Stepachev commented on HBASE-11165: -- [~virag], do you have some thoughts on why 10 versions are needed? That seems like a large overhead (ten times more memory).
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137613#comment-14137613 ] Virag Kothari commented on HBASE-11165: ---
bq. fully compacted meta with single-versioned cells 1 row in meta takes up 7-10k
The 7GB for 1M regions is with 10 versions of meta. Meta has 10 versions by default and we didn't change that for our experiments. But I see the confusion: the attached pdf says 10 versions while the shared google doc says one version. Sorry about that. Also, 7GB was the size of the store file on hdfs. The table was simply created using HexStringSplit with nothing extra. Also this was on 0.98 (I think master code adds some stuff about region replicas in meta) with zk-less assignment.
bq. I'd appreciate a lot if you could share a sample representative row from your meta, so we can see the typical size of elements in it?
On prod, we currently run 0.94. I can check if we can share some sample row from there.
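The figures in the comment above can be turned into a quick back-of-envelope check. The sketch below is only illustrative: it assumes "7GB" means 7 GiB, "1M" means exactly 1,000,000 regions with one meta row each, and that every cell actually holds all 10 versions, none of which the thread confirms. The class and method names are made up for this example.

```java
/** Back-of-envelope sizing for the meta store-file figures quoted in the thread. */
final class MetaSizeEstimate {
    // Assumption: "7GB" is 7 GiB and "1M" is 1,000,000 regions (one meta row per region).
    static final long STORE_FILE_BYTES = 7L * 1024 * 1024 * 1024;
    static final long REGIONS = 1_000_000L;
    // Default number of versions kept by meta, per the comment above.
    static final int MAX_VERSIONS = 10;

    /** Average store-file bytes per meta row. */
    static long bytesPerRow(long storeFileBytes, long regions) {
        return storeFileBytes / regions;
    }

    /** Average bytes per row per version, assuming every cell holds all versions. */
    static long bytesPerRowVersion(long storeFileBytes, long regions, int versions) {
        return bytesPerRow(storeFileBytes, regions) / versions;
    }

    public static void main(String[] args) {
        System.out.println("bytes/row: "
            + bytesPerRow(STORE_FILE_BYTES, REGIONS));
        System.out.println("bytes/row/version: "
            + bytesPerRowVersion(STORE_FILE_BYTES, REGIONS, MAX_VERSIONS));
    }
}
```

Under those assumptions the per-version figure lands well under 1 KB per row, which would be in line with the single-`Result` measurement reported elsewhere in this thread; store-file encoding or compression could shift the number either way.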
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137450#comment-14137450 ] Andrew Purtell commented on HBASE-11165: bq. And the list of Results is not exactly super-compacted structure, if initial experiments we were able to compact it further quite a bit. Would you mind filing a new issue for that [~mantonov] and say more about this on that tangent? > Scaling so cluster can host 1M regions and beyond (50M regions?) > > > Key: HBASE-11165 > URL: https://issues.apache.org/jira/browse/HBASE-11165 > Project: HBase > Issue Type: Brainstorming >Reporter: stack > Attachments: HBASE-11165.zip, Region Scalability test.pdf, > zk_less_assignment_comparison_2.pdf > > > This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" > and comments on the doc posted there. > A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M > regions maybe even 50M later. This issue is about discussing how we will do > that (or if not 50M on a cluster, how otherwise we can attain same end). > More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136860#comment-14136860 ] Mikhail Antonov commented on HBASE-11165: - [~toffer], [~virag] - guys, getting back for a moment to the size of the meta table: there's an interesting thing about it. [~octo47], [~sergey.soldatov] and myself are doing some prototyping of a more compact & fast in-memory representation of meta, and have noticed an interesting thing while inspecting the data structures under the microscope (with tools like https://code.google.com/p/memory-measurer). In the discussion above, and in the doc put together by [~stack], it's mentioned that in the best case, for fully compacted meta with single-versioned cells, 1 row in meta takes up 7-10kb (please correct me if I'm wrong). If I run a minicluster test, create some simple table with regions and then do MTA.fullScan() to get a list of Result, a single Result won't take more than 1Kb, normally less than that. And the list of Results is not exactly a super-compacted structure; in initial experiments we were able to compact it further quite a bit. So I'm curious how exactly the size of heap occupied by meta was calculated: was it some sort of direct sizeOf (like using Unsafe or instrumentation etc), or the default impl provided by HeapSize for hregion, or the size of HFiles, or something else? Also, I'd appreciate a lot if you could share a sample representative row from your meta, so we can see the typical size of elements in it?
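One way to sanity-check the "less than 1Kb per Result" observation above is to add up serialized cell sizes by hand. The sketch below assumes the classic KeyValue layout without tags (4-byte key length, 4-byte value length, 2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte key type, plus the row, family, qualifier and value bytes). The 60-byte row key and the value lengths are hypothetical, chosen only for illustration, and the class is not part of HBase. It estimates serialized size only; on-heap size is larger because of per-object overhead.

```java
/** Rough serialized size of HBase cells in the classic KeyValue layout (no tags). */
final class KeyValueSizeSketch {
    // Fixed bytes per KeyValue: key length (4) + value length (4) + row length (2)
    // + family length (1) + timestamp (8) + key type (1).
    static final int FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1;

    /** Serialized size of one cell; the row and family bytes repeat in every cell. */
    static int cellSize(int rowLen, int familyLen, int qualifierLen, int valueLen) {
        return FIXED_OVERHEAD + rowLen + familyLen + qualifierLen + valueLen;
    }

    /** Sum over the cells of one row; each entry is {qualifierLen, valueLen}. */
    static int rowSize(int rowLen, int familyLen, int[][] cells) {
        int total = 0;
        for (int[] cell : cells) {
            total += cellSize(rowLen, familyLen, cell[0], cell[1]);
        }
        return total;
    }

    /** Hypothetical single-version meta row: 60-byte row key, family "info". */
    static int exampleMetaRowSize() {
        int[][] cells = {
            {"regioninfo".length(), 120},       // assumed serialized region info size
            {"server".length(), 24},            // assumed "host:port" string length
            {"serverstartcode".length(), 8},    // a long
            {"seqnumDuringOpen".length(), 8},   // a long
        };
        return rowSize(60, "info".length(), cells);
    }

    public static void main(String[] args) {
        System.out.println("estimated bytes for one meta row: " + exampleMetaRowSize());
    }
}
```

With these made-up lengths the row comes out at a few hundred bytes, so a multi-kilobyte per-row figure would more plausibly come from multiple versions or from how the size was measured than from a single version of the cells themselves.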
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126581#comment-14126581 ] Konstantin Boudnik commented on HBASE-11165: I actually was trying to play along, Andrew - I guess my Russian's toughness got into the way of this comment being accepted as a joke ;) > Scaling so cluster can host 1M regions and beyond (50M regions?) > > > Key: HBASE-11165 > URL: https://issues.apache.org/jira/browse/HBASE-11165 > Project: HBase > Issue Type: Brainstorming >Reporter: stack > Attachments: HBASE-11165.zip, Region Scalability test.pdf, > zk_less_assignment_comparison_2.pdf > > > This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" > and comments on the doc posted there. > A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M > regions maybe even 50M later. This issue is about discussing how we will do > that (or if not 50M on a cluster, how otherwise we can attain same end). > More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126564#comment-14126564 ] Andrew Purtell commented on HBASE-11165: I hope the tongue-in-cheek nature of that post would have been evident. But it is true that consensus is hard. The concern I have is this discussion not downplay issues and solutions with scaling that people have today in lieu of grander designs for tomorrow's versions. (Both are important IMHO). Looks like that concern is not an issue given Alex's comment above. Great. > Scaling so cluster can host 1M regions and beyond (50M regions?) > > > Key: HBASE-11165 > URL: https://issues.apache.org/jira/browse/HBASE-11165 > Project: HBase > Issue Type: Brainstorming >Reporter: stack > Attachments: HBASE-11165.zip, Region Scalability test.pdf, > zk_less_assignment_comparison_2.pdf > > > This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" > and comments on the doc posted there. > A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M > regions maybe even 50M later. This issue is about discussing how we will do > that (or if not 50M on a cluster, how otherwise we can attain same end). > More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126556#comment-14126556 ] Konstantin Boudnik commented on HBASE-11165: Honestly speaking, the latest fashion (not to say obsession ;) with TLA+ doesn't help anything. It isn't a formal proof of code correctness, nor does it help to reconcile architecture contracts, as Alex has pointed out. So while someone can produce a fancy TLA+ diagram, I am not sure how many others will even pay attention to it. It's like UML - a good way to share the responsibility for a bad decision ;) We posted the spec in that HDFS JIRA because it was easier to do that than to keep up the endless argument about nothingness. Will the same be required here, I wonder?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126549#comment-14126549 ] Alex Newman commented on HBASE-11165: - We also support splitting meta as being important for scalability. In fact we'd love to help. I think some of the work required for splitting meta could be aided by some of the work we have done in regards to rdsm. I assume that there is a significant master change required to make it work. Perhaps we can join forces in a way that won't get in your way. > Scaling so cluster can host 1M regions and beyond (50M regions?) > > > Key: HBASE-11165 > URL: https://issues.apache.org/jira/browse/HBASE-11165 > Project: HBase > Issue Type: Brainstorming >Reporter: stack > Attachments: HBASE-11165.zip, Region Scalability test.pdf, > zk_less_assignment_comparison_2.pdf > > > This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" > and comments on the doc posted there. > A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M > regions maybe even 50M later. This issue is about discussing how we will do > that (or if not 50M on a cluster, how otherwise we can attain same end). > More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126542#comment-14126542 ] Francis Liu commented on HBASE-11165: -
{quote} I'm concerned about this notion of a split meta colocated with a split master function also. Split meta on its own is one thing. Also splitting the master as described smacks of the dynamic subtree partitioning of Ceph's MDS, which has never been stable at scale as far as I know, and there's no stability in sight there either. {quote}
From what I understand, "splitting meta" supersedes "collocating the meta", as having both together would be counterintuitive. With splitting meta, I think it is understood well enough that code plus documentation should be enough?
{quote} There's also the issue that Francis and crew want to run a version of HBase in production today. I don't see the multi-master alternative as viable before 2.0, or perhaps 1.1 I suppose, but not 1.0.x. Split meta is conceivably something that could be backported like ZK-less assignment* was, although no doubt risky and would need to be carefully done, hidden behind a default-off toggle. {quote}
For our use case we don't have data yet to motivate sharded master or HA master. For now it seems splitting meta will address our short-term scaling use case.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126072#comment-14126072 ] Alex Newman commented on HBASE-11165: - We have a ton of people in house who understand TLA+. The problem is TLA+ doesn't verify anything about the code (or a prototype). I am glad that Steve was happy that we wrote a TLA+ specification, but in my humble opinion no one could give us any feedback on what we proposed. Frankly, from the lack of push back on the TLA+ we wrote up, it is apparent to me that no one understood what we wrote. That being said, if the community needs TLA+ to feel confident about the approach we are taking, we can do it. Just remember that TLA+:
- Doesn't verify anything about the code or implementation
- Doesn't understand anything about your architecture
- Won't tell you if your code isn't modeled by TLA+
It only verifies the theoretical behavior of concurrent systems, not the systems themselves.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123679#comment-14123679 ] Andrew Purtell commented on HBASE-11165: With apologies to [~ste...@apache.org], I'm going to channel my inner Steve Loughran and suggest that a split meta + split master + active-active master (am I getting all the basics in there?) should be prototyped with TLA+ so we have some idea it could even work correctly as proposed. It would also be a worthwhile exercise to model the current AssignmentManager and related protocols with TLA+ but I'll understand if nobody volunteers. > Scaling so cluster can host 1M regions and beyond (50M regions?) > > > Key: HBASE-11165 > URL: https://issues.apache.org/jira/browse/HBASE-11165 > Project: HBase > Issue Type: Brainstorming >Reporter: stack > Attachments: HBASE-11165.zip, Region Scalability test.pdf, > zk_less_assignment_comparison_2.pdf > > > This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" > and comments on the doc posted there. > A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M > regions maybe even 50M later. This issue is about discussing how we will do > that (or if not 50M on a cluster, how otherwise we can attain same end). > More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123674#comment-14123674 ] Andrew Purtell commented on HBASE-11165: bq. It seems like everyone is rushing to following the example of HDFS to federation (that's what split meta, split masters gets us). I'm concerned about this notion of a split meta colocated with a split master function also. Split meta on its own is one thing. Also splitting the master as described smacks of the dynamic subtree partitioning of Ceph's MDS, which has never been stable at scale as far as I know, and there's no stability in sight there either. > Scaling so cluster can host 1M regions and beyond (50M regions?) > > > Key: HBASE-11165 > URL: https://issues.apache.org/jira/browse/HBASE-11165 > Project: HBase > Issue Type: Brainstorming >Reporter: stack > Attachments: HBASE-11165.zip, Region Scalability test.pdf, > zk_less_assignment_comparison_2.pdf > > > This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" > and comments on the doc posted there. > A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M > regions maybe even 50M later. This issue is about discussing how we will do > that (or if not 50M on a cluster, how otherwise we can attain same end). > More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123670#comment-14123670 ] stack commented on HBASE-11165: --- [~apurtell] Yes. > Scaling so cluster can host 1M regions and beyond (50M regions?) > > > Key: HBASE-11165 > URL: https://issues.apache.org/jira/browse/HBASE-11165 > Project: HBase > Issue Type: Brainstorming >Reporter: stack > Attachments: HBASE-11165.zip, Region Scalability test.pdf, > zk_less_assignment_comparison_2.pdf > > > This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" > and comments on the doc posted there. > A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M > regions maybe even 50M later. This issue is about discussing how we will do > that (or if not 50M on a cluster, how otherwise we can attain same end). > More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123667#comment-14123667 ] Andrew Purtell commented on HBASE-11165: There's also the issue that Francis and crew want to run a version of HBase in production today. I don't see the multi-master alternative as viable before 2.0, or perhaps 1.1 I suppose, but not 1.0.x. Split meta is conceivably something that could be backported like ZK-less assignment* was, although no doubt risky and would need to be carefully done, hidden behind a default-off toggle. > Scaling so cluster can host 1M regions and beyond (50M regions?) > > > Key: HBASE-11165 > URL: https://issues.apache.org/jira/browse/HBASE-11165 > Project: HBase > Issue Type: Brainstorming >Reporter: stack > Attachments: HBASE-11165.zip, Region Scalability test.pdf, > zk_less_assignment_comparison_2.pdf > > > This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" > and comments on the doc posted there. > A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M > regions maybe even 50M later. This issue is about discussing how we will do > that (or if not 50M on a cluster, how otherwise we can attain same end). > More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122383#comment-14122383 ] Andrey Stepachev commented on HBASE-11165: -- Thanks Stack, that works. -- Andrey. > Scaling so cluster can host 1M regions and beyond (50M regions?) > > > Key: HBASE-11165 > URL: https://issues.apache.org/jira/browse/HBASE-11165 > Project: HBase > Issue Type: Brainstorming >Reporter: stack > Attachments: HBASE-11165.zip, Region Scalability test.pdf, > zk_less_assignment_comparison_2.pdf > > > This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" > and comments on the doc posted there. > A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M > regions maybe even 50M later. This issue is about discussing how we will do > that (or if not 50M on a cluster, how otherwise we can attain same end). > More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122382#comment-14122382 ] stack commented on HBASE-11165: --- Try this link instead https://docs.google.com/document/d/1eCuqf7i2dkWHL0PxcE1HE1nLRQ_tCyXI4JsOB6TAk60/edit?usp=sharing > Scaling so cluster can host 1M regions and beyond (50M regions?) > > > Key: HBASE-11165 > URL: https://issues.apache.org/jira/browse/HBASE-11165 > Project: HBase > Issue Type: Brainstorming >Reporter: stack > Attachments: HBASE-11165.zip, Region Scalability test.pdf, > zk_less_assignment_comparison_2.pdf > > > This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" > and comments on the doc posted there. > A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M > regions maybe even 50M later. This issue is about discussing how we will do > that (or if not 50M on a cluster, how otherwise we can attain same end). > More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122381#comment-14122381 ] stack commented on HBASE-11165: --- Here are some notes summarizing the back and forth so far. It's public, so lacerate to your heart's content: https://docs.google.com/document/d/1eCuqf7i2dkWHL0PxcE1HE1nLRQ_tCyXI4JsOB6TAk60/edit#heading=h.i9al4pf1bajh
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122350#comment-14122350 ] stack commented on HBASE-11165: ---
bq. ...In my mind it's a failure.
OK. Let's avoid going that route.
bq. ...99% of users don't need them...
True. We could just hang out in the realm of the 99% in an orbit just above "webscale". Meantime the excluded 1% are our brothers and sisters who need to go bigger. Let's club together and try and figure a way where we can go 'big' w/o going complicated.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121545#comment-14121545 ] Elliott Clark commented on HBASE-11165: ---
bq. Is the failure you refer to our having a root and not making use of it?
No. I was referring to HDFS Federation as a failure. In my mind it's a failure. 99% of HDFS users don't need or want it, yet federation slows down all feature development and increases complexity on just about every operation. In my mind the solutions being discussed here seem to run parallel. 99% of users don't need them, but everyone's going to deal with extra lookups on region location misses and extra complexity while running the system.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121455#comment-14121455 ] stack commented on HBASE-11165: --- [~mantonov] bq. ...that's kind of hard to compare the relative complexity without proposed detailed designs for both Hopefully we don't need detailed design. I think a sketch will be sufficient. TODO. [~eclark] bq. Then we should be focusing there on places where we have slowness. 'slowness' is but one of the dimensions that needs addressing. There is also 'size' -- size in HDFS, size of cache -- as well as availability No to federation. I don't think we need to split master. Is the failure you refer to our having a root and not making use of it? Let me post something for folks to skewer (listing tangible benefit).. Hopefully tonight (out for the day). Good stuff > Scaling so cluster can host 1M regions and beyond (50M regions?) > > > Key: HBASE-11165 > URL: https://issues.apache.org/jira/browse/HBASE-11165 > Project: HBase > Issue Type: Brainstorming >Reporter: stack > Attachments: HBASE-11165.zip, Region Scalability test.pdf, > zk_less_assignment_comparison_2.pdf > > > This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" > and comments on the doc posted there. > A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M > regions maybe even 50M later. This issue is about discussing how we will do > that (or if not 50M on a cluster, how otherwise we can attain same end). > More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121051#comment-14121051 ] Elliott Clark commented on HBASE-11165: ---
bq. There is a largish gap at the moment.
Then we should be focusing there on places where we have slowness. Not on adding more complexity. It seems like everyone is rushing to follow HDFS's example toward federation (that's what split meta, split masters gets us). I for one am terrified of going that way. From where I sit that was just a failure, and following it isn't something I'm ready to do without tangible benefits.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121038#comment-14121038 ] Mikhail Antonov commented on HBASE-11165: - [~stack] it's kind of hard to compare the relative complexity without proposed detailed designs for both. I believe both options are worth research and comparison (also we should be able to combine them). W.r.t. multiple masters, we do try to streamline the design (with that incremental approach for the "cold->warm->hot->active-active" transition) to make it a less intrusive change and to provide positive side effects like simplifying current workflow management around region splits, log splitting etc.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120984#comment-14120984 ] stack commented on HBASE-11165: --- [~mantonov] Wouldn't splitting meta be a simpler and less risky route to more read and write iops, more cache, and smaller meta regions than dev'ing multiple masters with colocated replicas?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120976#comment-14120976 ] Mikhail Antonov commented on HBASE-11165: - bq. Getting more r/w iops, cache, and putting off mad compactions on a single big, critical meta region are the more immediate priorities. Hm, sounds like that's a fit for multiple masters then? If each one of several active masters has its own co-located consistent replica of meta, then: - we can do rolling major compactions - read IO (which I think is prevailing) is improved, and meta is cached on several machines?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120970#comment-14120970 ] stack commented on HBASE-11165: --- [~eclark] bq. ...If we can get into the same scaling range as HDFS's namenode then I don't see the urgency to split meta. There is a largish gap at the moment. In-memory representation of cluster is not the immediate barrier to scaling. Getting more r/w iops, cache, and putting off mad compactions on a single big, critical meta region are the more immediate priorities. Row caching and read replicas are just as crazy pants as splitting meta but in the end you are only 'solving' the read i/o issue and leaving aside how we'd deal w/ fat meta, write iops, and cache -- never mind that we have all eggs in one basket. bq. ...doing more active writes then meta on a stable cluster Stable cluster is uninteresting. It's the stop cluster, install software, restart cluster in minimal time case where all the fun is. bq. ...changing the deployment or complexity of normal users. Yeah, this is an issue. Most people will not want/need split meta. A 'go big' button would be messy... [~mantonov] Let me put up a doc for all to throw darts at. Has a few stats in it.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120953#comment-14120953 ] Mikhail Antonov commented on HBASE-11165: - [~toffer] thinking about that more.. meta table size isn't what limits you now, is it? 1M regions is estimated to take up about 3-4 GB of space (7 GB is mentioned in the last doc, with the last 10 versions kept for meta), which should comfortably fit in the memory of any single machine. So the bottleneck isn't memory but CPU, due to (possibly) overly coarse-grained locking. Is that what it looks like?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120938#comment-14120938 ] Elliott Clark commented on HBASE-11165: --- Yeah it's not a direct comparison, but I still think that making meta and meta lookups faster will be more beneficial than adding extra complexity. Adding things like row caching and read-replica-aware meta lookups will get us a lot before drastically changing the deployment or complexity for normal users. Until we've tuned everything we can, I don't feel that it's an overall benefit to add complexity to an already very complex system.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120912#comment-14120912 ] Francis Liu commented on HBASE-11165: - {quote} Yeah I know. What I'm saying is that we should work on getting there before working on the more complex split meta and split master. I would argue that we can get on par (or better) than the NN since it's doing more active writes then meta on a stable cluster. Then when that happens the NN will be the bottleneck and there will be no need for split meta. {quote} It's hard to make a comparison IMHO. NN uses a filer for WAL (at least for us). It's not an LSM so it doesn't suffer from write amplification. With meta, a major compaction could just creep up and you could get hosed till it's done. Having higher write throughput would definitely be a good thing but IMHO the clear way to scale is to split meta, as it addresses a bunch of issues and enables horizontal scalability for regions. Bottom line for us is we need to scale to 1M regions (soon) and beyond. The guys here will help us with any hdfs related blockers.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120561#comment-14120561 ] Elliott Clark commented on HBASE-11165: --- bq. We already can't scale the number of regions close to the number hdfs can't handle. Yeah I know. What I'm saying is that we should work on getting there before working on the more complex split meta and split master. I would argue that we can get on par with (or better than) the NN since it's doing more active writes than meta on a stable cluster. Then when that happens the NN will be the bottleneck and there will be no need for split meta.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120535#comment-14120535 ] Francis Liu commented on HBASE-11165: - {quote} If we can get into the same scaling range as HDFS's namenode then I don't see the urgency to split meta. {quote} [~eclark] We already can't scale the number of regions close to the number hdfs can handle. See attached doc. Also the hdfs guys internally will be working to support hbase scaling requirements. {quote} Is the NN you mentioned with 250M files is solely dedicated to HBase installation? {quote} The NN I mentioned belongs to a different cluster. The files/per region {quote} I mean, could the assumption be made that the HBase cluster with 1M or large regions consumes about 250M of files in HDFS, so roughly 250 files / per region, or would it be too bold assumption? {quote} Yeah that wouldn't be accurate. It's hard to come up with a good estimate because it's use case dependent. Tho even for our use case I'm making estimates as things are still in flux. I'll share something once we have more data.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120516#comment-14120516 ] Mikhail Antonov commented on HBASE-11165: - [~toffer] thanks! I'd be really curious to look at those numbers. Is the NN you mentioned with 250M files solely dedicated to an HBase installation? I mean, could the assumption be made that the HBase cluster with 1M or large regions consumes about 250M files in HDFS, so roughly 250 files per region, or would that be too bold an assumption? [~eclark] so if we take as a baseline that (num of files) >> (num regions), I wonder how close to NN limits we are? I mean, if we're talking about a case with 10M regions (or even 50M), with the same ratio of files to regions, 10M regions would give us 2.5B files in HDFS? How close is that to HDFS limits?
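The arithmetic behind the question above can be sketched as follows. This is only an illustration: the 250 files-per-region ratio is the hypothetical one from the comment (250M files on one NN divided by 1M regions), which the reply in this thread says would not be accurate, and the function name `projected_hdfs_files` is made up for the example.

```python
# Rough projection of HDFS file counts from a region count.
# ASSUMPTION: a fixed files-per-region ratio, taken from the comment's
# hypothetical (250M files / 1M regions). Real ratios are use case dependent.

def projected_hdfs_files(num_regions: int, files_per_region: int) -> int:
    """Return the number of HDFS files implied by a fixed per-region ratio."""
    return num_regions * files_per_region

# Hypothetical ratio from the comment: 250M files over 1M regions.
FILES_PER_REGION = 250_000_000 // 1_000_000

if __name__ == "__main__":
    for regions in (1_000_000, 10_000_000, 50_000_000):
        files = projected_hdfs_files(regions, FILES_PER_REGION)
        print(f"{regions:>11,} regions -> {files:>14,} files")
```

Under that assumed ratio, 10M regions works out to the 2.5B files mentioned in the comment, and 50M regions to five times that.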
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120511#comment-14120511 ] Francis Liu commented on HBASE-11165: - {quote} Francis Liu, Virag Kothari - do you guys have by chance some recent numbers (or maybe estimate) on how long does full master failover take on the cluster with 300k or 3M regions? I didn't find those in the recent doc, eager to see that. {quote} [~mantonov] We don't have the numbers; we'll get them next time. Tho failover recovery is essentially bounded by scanning meta and recovering dead servers. So without dead servers it would just be a fraction of the startup time.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120485#comment-14120485 ] Elliott Clark commented on HBASE-11165: --- If we can get into the same scaling range as HDFS's namenode then I don't see the urgency to split meta. Num Files >> Num Regions. So it would seem that addressing the in-memory representation of meta would mean that the scaling bottleneck would be back at the NN. At some point there will be limits there, but that seems fine as long as they are the same limits as those of our underlying foundation (hdfs).
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120408#comment-14120408 ] Francis Liu commented on HBASE-11165: - We're currently using huge NNs. We haven't looked into the number of inodes as that didn't seem to be an issue for the 1M case (We have a single NN running ~250M files). But we'll be watching it for the post 1M benchmarks. Will post results here.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120360#comment-14120360 ] Andrey Stepachev commented on HBASE-11165: -- Also, it would be very interesting to know: do big users of HBase with so many regions use NameNode federation, or do they use an enormous machine for a NameNode that handles so many regions?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120349#comment-14120349 ] Mikhail Antonov commented on HBASE-11165: - [~stack], [~octo47] - on this compaction topic, I also mentioned early on in the thread: bq. I wonder if it makes sense to have google doc linked to this jira to save various proposals, findings and estimates? Like that summarizes current usage to be conservatively 3.5Gb in meta / 1M regions. So it seems like we're using 3-3.5 KB per region row? That should be compressible, looking at the data in meta rows. Also I think it would help if we can post some numbers here and capture them in the documents, so we have a baseline for our work. For example: - how many KB in memory per region in meta - how many hdfs inodes per region (depends on the number of store files, but some estimate?) To estimate, how big would a deployment be where meta doesn't fit in memory? How many RSs, how many petabytes of data?
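The per-region estimate in the comment above can be reproduced with a few lines. This is a back-of-the-envelope sketch only: it assumes decimal gigabytes, uses the conservative 3.5 GB per 1M regions figure quoted in the comment, and assumes meta grows linearly with region count; the helper names are invented for the example.

```python
# Back-of-the-envelope meta sizing from the figures quoted in the thread.
# ASSUMPTIONS: decimal units (1 GB = 10**9 bytes) and linear growth of meta
# with the number of regions; neither is confirmed by the thread.

GB = 10**9

def bytes_per_region(meta_bytes: float, num_regions: int) -> float:
    """Average meta bytes consumed per region row."""
    return meta_bytes / num_regions

def projected_meta_bytes(per_region: float, num_regions: int) -> float:
    """Meta size for a region count, assuming linear scaling."""
    return per_region * num_regions

# 3.5 GB of meta for 1M regions, as quoted in the comment.
PER_REGION = bytes_per_region(3.5 * GB, 1_000_000)

if __name__ == "__main__":
    print(f"per region: {PER_REGION / 1000:.1f} KB")
    for regions in (1_000_000, 10_000_000, 50_000_000):
        size_gb = projected_meta_bytes(PER_REGION, regions) / GB
        print(f"{regions:>11,} regions -> {size_gb:,.1f} GB of meta")
```

With those inputs the per-row figure comes out at about 3.5 KB, matching the comment, and a linear projection gives a starting point for the "when does meta stop fitting in memory" question.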
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120144#comment-14120144 ] Andrey Stepachev commented on HBASE-11165: -- bq. But how HDFS would handle that, as Mikhail Antonov mentioned above? That should have been a question.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120142#comment-14120142 ] Andrey Stepachev commented on HBASE-11165: -- bq. Yeah, we'll have to go this route if we are trying to keep state of a big cluster in heap. Could work on making the representation more compact. You arguing for single meta region Andrey Stepachev then? There is also the on-hdfs size to consider (write-amplification) and the r/w i/os. For sure, a compact representation doesn't imply a single meta. A compact meta means we only need to bother with split meta for really big installations. But how would HDFS handle that, as [~mantonov] mentioned above? As for compact META representations, we can use other techniques to reduce the HDFS impact of a big meta.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120086#comment-14120086 ] stack commented on HBASE-11165: --- bq. I mean, seems tricky to make it be the same code path? Yeah. One of the code paths will 'suffer' neglect. [~mantonov] I like your "cold backup -> warm -> hot -> active-active". Let me try and put together a bit of a summary so far on findings and arguments. bq. ...and reduce usage of memory Yeah, we'll have to go this route if we are trying to keep state of a big cluster in heap. Could work on making the representation more compact. You arguing for single meta region [~octo47] then? There is also the on-hdfs size to consider (write-amplification) and the r/w i/os.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119832#comment-14119832 ] Andrey Stepachev commented on HBASE-11165: -- Just thinking, did anyone try to measure how meta uses memory and to reduce that memory usage? It is interesting why the NN is able to handle much more data in memory, while HMaster can't.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119513#comment-14119513 ] Mikhail Antonov commented on HBASE-11165: - A side question to folks who recently benchmarked it on big clusters.. what's the average ratio of hdfs inodes per region you observed? Trying to estimate the load the proposed 1M or 50M regions setup puts on the NN.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115646#comment-14115646 ] Mikhail Antonov commented on HBASE-11165: - [~stack] regarding multi-master..yeah, I'd make sure that it doesn't complicate the deployment (and to reduce scope of changes, that would be gradually in the direction of cold backup -> warm -> hot -> active-active).
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115582#comment-14115582 ] Mikhail Antonov commented on HBASE-11165: - Totally agree that for most clusters, 1 meta region should suffice, and that if there's one region, it's easier to bind it to master (no rpcs, less complicated coordination and failure scenarios). Though, for clusters where more than one meta region is required, the extra regions would need to be served on other machines using RPC, right? I mean, seems tricky to make it be the same code path?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115562#comment-14115562 ]
Jimmy Xiang commented on HBASE-11165:
One meta region should be good enough for most clusters. If there is just one meta region, it's better to put it together with the master to avoid possible complications. Additional meta regions should be optional. It should be simple to add new meta regions. Ideally, the same code path should be used no matter how many meta regions there are.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115558#comment-14115558 ]
Mikhail Antonov commented on HBASE-11165:
Oops, sorry, instead of "For multi-masters (replicated master) the objection I believe is", I meant "..objective".
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115524#comment-14115524 ]
stack commented on HBASE-11165:
bq. ..by the way, could somebody estimate the distribution of reads vs writes to the meta table, in terms of requests per second/networking traffic/disk access?
[~toffer] or [~virag] Do you have any numbers from your experiments?
bq. I believe there are 2 dimensions here, right?
There is a more immediate 3rd dimension/option, and that is what we have now: a single master and, if it fails, a backup assumes its role. For your dimension 1 (also known as "HBASE-10296 Replace ZK with a consensus lib(paxos,zab or raft) running within master processes to provide better master failover performance and state consistency"), it would be a nice-to-have, but if we can, let's try to avoid having to have an HA master. As long as the master comes back inside some reasonable window and the cluster can keep on chugging while it's figuring out its recovery, this would be a simpler deploy than one that needs an HA master mini-cluster. Multi-master would help scale reads but, as you note, doesn't help if there is one massive meta region and we want to cache it. We'd go to partitioned masters, as I see it, if we can't make one master do.
bq. The viable combined solution ...
I suggest we look at single master with split meta served by regionservers first?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114920#comment-14114920 ]
Mikhail Antonov commented on HBASE-11165:
bq. That is the issue where client takes a quorum of masters instead of a quorum of zks, right? I was extrapolating that the endpoint of this issue is a quorum of masters where we could read from any (or at least some reads could be stale...) as another means of scaling out the read i/o when master and meta colocated.
Yes, that's this one. Yeah, it should help with scaling read I/O (which, I believe, is the vast majority of meta access calls? By the way, could somebody estimate the distribution of reads vs writes to the meta table, in terms of requests per second/networking traffic/disk access? Are there metrics for that?)
bq. This issue seems to be arriving at single master to serve millions of regions
I believe there are 2 dimensions here, right?
- For multi-masters (replicated master) the objection I believe is to 1) maintain up-to-date in-memory state on >1 master, thus avoiding startup cost for the second master (that's actually why I solicited the estimates of how long a full restart takes now on that big cluster) and 2) scale out reads by serving them out of copies located on different machines. But multi-masters does not solve the problem when meta doesn't fit in memory, or when writes need to scale better.
- Partitioned masters, in turn, address this second aspect: fitting meta in memory and scaling writes.
Does that look like an accurate summary to you? The viable combined solution might be a multi-master setup where masters serve split meta and are grouped by meta region replicas (like, HM1 and HM2 serve replicas of metaRegion1, metaRegion2 and metaRegion3, and HM3 and HM4 serve replicas of metaRegion4 and metaRegion5? With masters being region servers now, having really many masters in the cluster might be just the right direction to go?)
bq. In fact I don't even think a client can connect to the cluster currently if master is down which makes a bit of a farce of the above notion and needs fixing.
That probably is an argument for multi-master layout too :)
bq. Let me look at HBASE-7767
That would be great. I also did a first pass reviewing the patch on the review board and am planning to get back to it for a closer look.
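To make the grouping proposed in the comment above concrete, here is a minimal sketch of the HM1..HM4 / metaRegion1..5 layout as a plain lookup table. Only the master and region names come from the comment; the function names, the `down` parameter and the "first live replica" rule are illustrative assumptions, not anything HBase implements.

```python
# Hypothetical layout taken from the comment: masters are grouped by the
# meta region replicas they serve. Nothing here is an HBase API.
META_REPLICA_GROUPS = {
    ("HM1", "HM2"): ["metaRegion1", "metaRegion2", "metaRegion3"],
    ("HM3", "HM4"): ["metaRegion4", "metaRegion5"],
}


def masters_for(meta_region):
    """Return every master that holds a replica of the given meta region."""
    for masters, regions in META_REPLICA_GROUPS.items():
        if meta_region in regions:
            return list(masters)
    raise KeyError(meta_region)


def pick_master(meta_region, down=frozenset()):
    """Return the first live master holding a replica, or None if all are down.

    The selection rule is an assumption made for illustration only.
    """
    for master in masters_for(meta_region):
        if master not in down:
            return master
    return None


if __name__ == "__main__":
    print(masters_for("metaRegion4"))
    print(pick_master("metaRegion1", down={"HM1"}))
```

With this shape, losing one master in a group leaves the group's meta regions readable from the other replica, while each group only has to hold its own slice of meta in memory.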
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114894#comment-14114894 ]
stack commented on HBASE-11165:
bq. I thought that the primary requirement for splittable meta is not really read-write throughput, but rather the thinking that on large enough cluster with 1M+ regions, meta may not simply fit in master memory (or JVM would have hard time keeping process consuming that much memory up)?
It'd be both, [~mantonov]. Scale reads because pieces of meta are served by many servers, and the master can write in parallel, so scale writes too (though not all ops will be parallelizable). Yeah, adding features to meta comes up all the time (today it was adding region flush/compaction history; previously it was keeping hfiles per cf in meta, region replica info, and so on).
bq. I would think that if we want to keep regions as a unit of assignment/recovery, splittable meta is a must (so far didn't see an approach describing how to avoid it)
bq. HBASE-10070 is about region replicas
I was thinking that we could have meta region replicas to scale out the read i/o using HBASE-10070 (solving one of the objections to splitting meta arguments).
bq. HBASE-11467 is about ZK-less client...
That is the issue where client takes a quorum of masters instead of a quorum of zks, right? I was extrapolating that the endpoint of this issue is a quorum of masters where we could read from any (or at least some reads could be stale...) as another means of scaling out the read i/o when master and meta colocated.
bq. ...but it would work fine with current single active-many backup-masters schema as well.
Good to know [~mantonov]
bq. I'm thinking that multi-masters and partitioned-masters (if we go in these approaches) need to be discussed closely together and considering each other, otherwise it'd be really hard to merge them together later on.
Agree. This issue seems to be arriving at single master to serve millions of regions. A quorum of masters or partitioning master responsibilities for sure should be discussed together, but I don't think they are the solution to this issue's problem (maybe partitioned master, but a single-server solution seems simpler?)
bq. I'd be curious to hear more opinions/assessments on how bad is that when master is down, and what timeframe various people would consider as "generally ok", "kind of long, really want it to be faster" and "unacceptably long"?
Yes. Will write something up. In fact I don't even think a client can connect to the cluster currently if master is down which makes a bit of a farce of the above notion and needs fixing.
Let me look at HBASE-7767. I got burned by it today. It's annoying to say the least. Yeah, that seems to be the conclusion that is beginning to prevail here.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114633#comment-14114633 ]
Mikhail Antonov commented on HBASE-11165:
bq. Please pile on all with thoughts. We need to put stake in grounds soon for hbase 2.0 cluster topology.
2 humble cents from my side:
- I thought that the primary requirement for splittable meta is not really read-write throughput, but rather the thinking that on large enough cluster with 1M+ regions, meta may not simply fit in master memory (or JVM would have hard time keeping process consuming that much memory up)? That might be worsened if we take any actions towards keeping more metadata in the meta table. [~enis] I believe you brought up before other things we may want to add to the meta table, which would inflate its size? Couldn't find that jira/thread. I would think that if we want to keep regions as a unit of assignment/recovery, splittable meta is a must (so far didn't see an approach describing how to avoid it).
bq. ..and until we have HBASE-10295 "Refactor the replication implementation to eliminate permanent zk node" and/or HBASE-11467 "New impl of Registry interface not using ZK + new RPCs on master protocol" (Maybe a later phase of HBASE-10070 when followers can run closer in to the leader state would work here) or a new master layout where we partition meta across multiple master server.
Unless I've missed some recent developments, HBASE-10070 is about region replicas, while HBASE-11467 is about ZK-less client (the patch there is about to grow big enough to provide for zk-less client, it's absorbing other subtasks :) ). May be worth reiterating that the zk-less client is a sort of prerequisite or component of the multi-master approach we're working on now, but it would work fine with current single active-many backup-masters schema as well. I'm thinking that multi-masters and partitioned-masters (if we go in these approaches) need to be discussed closely together and considering each other, otherwise it'd be really hard to merge them together later on.
Also on this:
bq. A plus split meta has over colocated master and meta is that master currently can be down for some period of time and the cluster keeps working; no splits and no merges and if a machine crashes while master is down, data is offline till master comes back (needs more exercise). This is less the case when colocated master and meta.
I'd be curious to hear more opinions/assessments on how bad is that when master is down, and what timeframe various people would consider as "generally ok", "kind of long, really want it to be faster" and "unacceptably long"?
{quote}So far it seems to me the driving requirements are: + scale + high availability + stop using zookeeper completely/for persistence {quote}
Yeah, I think those are exactly the points, and they could be discussed together. Besides scale, HA here probably consists of 2 parts: HA for region replicas (read- and rw-), and improved HA for master. Improved master HA (multi-master) is being researched/worked on now. On "stop using ZK completely" there are general changes coming along (see HBASE-7767, on stopping using ZK for keeping table state; a patch from [~octo47] is there, ready for review), and proposed changes on the client side to make the hbase client independent of ZK (that's HBASE-11467, which [~stack] mentioned above, and that's what would be complementary to the multi-master work).
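The "meta may not fit in master memory" concern above is essentially arithmetic, so a back-of-the-envelope sketch may help frame it. It assumes one meta row per region and a fixed row size; the 1024-byte row size below is a placeholder chosen for illustration, not a number measured or reported anywhere in this thread, and the helper names are hypothetical.

```python
def estimate_meta_bytes(num_regions, bytes_per_row):
    """Approximate meta size as one row per region times an assumed row size."""
    return num_regions * bytes_per_row


def fits_in_heap(num_regions, bytes_per_row, heap_bytes):
    """True if the estimated meta size does not exceed the given heap budget."""
    return estimate_meta_bytes(num_regions, bytes_per_row) <= heap_bytes


ASSUMED_BYTES_PER_ROW = 1024  # hypothetical placeholder, not a measurement
GIB = 1024 ** 3

if __name__ == "__main__":
    for regions in (1_000_000, 50_000_000):
        size = estimate_meta_bytes(regions, ASSUMED_BYTES_PER_ROW)
        print(f"{regions:>11,} regions -> {size / GIB:.1f} GiB (assumed row size)")
```

Any extra per-region metadata kept in meta (flush/compaction history, hfile lists, replica info) raises `bytes_per_row`, which is why the estimate is sensitive to the feature requests mentioned in the thread.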
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114582#comment-14114582 ]
Mikhail Antonov commented on HBASE-11165:
[~toffer], [~virag] - do you guys have by chance some recent numbers (or maybe an estimate) on how long a full master failover takes on a cluster with 300k or 3M regions? I didn't find those in the recent doc and am eager to see them.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105102#comment-14105102 ]
Mikhail Antonov commented on HBASE-11165:
bq. Do we need to split this conversation into what to do on master and what to do with 0.98?
That makes a lot of sense to me.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104809#comment-14104809 ]
Francis Liu commented on HBASE-11165:
{quote} If the 0.98 is different to what folks want for 2.0, as per Andy lets split this issue. {quote}
We plan to start working on splitting meta the week after next, or maybe even next week. If there's no clear conclusion on the approach we will likely bring back root for 0.98. If there's an agreed-upon solution before then, we'd be happy to work/collaborate to get it done. I'm hoping to have splittable meta stable soon so we can avoid having to do 2 backward-incompatible rollouts. [~apurtell] hope you're okay with this.
{quote} We need to put stake in grounds soon for hbase 2.0 cluster topology. {quote}
So far it seems to me the driving requirements are:
+ scale
+ high availability
+ stop using zookeeper completely/for persistence(?)
There are a lot of unknowns, and specific requirements may change. Let's pile on the ideas and have a roadmap, iteratively experimenting and adding features with clear gains.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102942#comment-14102942 ]
Virag Kothari commented on HBASE-11165:
bq. Is the shrink in time from 12 mins to 5mins on zk-less assign or forceSync=no? (or for both)?
Zk-less. Not yet checked on forceSync=no. Will do the comparison once we have proper fixes for those issues.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102764#comment-14102764 ]
stack commented on HBASE-11165:
bq. If split meta, then 1) Less write amplification (ie no large compactions) ...
Good point, i.e. if we want to move to lots of small regions, it would be odd if there was an "except for meta" clause.
bq. Better W throughput.
If Master is the only writer, we'd need to ensure we are writing in parallel (i.e. Virag's recent patches).
bq. 2) More disks, more R/W throughput.
Yes.
bq. More heap to fit meta...
More heap to cache meta, yes.
bq. ...We need to do experiments for 1 rack and 2 rack failure...
Agreed that in time of catastrophic part-failure, we'd need the better R/W throughput a split meta can give you.
Other pluses are we would treat meta like any other table. Negatives are we need our root back and startup is more complicated (but at least all inside single master in this case).
In https://docs.google.com/document/d/1xC-bCzAAKO59Xo3XN-Cl6p-5CM_4DMoR-WpnkmYZgpw/edit# I (and others) argue for colocated meta and master going forward looking at options. Let me freshen it with arguments made here. Colocating meta and master has nice properties. The in-memory image of the cluster layout -- probably a severe sub-set of what is actually in meta -- would need to fit a single server's RAM in either model. When colocated, operations are faster and less error-prone because less RPC is involved (we'd still be subject to http://writings.quilt.org/2014/05/12/distributed-systems-and-the-end-of-the-api/ if persisting meta in hdfs, as Francis notes above). A single machine hosting a single meta would not be able to service a 50M region startup with hundreds of regionservers as well as a deploy with split meta. It could. It'd just be slower.
Colocated meta and master implies single meta forever and that single meta is served by one server only -- a 50M meta region would be an anomaly in the cluster, being bigger than all the rest -- and until we have HBASE-10295 "Refactor the replication implementation to eliminate permanent zk node" and/or HBASE-11467 "New impl of Registry interface not using ZK + new RPCs on master protocol" (Maybe a later phase of HBASE-10070 when followers can run closer in to the leader state would work here) or a new master layout where we partition meta across multiple master server.
A plus split meta has over colocated master and meta is that master currently can be down for some period of time and the cluster keeps working; no splits and no merges and if a machine crashes while master is down, data is offline till master comes back (needs more exercise). This is less the case when colocated master and meta.
Please pile on all with thoughts. We need to put stake in grounds soon for hbase 2.0 cluster topology. Francis needs something in 0.98 timeframe. If the 0.98 is different to what folks want for 2.0, as per Andy lets split this issue.
Thoughts-for-the-day:
+ HBase is supposed to be able to scale
+ Single meta came about because way back, we were too lazy to fix issues that arose when meta was split (at the time, we didn't need to scale as much).
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102555#comment-14102555 ]
Francis Liu commented on HBASE-11165:
{quote} Can I have some pointers on how to read the above. Zk-less AM is better because you scan a table -- you don't have to ls znodes? What is the 1M znodes vs 1M rows about in above? {quote}
Essentially the APIs are better. E.g. with 1M rows we can iterate over the rows instead of doing an ls and getting back a huge chunk of data. E.g. deleting 1M znodes takes too long; this could be parallelized against an hbase table. For 2.a, the response is below. For 2.b, it's mainly a concern whether we'll hit other ZK issues when having that many child znodes (1M and beyond). The HDFS guys are already looking into scaling the number of child directories for the NN. Will update the doc.
{quote} Francis Liu Is the above the basis for your "...As our experiments shows splitting is a must for scaling."? If split meta, then more read/write throughput? {quote}
If split meta, then:
1) Less write amplification (ie no large compactions), Better W throughput.
2) More disks, more R/W throughput.
3) More heap to fit meta, better R throughput.
{quote} Because the meta table could be served by many machines so field more reads/writes? The reads/writes are needed at starttime or during cluster lifetime in your judgement? Thanks. {quote}
Yep, needed for startup. We need to do experiments for 1 rack and 2 rack failure for the cluster lifetime case. Though large compactions would creep up on you. So splitting would still be motivating for cluster lifetime IMHO.
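The "iterate over rows instead of ls" point above can be illustrated with a small in-memory model. This is only a sketch of the response-size difference; it calls neither ZooKeeper nor HBase, and `ls_children`, `scan_rows` and the batch size are hypothetical names and values chosen for the example.

```python
from itertools import islice


def ls_children(store):
    """Model of a directory-style listing: every child name in one response."""
    return sorted(store)


def scan_rows(store, batch_size):
    """Model of a table scan: rows in sorted order, batch_size per response."""
    rows = iter(sorted(store))
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            return
        yield batch


# 10k synthetic region names stand in for the 1M discussed in the thread.
regions = {f"region-{i:07d}" for i in range(10_000)}

largest_ls_response = len(ls_children(regions))
largest_scan_response = max(len(b) for b in scan_rows(regions, batch_size=500))

if __name__ == "__main__":
    print(largest_ls_response, largest_scan_response)
```

In the model the listing response grows with the number of children, while the scan response stays bounded by the batch size; the batches are also independent units that could be handed to separate workers, which is the parallelization the comment alludes to.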
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101827#comment-14101827 ]
stack commented on HBASE-11165:
On the nice attached doc:
{code}
Thoughts:
Advantage of zk-less assignment
1) Better API's (ls on znode vs scan table)
2) Scalability (If META is split)
   a) Read/Write throughput
   b) Storage (1M child znodes vs 1M rows in META)
{code}
Can I have some pointers on how to read the above. Zk-less AM is better because you scan a table -- you don't have to ls znodes? What is the 1M znodes vs 1M rows about in above?
[~toffer] Is the above the basis for your "...As our experiments shows splitting is a must for scaling."? If split meta, then more read/write throughput? Because the meta table could be served by many machines so field more reads/writes? The reads/writes are needed at starttime or during cluster lifetime in your judgement? Thanks.
[~virag] Is the shrink in time from 12 mins to 5mins on zk-less assign or forceSync=no? (or for both)?
[~enis]
bq. Do we need to bring root back? Can't we simply host all of root in zookeeper? We expect the number of meta regions in the tens / hundreds case right?
Root access would be different to the access of any other table if we went this route. Everyone -- all clients, etc. -- would need to know how to do this new access type. How would you do it in zk anyways? Would it be a single znode with all of root in it? Root would have zk-enforced limits. Might be ok for a small table that changes infrequently. Not sure about a znode with 1k rows in it (history, replicas? etc.). We'd have to test.
[~lhofhansl]
bq. We need a bootstrapping mechanism to find root anyway...
Yeah. There is zk now. Elsewhere, a quorum of masters has been proposed; you'd go to the master to figure where everything is. That'd be a big change. 2.0.x.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101717#comment-14101717 ]
Francis Liu commented on HBASE-11165:
{quote} Do we need to split this conversation into what to do on master and what to do with 0.98? We could for example file two separate subtasks that approach the meta scaling problem in different ways for the respective branches. They are divergent enough so that would be a good idea IMHO. {quote}
It sounds to me like we've at least come to an agreement that splitting META is reasonable. Off-hand, any implementation in trunk should be backportable to 0.98. The parallel discussion was about multi-master and the merits of colocation, which can be a separate issue. Correct me if I'm wrong [~mbertozzi] and [~jxiang].
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101710#comment-14101710 ]

Francis Liu commented on HBASE-11165:
-------------------------------------

{quote}
I agree with Matteo on this. One more benefit to have meta and master together is the meta/master recovery will be much simpler (I mean there won't be scenario like master is recovering, meta regionserver may be down).
{quote}
I seem to have missed this. What is the other benefit?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101707#comment-14101707 ]

Francis Liu commented on HBASE-11165:
-------------------------------------

{quote}
meta is only modified by the master, and the master is mainly doing operations on meta. so if meta is down you have a master up which is not of much use.
{quote}
Hmm, so in this context things are neither really better nor worse with either option?

{quote}
Also since all writes to meta are controlled by the master having meta anywhere else will not improve things since the master is the one processing requests and you also have to go to another server to perform the operation.
{quote}
For writes you already have to go to a bunch of other servers (datanodes) to perform the write operation, and for reads the worst case is a remote DFS read. Also, as we've pointed out in our experiments, that overhead is not much (if any at all) and is overshadowed by the gains you get from horizontal scalability through SoC.

{quote}
if you are talking about the read side, I can understand why you want to split it, but isn't having multiple copies even as cache simpler and you'll get the same results?
{quote}
I'm talking about both reads and writes. Being able to split it means read/write throughput that scales with the number of machines hosting the table. It also means less write amplification (which is already an issue), as well as scaling horizontally to have enough memory to keep meta in the block cache.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101556#comment-14101556 ]

Francis Liu commented on HBASE-11165:
-------------------------------------

{quote}
Came here to say exactly that. Enis beat me to it. Can we please not bring back root? We need a bootstrapping mechanism to find root anyway, and if we can do that we should be able to bootstrap a few dozen meta regions from there. Currently we'd use ZK for that, but we don't have to.
{quote}
[~enis] My main intent is to be able to split meta to avoid having such a large region. The root approach is well understood and AFAIK has no real downsides? Having it in ZK is one way; as long as we can have around a thousand entries in there, that should be fine for our use case. I thought the trend was to move a lot of heavy lifting out of ZK, which sounds reasonable to me. [~lhofhansl] where do you propose we put it?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100953#comment-14100953 ]

Andrew Purtell commented on HBASE-11165:
----------------------------------------

bq. I agree with Matteo on this. One more benefit to have meta and master together is the meta/master recovery will be much simpler

Do we need to split this conversation into what to do on master and what to do with 0.98? We could for example file two separate subtasks that approach the meta scaling problem in different ways for the respective branches. They are divergent enough that this would be a good idea IMHO.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100778#comment-14100778 ]

Jimmy Xiang commented on HBASE-11165:
-------------------------------------

I agree with Matteo on this. One more benefit of having meta and master together is that meta/master recovery will be much simpler (I mean there won't be a scenario where the master is recovering and the meta regionserver may be down).
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099529#comment-14099529 ]

Matteo Bertozzi commented on HBASE-11165:
-----------------------------------------

{quote}
I don't see why collocation is required to have the features you mentioned. IMHO all you need is the mutations/locking/etc to be centrally managed (ie all updates to meta is sent to the master), the meta itself can be anywhere. So I'd argue that central management is needed, collocation is not required. Please correct me if I'm missing something here.
{quote}
meta is only modified by the master, and the master is mainly doing operations on meta. So if meta is down, you have a master up which is not of much use.

Also, since all writes to meta are controlled by the master, having meta anywhere else will not improve things: the master is the one processing requests, and you also have to go to another server to perform the operation.

If you are talking about the read side, I can understand why you want to split it, but isn't having multiple copies, even as a cache, simpler, and you'll get the same results?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099505#comment-14099505 ]

Lars Hofhansl commented on HBASE-11165:
---------------------------------------

bq. Do we need to bring root back? Can't we simply host all of root in zookeeper?

Came here to say exactly that. Enis beat me to it.
Can we please not bring back root? We need a bootstrapping mechanism to find root anyway, and if we can do that we should be able to bootstrap a few dozen meta regions from there. Currently we'd use ZK for that, but we don't have to.
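The bootstrapping idea in this comment (a small, well-known location listing a few dozen meta regions, from which a client finds the meta region covering a given row) can be sketched as a sorted-map lookup. This is a minimal illustration under stated assumptions, not HBase code: the class and method names are hypothetical, keys are plain strings rather than the byte arrays HBase compares, and the in-memory map stands in for whatever store (ZK or otherwise) would actually hold the entries.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of locating a meta region from a small bootstrap list. */
final class MetaBootstrapSketch {
    // Start key of each meta region -> server hosting it (assumed layout).
    private final TreeMap<String, String> serverByStartKey = new TreeMap<>();

    /** Records that the meta region beginning at startKey lives on server. */
    void register(String startKey, String server) {
        serverByStartKey.put(startKey, server);
    }

    /** Returns the server hosting the meta region whose key range covers rowKey. */
    String locate(String rowKey) {
        // The covering region is the one with the greatest start key <= rowKey.
        Map.Entry<String, String> entry = serverByStartKey.floorEntry(rowKey);
        if (entry == null) {
            throw new IllegalStateException("no meta region covers key: " + rowKey);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        MetaBootstrapSketch bootstrap = new MetaBootstrapSketch();
        bootstrap.register("", "rs1.example.com");   // first region starts at the empty key
        bootstrap.register("m", "rs2.example.com");
        System.out.println(bootstrap.locate("orders,row42")); // prints rs2.example.com
    }
}
```

With tens or hundreds of entries, as discussed in this thread, the whole list is small enough for a client to cache, which is why the choice of backing store matters less than having a single well-known place to read it from.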
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099401#comment-14099401 ]

Enis Soztutar commented on HBASE-11165:
---------------------------------------

bq. That's great, as a first step will bring back root.

Do we need to bring root back? Can't we simply host all of root in zookeeper? We expect the number of meta regions to be in the tens / hundreds, right?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099287#comment-14099287 ]

Andrew Purtell commented on HBASE-11165:
----------------------------------------

bq. For trunk it should be fine to have it enabled by default or not even have the switch

In my opinion for trunk it's not necessary to have a switch.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099285#comment-14099285 ]

Francis Liu commented on HBASE-11165:
-------------------------------------

{quote}
Yes, but older clients must be able to run against any given latest 0.98.x server side version, so these changes should hide behind a default-disabled configuration toggle.
{quote}
Yep, will do this for 0.98. For trunk it should be fine to have it enabled by default, or not even have the switch?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099283#comment-14099283 ]

Francis Liu commented on HBASE-11165:
-------------------------------------

{quote}
I haven't read the full discussion, so sorry if I missed this piece. does splitting meta means having multiple master each one handling its own meta and its own set of RS?
{quote}
It does not have to be. So far we can see there are gains to be made in scalability by splitting meta so that it is served by multiple RS. We will do the work to motivate multi-master and justify the complexity if need be.

{quote}
otherwise we go back as before, the idea of having meta colocated with the master was to have operations like assignment, or disable/enable, create/delete interact with "local data" and avoid complexity in handling failure when interacting with other machines.
{quote}
I don't see why collocation is required to have the features you mentioned. IMHO all you need is for the mutations/locking/etc. to be centrally managed (i.e. all updates to meta are sent to the master); the meta itself can be anywhere. So I'd argue that central management is needed, collocation is not required. Please correct me if I'm missing something here.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099168#comment-14099168 ]

Andrew Purtell commented on HBASE-11165:
----------------------------------------

{quote}
bq. As long as lower scale deploys don't run into potential new issues with a split meta in 0.98, we could be good, i.e. disabled by default in 0.98, enabled by default in later versions.

That's great, as a first step will bring back root. And next patch would be to split meta. Does it sound reasonable?
{quote}
Yes, but older clients must be able to run against any given latest 0.98.x server side version, so these changes should hide behind a default-disabled configuration toggle.
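A default-disabled toggle of the kind described in this comment might look like the following. This is only a sketch under assumptions: the property name is invented for illustration and is not a real HBase setting, and `java.util.Properties` stands in for the cluster configuration object so the example stays self-contained.

```java
import java.util.Properties;

/** Hypothetical sketch of a default-disabled feature toggle. */
final class SplitMetaToggleSketch {
    // Invented key name, for illustration only; not an actual HBase property.
    static final String SPLIT_META_ENABLED_KEY = "example.split.meta.enabled";

    /** Returns true only when the operator has explicitly opted in. */
    static boolean isSplitMetaEnabled(Properties conf) {
        // A missing key reads as "false", so untouched deployments keep the old behavior.
        return Boolean.parseBoolean(conf.getProperty(SPLIT_META_ENABLED_KEY, "false"));
    }

    public static void main(String[] args) {
        Properties untouched = new Properties();
        System.out.println(isSplitMetaEnabled(untouched)); // prints false

        Properties optedIn = new Properties();
        optedIn.setProperty(SPLIT_META_ENABLED_KEY, "true");
        System.out.println(isSplitMetaEnabled(optedIn)); // prints true
    }
}
```

The point of defaulting to off is that a server upgraded within 0.98.x behaves exactly as before unless someone sets the key, which is what keeps older clients working against it.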
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099118#comment-14099118 ]

Matteo Bertozzi commented on HBASE-11165:
-----------------------------------------

{quote}
That's great, as a first step will bring back root. And next patch would be to split meta. Does it sound reasonable?
{quote}
I haven't read the full discussion, so sorry if I missed this piece. Does splitting meta mean having multiple masters, each one handling its own meta and its own set of RS?

Otherwise we go back to where we were before: the idea of having meta colocated with the master was to have operations like assignment, disable/enable, and create/delete interact with "local data" and avoid the complexity of handling failures when interacting with other machines.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099107#comment-14099107 ]

Virag Kothari commented on HBASE-11165:
---------------------------------------

[~jxiang] [~mantonov] I had hackily fixed HBASE-11290, HBASE-11758 and HBASE-11759 and saw bulk assignment time drop from 12 mins to ~5 mins. Will be working on a proper fix for HBASE-11290 soon. Your thoughts will be helpful.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099097#comment-14099097 ]

Francis Liu commented on HBASE-11165:
-------------------------------------

[~mantonov] Have yet to read replicated master jira as well.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099096#comment-14099096 ]

Francis Liu commented on HBASE-11165:
-------------------------------------

[~mantonov] I think we can split meta and later on build/extend it to support partitioned master if need be, though as Andy mentioned the master part is probably not for 0.98. We'll be doing more investigation to motivate multi-master.

[~andrew.purt...@gmail.com] That's great, as a first step will bring back root. And next patch would be to split meta. Does it sound reasonable?
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099005#comment-14099005 ]

Mikhail Antonov commented on HBASE-11165:
-----------------------------------------

bq. That kind of change could not be backported

Right - that's why I brought this question up :)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098949#comment-14098949 ]

Andrew Purtell commented on HBASE-11165:
----------------------------------------

bq. do you think split meta implies partitioned masters, each responsible for its piece

That kind of change could not be backported.