[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2016-01-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111626#comment-15111626
 ] 

stack commented on HBASE-11165:
---

[~toffer] Sweet

> Scaling so cluster can host 1M regions and beyond (50M regions?)
> 
>
> Key: HBASE-11165
> URL: https://issues.apache.org/jira/browse/HBASE-11165
> Project: HBase
> Issue Type: Brainstorming
> Reporter: stack
> Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
> ScalableMeta.pdf, zk_less_assignment_comparison_2.pdf
>
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" 
> and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
> regions maybe even 50M later.  This issue is about discussing how we will do 
> that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2016-01-21 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111617#comment-15111617
 ] 

Francis Liu commented on HBASE-11165:
-

Just an update: we've been running a production cluster which has 16 meta 
regions and around 260k regions for the last 2 months or so. We'll start by 
getting the supporting changes in: zkless (if any), HBASE-11290. Then the big 
patch, though I will try to break it up into smaller pieces if possible.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2015-08-25 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711736#comment-14711736
 ] 

Elliott Clark commented on HBASE-11165:
---

Thanks [~stack] some good notes there.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2015-03-02 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344512#comment-14344512
 ] 

Lars Hofhansl commented on HBASE-11165:
---

Thanks. Sorry for being a pain in the a**.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2015-03-02 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344491#comment-14344491
 ] 

stack commented on HBASE-11165:
---

bq. Just saying we should do what solves the problem with least amount of 
effort.

Agree. Let me do a better job writing up what we've all spewed across tens of 
JIRAs and in offline emails and go from there. Thanks [~lhofhansl]



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2015-03-02 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344384#comment-14344384
 ] 

Lars Hofhansl commented on HBASE-11165:
---

Just saying we should do what solves the problem with least amount of effort.
I think Flurry had abnormally small regions (1g), and that's why they have so 
many (with 20g regions they'd have 5.8pb :)) 
If having many regions and fixing all the issues from that is easiest we should 
do that. If the other ways are easier we should do those.
There have been other ideas floating around (have regions share a memstore, 
groups of regions for assignment, etc).
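
As a rough illustration of the region-size arithmetic above, here is a minimal Python sketch. It assumes binary units (GiB, PiB) and takes the 300k-region figure that is reported for Flurry elsewhere in this thread; the function names are hypothetical and nothing here is HBase API.

```python
GIB = 1024 ** 3
PIB = 1024 ** 5


def total_bytes(region_count, region_size_gib):
    """Data volume held by region_count regions of the given size."""
    return region_count * region_size_gib * GIB


def regions_needed(data_bytes, region_size_gib):
    """Regions required to hold data_bytes (ceiling division)."""
    region_bytes = region_size_gib * GIB
    return -(-data_bytes // region_bytes)


def to_pib(data_bytes):
    """Convert a byte count to PiB."""
    return data_bytes / PIB


# Assumption: 300k regions of about 1 GiB each.
data = total_bytes(300_000, 1)

# The same data at larger region sizes needs far fewer regions.
for size_gib in (1, 20, 100):
    print(f"{size_gib:>4} GiB regions -> {regions_needed(data, size_gib)} regions")

# Conversely, keeping 300k regions but growing each to 20 GiB.
print(f"300k x 20 GiB = {to_pib(total_bytes(300_000, 20)):.2f} PiB")
```

The point of the sketch is only that region count and region size trade off linearly for a fixed data volume; it says nothing about which side of the trade-off is cheaper to engineer.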

I'm not against this. Exploring what new problems we're exchanging for the old 
problems:
# splittable META
#  scalable assignment manager
# handle many memstores
# multi-master
# potential NN scaling issues

If those are easier to solve than those in the previous comment, we know what 
we should do :)




[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2015-03-02 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344250#comment-14344250
 ] 

stack commented on HBASE-11165:
---

In your list, IMO, 1. is reason enough to do small regions.

4. Smaller regions mean less of the keyspace is offline when balancing; also 
more even balance is possible when regions are smaller.

bq. Can we only solve these three with many small regions?

If we just did small regions, without introducing anything new (other than 
doing what we already do 'better'/'faster'), we could improve on your list 
without need to add custom compaction policy(-ies) and the recording of 
interstices at 100MB intervals in metadata (which we'd have to teach clients to 
read), etc.

Regarding a problem statement, do you want one on why we should tend down toward 
small rather than continue our current trajectory of larger and larger regions, 
or do you want a problem statement for the subject of this JIRA? Regarding this 
JIRA, we have users who are headed toward 1M now (Flurry reported being at 300k 
and afraid to go up from there, and Francis has 'larger' clusters) so we have to 
deal.  Are you thinking we should explore going up from 10/20G toward 100G or 1TB 
(with stripe compactions++ and means of apportioning out the 1TB region, etc., 
to address the 1-4 list above)?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2015-03-02 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344011#comment-14344011
 ] 

Mikhail Antonov commented on HBASE-11165:
-

I also remember some time ago it was discovered that we're keeping too many 
versions of cells in meta, and the patch making it configurable was committed. 
So I suppose [~toffer] or others had a chance to play with it and see if their 
problem was alleviated and to what extent? Curious to know the results.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2015-03-02 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343882#comment-14343882
 ] 

Lars Hofhansl commented on HBASE-11165:
---

Lemme step back... The fundamental conflict is that for assignment we want to 
assign in large enough chunks (i.e. a region size), but for other parts of the 
system smaller chunks would be better (compactions, input splits for M/R, etc.). 
Right?

The unit of failure is always going to be a region server; whether we have 2000 
1GB regions or 20 100GB regions makes no difference from that angle. Assuming a 
server with 16TB of disk space or so, the granularity of assignment is also not 
at issue as long as we keep in that ballpark (i.e. 1GB - 1TB or so).
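
To make the per-server numbers above concrete, a small Python sketch follows. It assumes binary units and, as a simplification, that the whole 16TB of disk holds region data (ignoring HDFS replication and free-space headroom); the helper names are hypothetical.

```python
TIB_IN_GIB = 1024


def data_on_server_gib(region_count, region_size_gib):
    """Data lost from service when one server hosting these regions fails."""
    return region_count * region_size_gib


def regions_per_server(disk_tib, region_size_gib):
    """How many regions of a given size fit on one server's disk."""
    return (disk_tib * TIB_IN_GIB) // region_size_gib


# Same unit of failure either way: 2000 x 1 GiB and 20 x 100 GiB are both 2000 GiB.
print(data_on_server_gib(2000, 1), data_on_server_gib(20, 100))

# Granularity of assignment across the 1 GiB - 1 TiB range on a 16 TiB server.
for size_gib in (1, 10, 100, 1024):
    count = regions_per_server(16, size_gib)
    print(f"{size_gib:>5} GiB regions -> {count} regions per server")
```

Under these assumptions even 1 TiB regions leave 16 units to spread per server, which is the sense in which assignment granularity is not the limiting factor in that range.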

So what are the exact problems with large regions:
# compactions and write amplification
# input split calculation for M/R
# log replay upon recovery (is that an issue, i.e. is it worse replaying 1 
large log compared to replaying 100 small ones)
# (more?)

Can we *only* solve these three with many small regions? (or do stripe 
compactions, simple width stats for M/R, etc)
I'm trying to get from a statement about an implementation (that might just 
shift complexity from one part of HBase to another) to an exact problem 
statement.




[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2015-03-02 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343705#comment-14343705
 ] 

Andrew Purtell commented on HBASE-11165:


See above on this issue where [~toffer] wanted to split meta. Since his shop is 
running 0.98 I offered to maintain a branch of 0.98 that includes that change 
in ASF Git for their use, figuring that positive results there would inform on 
when/if we should do that on all branches. Not sure what the current status of 
this discussion is. No patch yet, for 0.98 or any other branch.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2015-03-02 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343691#comment-14343691
 ] 

stack commented on HBASE-11165:
---

[~octo47] IMO, yeah. Tried getting consensus on that a while back and seemed to 
have it but didn't nail it down... Will be back.  Doc. above needs work too.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2015-03-02 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342903#comment-14342903
 ] 

Andrey Stepachev commented on HBASE-11165:
--

[~stack] so, are we going with split meta?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2015-03-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342847#comment-14342847
 ] 

stack commented on HBASE-11165:
---

Some notes on having a scalable meta here: 
https://docs.google.com/document/d/1eCuqf7i2dkWHL0PxcE1HE1nLRQ_tCyXI4JsOB6TAk60/edit?usp=sharing
  Let me attach a PDF version.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-26 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149859#comment-14149859
 ] 

Andrew Purtell commented on HBASE-11165:


Might also be useful to track the ratio when bringing a new table online and 
scaling it up from 0 to a few billion rows.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149823#comment-14149823
 ] 

stack commented on HBASE-11165:
---

[~eclark] asked last night if stripe compactions could help w/ the compaction 
problem in meta?

Regarding the 50k reads/1 write, what is the ratio across a restart do you think?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-26 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149778#comment-14149778
 ] 

Elliott Clark commented on HBASE-11165:
---

Had some discussions last night and people were asking about read vs write 
numbers for meta.
Currently, on a stable cluster, meta is getting a read-to-write request ratio 
of 50,000 reads for every one write request. To me that really suggests that 
adding read replicas is the next thing that we should look at before splitting 
meta or master. I think that could get us quite a way.
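
A back-of-the-envelope Python sketch of why a 50,000:1 ratio points toward read replicas. The absolute request rate (100,000 reads/s) is a made-up illustration, not a measurement from this thread, and the sketch assumes reads can be spread evenly across copies of meta while every write still lands on a single primary; the function names are hypothetical.

```python
READS_PER_WRITE = 50_000  # ratio reported above for a stable cluster


def write_rate(read_rate, reads_per_write=READS_PER_WRITE):
    """Writes per second implied by a given read rate and the read:write ratio."""
    return read_rate / reads_per_write


def per_copy_read_rate(read_rate, copies):
    """Reads per second each copy serves if reads are spread evenly."""
    return read_rate / copies


read_rate = 100_000  # hypothetical reads/s against meta

print(f"implied writes/s: {write_rate(read_rate)}")
for copies in (1, 2, 4):
    print(f"{copies} copies -> {per_copy_read_rate(read_rate, copies):.0f} reads/s each")
```

With writes this rare, the write path is unlikely to be the bottleneck, so adding copies attacks almost all of the load. The ratio during a restart, when assignments are being written, may look quite different.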



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-18 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139101#comment-14139101
 ] 

Andrey Stepachev commented on HBASE-11165:
--

[~virag] I agree, 1. is enough for most cases. In HBASE-12016 I made the number 
of versions configurable.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-18 Thread Virag Kothari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139093#comment-14139093
 ] 

Virag Kothari commented on HBASE-11165:
---

bq. 1. Meta should keep 1 version (with atomic row updates we definitely need 
no more than one actual version), even better make that configurable.
bq. 2. Meta HLogs should be archived for debugging

For 1., making it configurable sounds good. I don't feel a strong need for 2., 
and again we would need a separate config for cleanup of archived meta WALs.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-18 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138802#comment-14138802
 ] 

Andrey Stepachev commented on HBASE-11165:
--

[~stack] thanks for comments. 
I added jira for that: https://issues.apache.org/jira/browse/HBASE-12016



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138478#comment-14138478
 ] 

stack commented on HBASE-11165:
---

bq. ...do we have significant objections on that?

Not from me.  Even just keeping three versions would be an improvement.

A compact representation in master will help (Let me update the attached google 
doc and fix my misstatements).



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-17 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137789#comment-14137789
 ] 

Mikhail Antonov commented on HBASE-11165:
-

Thanks for the explanation [~virag]. 

Yeah, as the comment suggests, it seems there was no special reason to have it 
set to 10? Making it configurable seems a trivial change?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-17 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137783#comment-14137783
 ] 

Andrey Stepachev commented on HBASE-11165:
--

Looks like we can do two things to solve that on HBase versions up to 1.0 
(without significant changes):
1. Meta should keep 1 version (with atomic row updates we definitely need no 
more than one actual version), even better make that configurable.
2. Meta HLogs should be archived for debugging

That will reduce scan overhead (far fewer KVs to scan), reduce memory 
footprint, and reduce load times for very big metas.
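
A toy Python model of the effect of the version limit on how many cells meta retains. This is only a sketch: `ToyMetaStore` is a hypothetical in-memory stand-in, not HBase code, and it trims old versions immediately on write, whereas a real store would drop them later (for example at compaction). The value 10 is the setting mentioned elsewhere in this thread; the region and update counts are made up.

```python
from collections import defaultdict, deque


class ToyMetaStore:
    """Keeps at most max_versions values per row, newest last."""

    def __init__(self, max_versions):
        # deque(maxlen=...) silently discards the oldest version on overflow.
        self.rows = defaultdict(lambda: deque(maxlen=max_versions))

    def put(self, row, value):
        self.rows[row].append(value)

    def get(self, row):
        """Latest value for a row that has been written at least once."""
        return self.rows[row][-1]

    def cell_count(self):
        """Total retained cells, i.e. what a full scan would have to walk."""
        return sum(len(versions) for versions in self.rows.values())


def simulate(max_versions, regions=1000, updates=25):
    store = ToyMetaStore(max_versions)
    for r in range(regions):
        for u in range(updates):
            # Each update records a (hypothetical) new hosting server.
            store.put(f"region-{r}", f"server-{u % 5}")
    return store


ten = simulate(max_versions=10)
one = simulate(max_versions=1)
print(ten.cell_count(), one.cell_count())
print(ten.get("region-0") == one.get("region-0"))
```

In this model both stores answer lookups identically, because only the newest version is ever read, while the single-version store holds a tenth of the cells.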

[~virag], [~apurtell], do we have significant objections on that?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-17 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137754#comment-14137754
 ] 

Mikhail Antonov commented on HBASE-11165:
-

bq. Would you mind filing a new issue for that Mikhail Antonov and say more 
about this on that tangent?
Thanks for the interest [~apurtell], I sure will file it, but I'd just like to 
do that in a day or maybe a couple of days, when we have run more experiments and 
also put together actual code to show. Once filed, I'll make sure to link it to 
this jira.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-17 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137715#comment-14137715
 ] 

Andrey Stepachev commented on HBASE-11165:
--

[~virag] thanks. :) didn't see that comment somehow.




[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-17 Thread Virag Kothari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137662#comment-14137662
 ] 

Virag Kothari commented on HBASE-11165:
---

bq. why 10 versions are needed?

Looking at the code for META_TABLEDESC in HTableDescriptor
{code}
// Ten is arbitrary number.  Keep versions to help debugging.
.setMaxVersions(10)
{code}

Maybe we need to revisit this?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-17 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137632#comment-14137632
 ] 

Andrey Stepachev commented on HBASE-11165:
--

[~virag], do you have some thoughts on why 10 versions are needed? Seems that's 
a great overhead (ten times more memory).



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-17 Thread Virag Kothari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137613#comment-14137613
 ] 

Virag Kothari commented on HBASE-11165:
---

bq.  fully compacted meta with single-versioned cells 1 row in meta takes up 
7-10k

The 7GB for 1M is with 10 versions of meta. Meta has 10 versions by default and 
we didn't change that for our experiments. But I see the confusion: the 
attached pdf says 10 versions while the shared google doc says one version. 
Sorry about that.
Also, 7GB was the size of the store file on HDFS. The table was simply created 
using HexStringSplit with nothing extra. Also, this was on 0.98 (I think master 
code adds some stuff about region replicas in meta) with zk-less assignment.
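
A quick back-of-the-envelope check of those figures, as a small Java sketch. 
It assumes "7GB" means 7 * 2^30 bytes, "1M" means 1,000,000 rows, and that 
every row really holds all 10 versions; if fewer versions are present the 
per-version figure would be larger. The class and method names are made up 
for illustration.

```java
/** Back-of-the-envelope arithmetic for the meta size figures (illustrative only). */
public class MetaSizeArithmetic {

    /** Average store-file bytes per meta row. */
    public static long bytesPerRow(long storeFileBytes, long regions) {
        return storeFileBytes / regions;
    }

    /** Average store-file bytes per row per version, assuming all versions are present. */
    public static long bytesPerRowVersion(long storeFileBytes, long regions, int versions) {
        return storeFileBytes / (regions * versions);
    }

    public static void main(String[] args) {
        long sevenGb = 7L << 30;      // assumption: GB taken as 2^30 bytes
        long regions = 1_000_000L;    // assumption: "1M" taken as exactly one million
        System.out.println("bytes per row: " + bytesPerRow(sevenGb, regions));
        System.out.println("bytes per row per version: "
                + bytesPerRowVersion(sevenGb, regions, 10));
    }
}
```

Under these assumptions a row averages roughly 7.5KB on disk and a single 
version roughly 750 bytes, which would be in line with a single-versioned 
Result staying under 1KB.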

bq. I'd appreciate a lot if you could share a sample representative row from 
your meta, so we can see the typical size of elements in it?

On prod, we currently run 0.94. I can check if we can share a sample row from 
there.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137450#comment-14137450
 ] 

Andrew Purtell commented on HBASE-11165:


bq. And the list of Results is not exactly super-compacted structure, in 
initial experiments we were able to compact it further quite a bit.

Would you mind filing a new issue for that [~mantonov] and say more about this 
on that tangent?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-16 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136860#comment-14136860
 ] 

Mikhail Antonov commented on HBASE-11165:
-

[~toffer], [~virag] - guys, getting back for a moment to the size of the meta 
table: there's an interesting thing about it. [~octo47], [~sergey.soldatov] and 
myself are prototyping a more compact and faster in-memory representation of 
meta, and noticed something interesting while inspecting the data structures 
under the microscope (with tools like 
https://code.google.com/p/memory-measurer).

In the discussion above, and in the doc put together by [~stack], it's 
mentioned that in the best case, for a fully compacted meta with 
single-versioned cells, 1 row in meta takes up 7-10KB (please correct me if I'm 
wrong).

If I run a minicluster test, create some simple table with regions and then do 
MTA.fullScan() to get a list of Results, a single Result won't take more than 
1KB, normally less than that. And the list of Results is not exactly a 
super-compact structure; in initial experiments we were able to compact it 
further quite a bit.

So I'm curious how exactly the heap size occupied by meta was calculated: was 
it some sort of direct sizeOf (using Unsafe, instrumentation, etc.), the 
default impl provided by HeapSize for HRegion, the size of HFiles, or something 
else? Also, I'd appreciate it a lot if you could share a sample representative 
row from your meta, so we can see the typical size of elements in it.
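
For comparison, a rough estimate of the serialized size of one 
single-versioned meta row can be sketched as below. It uses the basic KeyValue 
layout (4-byte key length, 4-byte value length, 2-byte row length, 1-byte 
family length, 8-byte timestamp, 1-byte type, plus the row, family, qualifier 
and value bytes) and the four standard info: qualifiers. The row-key length 
(80 bytes), serialized region info length (100 bytes) and server name length 
(22 bytes) are assumptions picked for illustration, and tags, MVCC stamps, 
block encoding and Java object overhead are ignored.

```java
/** Rough serialized-size estimate for one meta row (illustrative, not HBase code). */
public class KeyValueSizeSketch {
    // keyLen(4) + valueLen(4) + rowLen(2) + familyLen(1) + timestamp(8) + type(1)
    public static final int FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1;

    /** Serialized bytes of one cell, given the lengths of its variable parts. */
    public static int cellSize(int rowLen, int familyLen, int qualifierLen, int valueLen) {
        return FIXED_OVERHEAD + rowLen + familyLen + qualifierLen + valueLen;
    }

    /** One version of each of the four info: columns of a meta row. */
    public static int metaRowSize(int rowLen, int regionInfoLen, int serverLen) {
        int family = "info".length();
        return cellSize(rowLen, family, "regioninfo".length(), regionInfoLen)
             + cellSize(rowLen, family, "server".length(), serverLen)
             + cellSize(rowLen, family, "serverstartcode".length(), 8)    // long
             + cellSize(rowLen, family, "seqnumDuringOpen".length(), 8);  // long
    }

    public static void main(String[] args) {
        // Assumed lengths: 80-byte row key, 100-byte region info, 22-byte host:port.
        System.out.println("estimated bytes per meta row: " + metaRowSize(80, 100, 22));
    }
}
```

With these assumed lengths the estimate comes to 601 bytes, well under 1KB. 
The row key is repeated in every cell, so longer split keys would push the 
figure up quickly.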




[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-08 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126581#comment-14126581
 ] 

Konstantin Boudnik commented on HBASE-11165:


I actually was trying to play along, Andrew - I guess my Russian toughness got 
in the way of this comment being accepted as a joke ;)



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-08 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126564#comment-14126564
 ] 

Andrew Purtell commented on HBASE-11165:


I hope the tongue-in-cheek nature of that post would have been evident. But it 
is true that consensus is hard.

The concern I have is that this discussion not downplay issues and solutions 
with scaling that people have today in lieu of grander designs for tomorrow's 
versions. (Both are important IMHO). Looks like that concern is not an issue 
given Alex's comment above. Great.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-08 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126556#comment-14126556
 ] 

Konstantin Boudnik commented on HBASE-11165:


Honestly speaking, the latest fashion (not to say obsession ;) with TLA+ 
doesn't help anything. It isn't a formal proof of the code's correctness, nor 
does it help to reconcile architecture contracts, as Alex has pointed out. So 
while someone can produce a fancy TLA+ diagram, I am not sure how many others 
will even pay attention to it. It's like UML - a good way to share the 
responsibility for a bad decision ;)

We have posted the spec in that HDFS JIRA because it was easier to do that 
than to keep up the endless argument about nothingness. Will the same be 
required here, I wonder?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-08 Thread Alex Newman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126549#comment-14126549
 ] 

Alex Newman commented on HBASE-11165:
-

We also support splitting meta as being important for scalability. In fact
we'd love to help. I think some of the work required for splitting meta
could be aided by some of the work we have done in regards to rdsm. I
assume that there is a significant master change required to make it work.
Perhaps we can join forces in a way that won't get in your way.





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-08 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126542#comment-14126542
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
I'm concerned about this notion of a split meta colocated with a split master 
function also. Split meta on its own is one thing. Also splitting the master as 
described smacks of the dynamic subtree partitioning of Ceph's MDS, which has 
never been stable at scale as far as I know, and there's no stability in sight 
there either.
{quote}
From what I understand, "splitting meta" supersedes "collocating the meta", as 
having both together would be counterintuitive. With splitting meta I think it 
is understood well enough that code plus documentation should be enough?

{quote}
There's also the issue that Francis and crew want to run a version of HBase in 
production today. I don't see the multi-master alternative as viable before 
2.0, or perhaps 1.1 I suppose, but not 1.0.x. Split meta is conceivably 
something that could be backported like ZK-less assignment* was, although no 
doubt risky and would need to be carefully done, hidden behind a default-off 
toggle.
{quote}
For our use case we don't have data yet to motivate sharded master or HA 
master. For now it seems splitting meta will address our short-term scaling use 
case.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-08 Thread Alex Newman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126072#comment-14126072
 ] 

Alex Newman commented on HBASE-11165:
-

We have a ton of people in house who understand TLA+. The problem is TLA+ 
doesn't verify anything about the code (or a prototype). I am glad that Steve 
was happy that we wrote a TLA+ specification, but in my humble opinion, no one 
could give us any feedback on what we proposed. Frankly, from the lack of 
pushback on the TLA+ we wrote up, it is apparent to me that no one understood 
what we wrote. That being said, if TLA+ is what the community needs to feel 
confident about the approach we are taking, we can do it. Just remember that 
TLA+:

- Doesn't verify anything about the code or implementation
- Doesn't understand anything about your architecture
- Won't tell you if your code isn't modeled by TLA+

It only verifies the theoretical behavior of concurrent systems, not the 
systems themselves.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-05 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123679#comment-14123679
 ] 

Andrew Purtell commented on HBASE-11165:


With apologies to [~ste...@apache.org], I'm going to channel my inner Steve 
Loughran and suggest that a split meta + split master + active-active master 
(am I getting all the basics in there?) should be prototyped with TLA+ so we 
have some idea it could even work correctly as proposed. It would also be a 
worthwhile exercise to model the current AssignmentManager and related 
protocols with TLA+ but I'll understand if nobody volunteers. 



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-05 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123674#comment-14123674
 ] 

Andrew Purtell commented on HBASE-11165:


bq. It seems like everyone is rushing to following the example of HDFS to 
federation (that's what split meta, split masters gets us).

I'm concerned about this notion of a split meta colocated with a split master 
function also. Split meta on its own is one thing. Also splitting the master as 
described smacks of the dynamic subtree partitioning of Ceph's MDS, which has 
never been stable at scale as far as I know, and there's no stability in sight 
there either.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123670#comment-14123670
 ] 

stack commented on HBASE-11165:
---

[~apurtell] Yes.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-05 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123667#comment-14123667
 ] 

Andrew Purtell commented on HBASE-11165:


There's also the issue that Francis and crew want to run a version of HBase in 
production today. I don't see the multi-master alternative as viable before 
2.0, or perhaps 1.1 I suppose, but not 1.0.x. Split meta is conceivably 
something that could be backported like ZK-less assignment* was, although no 
doubt risky and would need to be carefully done, hidden behind a default-off 
toggle.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-04 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122383#comment-14122383
 ] 

Andrey Stepachev commented on HBASE-11165:
--

Thanks Stack, that works.






-- 
Andrey.




[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122382#comment-14122382
 ] 

stack commented on HBASE-11165:
---

Try this link instead 
https://docs.google.com/document/d/1eCuqf7i2dkWHL0PxcE1HE1nLRQ_tCyXI4JsOB6TAk60/edit?usp=sharing



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122381#comment-14122381
 ] 

stack commented on HBASE-11165:
---

Here are some notes summarizing the back and forth so far. It's public, so 
lacerate to your heart's content:

https://docs.google.com/document/d/1eCuqf7i2dkWHL0PxcE1HE1nLRQ_tCyXI4JsOB6TAk60/edit#heading=h.i9al4pf1bajh



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122350#comment-14122350
 ] 

stack commented on HBASE-11165:
---

bq. ...In my mind it's a failure.

OK. Let's avoid going that route.

bq. ...99% of users don't need them...

True. We could just hang out in the realm of the 99% in an orbit just above 
"webscale". Meantime the excluded 1% are our brothers and sisters who need to 
go bigger. Let's club together and try to figure out a way where we can go 
'big' w/o going complicated.





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-04 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121545#comment-14121545
 ] 

Elliott Clark commented on HBASE-11165:
---

bq. Is the failure you refer to our having a root and not making use of it?
No. I was referring to HDFS Federation as a failure. In my mind it's a 
failure. 99% of HDFS users don't need or want it, yet federation slows down all 
feature development and increases complexity on just about every operation.

In my mind the solutions being discussed here seem to run parallel. 99% of 
users don't need them, but everyone's going to deal with extra lookups on 
region location misses and extra complexity while running the system.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121455#comment-14121455
 ] 

stack commented on HBASE-11165:
---

[~mantonov]

bq. ...that's kind of hard to compare the relative complexity without proposed 
detailed designs for both

Hopefully we don't need detailed design.  I think a sketch will be sufficient.  
 TODO.

[~eclark]

bq. Then we should be focusing there on places where we have slowness. 

'slowness' is but one of the dimensions that need addressing.  There is also 
'size' -- size in HDFS, size of cache -- as well as availability.

No to federation. I don't think we need to split master.

Is the failure you refer to our having a root and not making use of it?

Let me post something for folks to skewer (listing tangible benefit).. 
Hopefully tonight (out for the day).

Good stuff



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-04 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121051#comment-14121051
 ] 

Elliott Clark commented on HBASE-11165:
---

bq.There is a largish gap at the moment.
Then we should be focusing there on places where we have slowness.  Not on 
adding more complexity.

It seems like everyone is rushing to follow the example of HDFS into 
federation (that's what split meta, split masters gets us). I for one am 
terrified of going that way. From where I sit that was just a failure and 
following it isn't something I'm ready to do without tangible benefits.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121038#comment-14121038
 ] 

Mikhail Antonov commented on HBASE-11165:
-

[~stack] that's kind of hard to compare the relative complexity without 
proposed detailed designs for both, I believe both options are worth research 
and comparison (also we should be able to combine them). W.r.t. multiple 
masters, we do try to streamline the design (with that incremental approach for 
"cold->warm->hot->active-active" transition) to make it a less intrusive change 
and to provide positive side effects like simplifying current workflow 
management around region splits, log splitting etc.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120984#comment-14120984
 ] 

stack commented on HBASE-11165:
---

[~mantonov] Wouldn't splitting meta be a simpler and less risky route to more 
read and write iops, more cache, and smaller meta regions than dev'ing multiple 
masters with colocated replicas?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120976#comment-14120976
 ] 

Mikhail Antonov commented on HBASE-11165:
-

bq.  Getting more r/w iops, cache, and putting off mad compactions on a single 
big, critical meta region are the more immediate priorities. 
Hm, sounds like that's the fit for multiple masters then? If each one of 
several active masters has its own co-located consistent replica of meta, then:

 - we can do rolling major compactions
 - read IO (which I think is prevailing) is improved, + meta is cached on 
several machines?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120970#comment-14120970
 ] 

stack commented on HBASE-11165:
---

[~eclark] 

bq. ...If we can get into the same scaling range as HDFS's namenode then I 
don't see the urgency to split meta.

There is a largish gap at the moment.

In-memory representation of cluster is not the immediate barrier to scaling.  
Getting more r/w iops, cache, and putting off mad compactions on a single big, 
critical meta region are the more immediate priorities. Row caching and read 
replicas are just as crazy pants as splitting meta but in the end you are only 
'solving' the read i/o issue and leaving aside how we'd deal w/ fat meta, write 
iops, and cache -- never mind that we have all eggs in one basket.

bq. ...doing more active writes than meta on a stable cluster

Stable cluster is uninteresting. It's the stop cluster, install software, 
restart cluster in minimal time that is where all the fun is.

bq. ...changing the deployment or complexity of normal users. 

Yeah, this is an issue.  Most people will not want/need split meta.  A 'go big' 
button would be messy...

[~mantonov] Let me put up doc. for all to throw darts at.  Has a few stats in 
it.






[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120953#comment-14120953
 ] 

Mikhail Antonov commented on HBASE-11165:
-

[~toffer] thinking about that more... meta table size isn't what limits you 
now, is it? 1M regions is estimated to take up about 3-4 GB of space (7 GB is 
mentioned in the last doc, with the last 10 versions kept for meta), which 
should comfortably fit in the memory of any single machine. So the bottleneck 
isn't memory but CPU, due to (possibly) overly coarse-grained locking; is that 
what it looks like?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120938#comment-14120938
 ] 

Elliott Clark commented on HBASE-11165:
---

Yeah it's not a direct comparison, but I still think that making meta and meta 
look ups faster will be more beneficial than adding extra complexity.  Adding 
things like row caching and read replica aware meta look ups will get us a lot 
before drastically changing the deployment or complexity of normal users. Until 
we've tuned everything we can I don't feel that it's an overall benefit to add 
complexity to an already very complex system.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120912#comment-14120912
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
Yeah I know. What I'm saying is that we should work on getting there before 
working on the more complex split meta and split master. I would argue that we 
can get on par (or better) than the NN since it's doing more active writes than 
meta on a stable cluster. Then when that happens the NN will be the bottleneck 
and there will be no need for split meta.
{quote}
It's hard to make a comparison IMHO. NN uses a filer for WAL (at least for us). 
It's not an LSM so it doesn't suffer from write amplification. With meta, a 
major compaction could just creep up and you could get hosed till it's done.  
Having higher write throughput would definitely be a good thing but IMHO the 
clear way to scale is to split meta, as it addresses a bunch of issues and 
enables horizontal scalability for regions. Bottom line for us is we need to 
scale to 1M regions (soon) and beyond. The guys here will help us with any hdfs 
related blockers.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120561#comment-14120561
 ] 

Elliott Clark commented on HBASE-11165:
---

bq.We already can't scale the number of regions close to the number hdfs can 
handle.
Yeah I know. What I'm saying is that we should work on getting there before 
working on the more complex split meta and split master. I would argue that we 
can get on par (or better) than the NN since it's doing more active writes than 
meta on a stable cluster.  Then when that happens the NN will be the bottleneck 
and there will be no need for split meta.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120535#comment-14120535
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
If we can get into the same scaling range as HDFS's namenode then I don't see 
the urgency to split meta.
{quote}
[~eclark] We already can't scale the number of regions close to the number hdfs 
can handle. See attached doc. Also the hdfs guys internally will be working 
to support hbase scaling requirements.

{quote}
Is the NN you mentioned with 250M files solely dedicated to the HBase 
installation? 
{quote}
The NN I mentioned belongs to a different cluster. The files/per region

{quote}
I mean, could the assumption be made that the HBase cluster with 1M or more 
regions consumes about 250M files in HDFS, so roughly 250 files per 
region, or would that be too bold an assumption?
{quote}
Yeah that wouldn't be accurate. It's hard to come up with a good estimate 
because it's use case dependent. Tho even for our use case I'm making estimates 
as things are still in flux. I'll share something once we have more data.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120516#comment-14120516
 ] 

Mikhail Antonov commented on HBASE-11165:
-

[~toffer] thanks! I'd be really curious to look at those numbers.

Is the NN you mentioned with 250M files solely dedicated to the HBase 
installation? I mean, could the assumption be made that the HBase cluster with 
1M or more regions consumes about 250M files in HDFS, so roughly 250 files 
per region, or would that be too bold an assumption?

[~eclark] so if we take as a baseline that (num of files) >> (num regions), I 
wonder how close to NN limits we are? I mean, if we're talking about the case 
with 10M regions (or even 50M), with the same ratio of files to regions, 10M 
regions would give us 2.5B files in HDFS? How close is that to HDFS limits?
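
For what it's worth, a rough sketch of that arithmetic. It assumes the 250-files-per-region ratio simply carries over as the region count grows, which is exactly the assumption being questioned here and not a measurement; the function name is made up for illustration.

```python
# Back-of-envelope: HDFS file count implied by a fixed files-per-region ratio.
# ASSUMPTION: the ratio stays constant as the cluster grows (not measured).

FILES_ON_NN = 250_000_000   # ~250M files on the single NN mentioned in this thread
REGIONS_BASELINE = 1_000_000  # the 1M-region case

# 250M files / 1M regions = 250 files per region
files_per_region = FILES_ON_NN // REGIONS_BASELINE


def projected_hdfs_files(num_regions: int, ratio: int) -> int:
    """Number of HDFS files if every region accounts for `ratio` files."""
    return num_regions * ratio


for regions in (1_000_000, 10_000_000, 50_000_000):
    total = projected_hdfs_files(regions, files_per_region)
    print(f"{regions:>11,} regions -> {total:>14,} files")
```

If the real ratio turns out lower (fewer store files per region), the projection scales down linearly, so only `files_per_region` needs changing.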



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120511#comment-14120511
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
Francis Liu, Virag Kothari - do you guys have by chance some recent numbers (or 
maybe estimate) on how long does full master failover take on the cluster with 
300k or 3M regions? I didn't find those in the recent doc, eager to see that.
{quote}
[~mantonov] We don't have the numbers; we'll get them next time. Tho failover 
recovery is essentially bounded by scanning meta and recovering dead servers. 
So without dead servers it would just be a fraction of the startup time.





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120485#comment-14120485
 ] 

Elliott Clark commented on HBASE-11165:
---

If we can get into the same scaling range as HDFS's namenode then I don't see 
the urgency to split meta.

Num Files >> Num Regions

So it would seem that addressing in-memory representation of meta would mean 
that the scaling bottleneck would be back to the NN. At some point there will 
be limits there, but that seems fine as long as there are the same limits to 
our underlying foundation (hdfs).



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120408#comment-14120408
 ] 

Francis Liu commented on HBASE-11165:
-

We're currently using huge NNs. 

We haven't looked into the number of inodes as that didn't seem to be an issue 
for the 1M case (We have a single NN running ~250M files). But we'll be 
watching it for the post 1M  benchmarks. Will post results here.





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120360#comment-14120360
 ] 

Andrey Stepachev commented on HBASE-11165:
--

Also, it would be very interesting to know: do big users of HBase with so many 
regions use NameNode federation, or an enormous machine to run a NameNode 
handling so many regions?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120349#comment-14120349
 ] 

Mikhail Antonov commented on HBASE-11165:
-

[~stack],  [~octo47] - on this compaction topic,  I also mentioned early on in 
the thread:

bq. I wonder if it makes sense to have google doc linked to this jira to save 
various proposals, findings and estimates? Like that summarizes current usage 
to be conservatively 3.5Gb in meta / 1M regions.

So it seems like we're using 3-3.5 KB per region row? That should be 
compressible, looking at the data in meta rows. Also I think it would help if 
we could post some numbers here and capture them in the documents, so we have a 
baseline for our work. For example:

 - how many kb in memory per-region in meta
 - how many hdfs inodes per region (depends on numbers of store files, but some 
estimate?)

To estimate, how big would a deployment be where meta doesn't fit in memory? 
How many RSs, how many petabytes of data?
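
A quick sketch of how such an estimate could be run from the one number we have. It assumes meta size grows linearly with region count, uses decimal units (1 GB = 10^9 bytes), and the 64 GB memory budget is purely hypothetical; function names are made up for illustration.

```python
# Back-of-envelope meta sizing.
# ASSUMPTIONS: linear growth with region count, decimal units,
# and a hypothetical memory budget (not a figure from this thread).

META_BYTES_AT_1M = 3_500_000_000   # ~3.5 GB of meta per 1M regions (conservative figure above)
BYTES_PER_REGION = META_BYTES_AT_1M // 1_000_000   # 3500 bytes, i.e. ~3.5 KB per region row


def meta_bytes(num_regions: int) -> int:
    """Estimated meta size in bytes for the given number of regions."""
    return num_regions * BYTES_PER_REGION


def max_regions_in_memory(budget_bytes: int) -> int:
    """Largest region count whose meta still fits in the given memory budget."""
    return budget_bytes // BYTES_PER_REGION


HYPOTHETICAL_BUDGET = 64_000_000_000   # 64 GB, illustration only

for regions in (1_000_000, 10_000_000, 50_000_000):
    print(f"{regions:>11,} regions -> {meta_bytes(regions) / 1e9:6.1f} GB of meta")

print(f"regions fitting in {HYPOTHETICAL_BUDGET / 1e9:.0f} GB: "
      f"{max_regions_in_memory(HYPOTHETICAL_BUDGET):,}")
```

Compression or a more compact row format would lower `BYTES_PER_REGION` and push the break-even point out accordingly.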



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120144#comment-14120144
 ] 

Andrey Stepachev commented on HBASE-11165:
--

bq.But how HDFS would handle that, as Mikhail Antonov mentioned above?
that should be a question



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120142#comment-14120142
 ] 

Andrey Stepachev commented on HBASE-11165:
--

bq.Yeah, we'll have to go this route if we are trying to keep state of a big 
cluster in heap. Could work on making the representation more compact. You 
arguing for single meta region Andrey Stepachev then? There is also the on-hdfs 
size to consider (write-amplification) and the r/w i/os.

For sure, compact representation doesn't imply a single meta. A compact meta 
lets us bother with split meta only for really big installations. But how 
HDFS would handle that, as [~mantonov] mentioned above.

As for compact META representations, we can use other techniques to reduce the 
HDFS impact of a big meta.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120086#comment-14120086
 ] 

stack commented on HBASE-11165:
---

bq.  I mean, seems tricky to make it be the same code path?

Yeah. One of the code paths will 'suffer' neglect.

[~mantonov] I like your "cold backup -> warm -> hot -> active-active". Let 
me try and put together a bit of a summary of findings and arguments so far.


bq. ...and reduce usage of memory

Yeah, we'll have to go this route if we are trying to keep state of a big 
cluster in heap.  Could work on making the representation more compact.  You 
arguing for single meta region [~octo47] then? There is also the on-hdfs size 
to consider (write-amplification) and the r/w i/os.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119832#comment-14119832
 ] 

Andrey Stepachev commented on HBASE-11165:
--

Just thinking, did anyone try to measure how meta uses memory and to reduce 
that memory usage? It is interesting that the NN is able to handle much more 
data in memory while HMaster can't.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119513#comment-14119513
 ] 

Mikhail Antonov commented on HBASE-11165:
-

A side question to folks who recently benchmarked it on big clusters: what's 
the average ratio of HDFS inodes per region you observed? I'm trying to 
estimate the load the proposed 1M or 50M region setup puts on the NN.
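
To frame the question, here is a back-of-envelope sketch in Python. The 
inodes-per-region ratios in it are placeholders, not measurements (the real 
number is exactly what is being asked for), and the function name is made up 
for illustration.

```python
def estimate_namenode_objects(regions, inodes_per_region):
    """Rough count of HDFS namespace objects attributable to HBase regions.

    inodes_per_region is an assumed average (region dir, column family dirs,
    hfiles, and so on); it has to come from a real cluster measurement.
    """
    if regions < 0 or inodes_per_region < 0:
        raise ValueError("inputs must be non-negative")
    return regions * inodes_per_region


# Placeholder ratios only; substitute the observed value once it is known.
for regions in (1_000_000, 50_000_000):
    for ratio in (5, 20):
        total = estimate_namenode_objects(regions, ratio)
        print(f"{regions:>11,} regions x {ratio:>2} inodes/region = {total:,}")
```

The estimate scales linearly, so whatever the measured ratio turns out to be, 
the 50M case puts 50 times the namespace load of the 1M case on the NN.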



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-29 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115646#comment-14115646
 ] 

Mikhail Antonov commented on HBASE-11165:
-

[~stack] regarding multi-master..yeah, I'd make sure that it doesn't complicate 
the deployment (and to reduce scope of changes, that would be gradually in the 
direction of cold backup -> warm -> hot -> active-active).



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-29 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115582#comment-14115582
 ] 

Mikhail Antonov commented on HBASE-11165:
-

Totally agree that for most clusters, 1 meta region should suffice, and that if 
there's one region, it's easier to bind it to master (no rpcs, less complicated 
coordination and failure scenarios). Though, for clusters where more than one 
meta region is required, those would need to be served from other machines 
over RPC, right? I mean, seems tricky to make it be the same code path?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-29 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115562#comment-14115562
 ] 

Jimmy Xiang commented on HBASE-11165:
-

One meta region should be good enough for most clusters. If there is just 
one meta region, it's better to put it together with the master to avoid 
possible complications.  Additional meta regions should be optional. It should 
be simple to add new meta regions. Ideally, the same code path should be used 
no matter how many meta regions there are.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-29 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115558#comment-14115558
 ] 

Mikhail Antonov commented on HBASE-11165:
-

Oops, sorry, instead of "For multi-masters (replicated master) the objection I 
believe is", I meant "..objective".



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115524#comment-14115524
 ] 

stack commented on HBASE-11165:
---

bq. ..by the way, could somebody estimate the distribution of reads vs writes 
to meta table, in terms of requests per second/networking traffic/disk access?

[~toffer] or [~virag] Do you have any numbers from your experiments?

bq. I believe there are 2 dimensions here, right?

There is a more immediate 3rd dimension/option and that is what we have now 
where we have a single master and if it fails, backup assumes its role.

For your dimension 1. (also known as "HBASE-10296 Replace ZK with a consensus 
lib(paxos,zab or raft) running within master processes to provide better master 
failover performance and state consistency"), it would be a nice-to-have but if 
we can, let's try and avoid having to have a HA master. As long as the 
master comes back inside some reasonable window and the cluster can keep on 
chugging while it's figuring out its recovery, this would be a simpler deploy 
than one that needs a HA master mini-cluster.  Multi-master would help scale 
reads but, as you note, doesn't help if there is one massive meta region and 
we want to cache it.

We'd go to partitioned masters, as I see it, if we can't make one master do.

bq. The viable combined solution ...

I suggest we look at single master with split meta served by regionservers 
first?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-28 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114920#comment-14114920
 ] 

Mikhail Antonov commented on HBASE-11165:
-

bq. That is the issue where client takes a quorum of masters instead of a 
quorum of zks, right? I was extrapolating that the endpoint of this issue is a 
quorum of masters where we could read from any (or at least some reads could be 
stale...) as another means of scaling out the read i/o when master and meta 
colocated.

Yes, that's this one. Yeah, it should help with scaling read I/O (which, I 
believe, is the vast majority of meta access calls? By the way, could somebody 
estimate the distribution of reads vs writes to meta table, in terms of 
requests per second/networking traffic/disk access? Are there metrics for that?)

bq. This issue seems to be arriving at single master to serve millions of regions
I believe there are 2 dimensions here, right?

 - For multi-masters (replicated master) the objection I believe is to 1) 
maintain up-to-date in-memory state on >1 master, thus avoiding startup cost 
for second master (that's actually why I solicited the estimates of how long 
does the full restart takes now on that big cluster) and 2) scale out reads by 
serving them out of copies located on different machines. But multi-masters 
does not solve the problem when meta doesn't fit in memory, or when writes 
need to scale better
 - partitioned masters, in turn, address this second aspect, fitting meta in 
memory and scaling writes.

Does that look like an accurate summary to you? The viable combined solution 
might be a multi-master setup, where masters serve split meta and are grouped 
by meta region replicas (e.g. HM1 and HM2 serve replicas of metaRegion1, 
metaRegion2 and metaRegion3, while HM3 and HM4 serve replicas of metaRegion4 
and metaRegion5). With masters being region servers now, might having really 
many masters in the cluster be just the right direction to go?
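
As a rough illustration of the grouping above, here is a minimal Python sketch 
of how a lookup could be routed when meta is range-partitioned and each meta 
region is served by a group of masters. Everything in it is an assumption for 
the sake of discussion: the class name, the split points and the host groups 
are hypothetical and not taken from any patch.

```python
import bisect


class SplitMetaLocator:
    """Hypothetical router: meta region i covers [start_keys[i], start_keys[i+1])."""

    def __init__(self, start_keys, hosts):
        if not start_keys or start_keys[0] != b"":
            raise ValueError("first meta region must start at the empty key")
        if list(start_keys) != sorted(start_keys):
            raise ValueError("start keys must be sorted")
        if len(start_keys) != len(hosts):
            raise ValueError("need one host group per meta region")
        self.start_keys = list(start_keys)
        self.hosts = list(hosts)

    def locate(self, region_key):
        # The owning meta region is the last one whose start key <= region_key.
        idx = bisect.bisect_right(self.start_keys, region_key) - 1
        return idx, self.hosts[idx]


# Made-up layout: three meta regions; the first two share one master group.
locator = SplitMetaLocator(
    [b"", b"g", b"p"],
    [("HM1", "HM2"), ("HM1", "HM2"), ("HM3", "HM4")],
)
print(locator.locate(b"apple"))
print(locator.locate(b"zebra"))
```

Any host in the returned group could serve a read, which is where the 
read scale-out would come from; writes would still need a single owner per 
meta region, and that part is not modelled here.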

bq. In fact I don't even think a client can connect to the cluster currently if 
master is down which makes a bit of a farce of the above notion and needs 
fixing.
That probably is an argument for multi-master layout too :)

bq. Let me look at HBASE-7767
That would be great. I also did a first pass reviewing the patch on the review 
board and am planning to get back to it for a closer look.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114894#comment-14114894
 ] 

stack commented on HBASE-11165:
---

bq. I thought that the primary requirement for splittable meta is not really 
read-write throughput, but rather the thinking that on large enough cluster 
with 1M+ regions, meta may not simply fit in master memory (or JVM would have 
hard time keeping process consuming that much memory up)? 

It'd be both [~mantonov] Scale reads because pieces of meta served by many 
servers and master can write in // so scale writes too (though not all ops will 
be //izable).

Yeah, adding features to meta comes up all the time (today it was adding region 
flush/compaction history, previously it was keeping hfiles per cf in meta, 
region replica info, and so on).

bq. I would think that if we want to keep regions as a unit of 
assignment/recovery, splittable meta is a must (so far didn't see an 
approach describing how to avoid it)

bq. HBASE-10070 is about region replicas 

I was thinking that we could have meta region replicas to scale out the read 
i/o using hbase-10070 (solving one of the objections to splitting meta 
arguments)

bq.  HBASE-11467 is about ZK-less client...

That is the issue where client takes a quorum of masters instead of a quorum of 
zks, right?  I was extrapolating that the endpoint of this issue is a quorum of 
masters where we could read from any (or at least some reads could be stale...) 
as another means of scaling out the read i/o when master and meta colocated.

bq. ...but it would work fine with current single active-many backup-masters 
schema as well.

Good to know [~mantonov]

bq. I'm thinking that multi-masters and partitioned-masters (if we go in these 
approaches) need to be discussed closely together and considering each other, 
otherwise it'd be really hard to merge them together later on.

Agree.  This issue seems to be arriving at single master to serve millions of 
regions. A quorum of masters or partitioning master responsibilities for sure 
should be discussed together but I don't think they are the solution to this 
issue's problem (maybe partitioned master, but a single-server solution seems 
simpler?)

bq. I'd be curious to hear more opinions/assessments on how bad is that when 
master is down, and what timeframe various people would consider as "generally 
ok", "kind of long, really want it to be faster" and "unacceptably long"?

Yes. Will write something up. In fact I don't even think a client can connect 
to the cluster currently if master is down which makes a bit of a farce of the 
above notion and needs fixing.

Let me look at HBASE-7767. I got burned by it today. It's annoying to say 
the least.

Yeah, that seems to be the conclusion that is beginning to prevail  here.










[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-28 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114633#comment-14114633
 ] 

Mikhail Antonov commented on HBASE-11165:
-

bq. Please pile on all with thoughts. We need to put stake in grounds soon for 
hbase 2.0 cluster topology.

2 humble cents from my side:

 - I thought that the primary requirement for splittable meta is not really 
read-write throughput, but rather the thinking that on large enough cluster 
with 1M+ regions, meta may not simply fit in master memory (or JVM would have 
hard time keeping process consuming that much memory up)? That might be 
worsened if we take any actions towards keeping more metadata in meta table. 
[~enis] I believe you brought up before possible other things we may want to 
add to meta table, which would inflate it in size? Couldn't find that 
jira/thread. I would think that if we want to keep regions as a unit of 
assignment/recovery, splittable meta is a must (so far I didn't see an 
approach describing how to avoid it).

bq. ..and until we have HBASE-10295 "Refactor the replication implementation to 
eliminate permanent zk node" and/or HBASE-11467 "New impl of Registry interface 
not using ZK + new RPCs on master protocol" (Maybe a later phase of HBASE-10070 
when followers can run closer in to the leader state would work here) or a new 
master layout where we partition meta across multiple master server.
Unless I've missed some recent developments, HBASE-10070 is about region 
replicas, while HBASE-11467 is about ZK-less client (the patch there is about 
to grow big enough to provide for zk-less client, it's absorbing other subtasks 
:) ). It may be worth reiterating that the zk-less client is a sort of 
prerequisite for, or component of, the multi-master approach we're working on 
now, but it would work fine with current single active-many backup-masters 
schema as well.
I'm thinking that multi-masters and partitioned-masters (if we go in these 
approaches) need to be discussed closely together and considering each other, 
otherwise it'd be really hard to merge them together later on.

Also  on this:
bq. A plus split meta has over colocated master and meta is that master 
currently can be down for some period of time and the cluster keeps working; no 
splits and no merges and if a machine crashes while master is down, data is 
offline till master comes back (needs more exercise). This is less the case 
when colocated master and meta.
I'd be curious to hear more opinions/assessments on how bad is that when master 
is down, and what timeframe various people would consider as "generally ok", 
"kind of long, really want it to be faster" and "unacceptably long"?

{quote}So far it seems to me the driving requirements are:
+ scale
+ high availability
+ stop using zookeeper completely/for persistence
{quote}
Yeah, I think those are exactly the points and they could be discussed 
together. Besides scale, HA here probably consists of 2 parts: HA for region 
replicas (read- and rw-), and improved HA for master. Improved master HA 
(multi-master) is being researched/worked on now.

On "stop using ZK completely" there are general changes here coming along (like 
see HBASE-7767, on stopping using ZK for keeping table state.. a patch from 
[~octo47] is there ready for reviews), and proposed changes on client side to 
make hbase client non-dependent on ZK (that's HBASE-11467 [~stack] mentioned 
above, and that's what would be complementary to multi-master work).



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-28 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114582#comment-14114582
 ] 

Mikhail Antonov commented on HBASE-11165:
-

[~toffer], [~virag] - do you guys have by chance some recent numbers (or maybe 
an estimate) on how long a full master failover takes on a cluster with 300k 
or 3M regions? I didn't find those in the recent doc, and am eager to see them.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-20 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105102#comment-14105102
 ] 

Mikhail Antonov commented on HBASE-11165:
-

bq. Do we need to split this conversation into what to do on master and what to 
do with 0.98?
That makes a lot of sense to me.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-20 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104809#comment-14104809
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
If the 0.98 is different to what folks want for 2.0, as per Andy lets split 
this issue.
{quote}
We plan to start working on splitting meta the week after next or maybe even 
next. If there's no clear conclusion to the approach we will likely bring back 
root for 0.98. If there's an agreed upon solution prior we'd be happy to 
work/collaborate to get it done. I'm hoping to have splittable meta stable soon 
so we can avoid having to do 2 backward incompatible rollouts. [~apurtell] hope 
you're okay with this.

{quote}
We need to put stake in grounds soon for hbase 2.0 cluster topology.
{quote}
So far it seems to me the driving requirements are:

+ scale
+ high availability
+ stop using zookeeper completely/for persistence(?)

There are a lot of unknowns; specific requirements may change. Let's pile on 
the ideas and have a roadmap, iteratively experimenting and adding features 
with clear gains.




[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-19 Thread Virag Kothari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102942#comment-14102942
 ] 

Virag Kothari commented on HBASE-11165:
---

bq. Is the shrink in time from 12 mins to 5mins on zk-less assign or 
forceSync=no? (or for both)?

Zk-less. Not yet checked on forceSync=no. Will do the comparison once we have 
proper fixes for those issues.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102764#comment-14102764
 ] 

stack commented on HBASE-11165:
---

bq. If split meta, then 1) Less write amplification (ie no large compactions) 
...

Good point. i.e. if we want to move to lots of small regions, it would be odd 
if there was an "except for meta" clause.

bq. Better W throughput.

If Master is only writer, we'd need to ensure we are writing in // (i.e. 
Virag's recent patches).

bq. 2) More disks, more R/W throughput.

Yes.

bq. More heap to fit meta...

More heap to cache meta, yes.

bq. ...We need to do experiments for 1 rack and 2 rack failure...

Agreed that in time of catastrophic part-failure, we'd need the better R/W 
throughput a split meta can give you.

Other pluses are we would treat meta like any other table. Negatives are we 
need our root back and startup is more complicated (but at least all inside 
single master in this case).

In 
https://docs.google.com/document/d/1xC-bCzAAKO59Xo3XN-Cl6p-5CM_4DMoR-WpnkmYZgpw/edit#
 I (and others) argue for colocated meta and master going forward looking at 
options. Let me freshen it with arguments made here.

Colocating meta and master has nice properties. The in-memory image of the 
cluster layout -- probably a severe sub-set of what is actually in meta -- 
would need to fit a single-server's RAM in either model.  When colocated, 
operations are faster, less prone-to-error when less RPC involved (We'd still 
be subject to 
http://writings.quilt.org/2014/05/12/distributed-systems-and-the-end-of-the-api/
 if persisting meta in hdfs as francis notes above).  A single machine hosting 
single meta would not be able to service a 50M region startup with hundreds or 
regionservers as well as a deploy with split meta.  It could. It'd just be 
slower. Colocated meta and master implies single meta forever and that single 
meta is served by one server only -- a 50M meta region would be an anomaly in 
the cluster being bigger than all the rest -- and until we have HBASE-10295 
"Refactor the replication implementation to eliminate permanent zk node" and/or 
HBASE-11467 "New impl of Registry interface not using ZK + new RPCs on master 
protocol" (Maybe a later phase of HBASE-10070 when followers can run closer in 
to the leader state would work here) or a new master layout where we partition 
meta across multiple master servers.

A plus split meta has over colocated master and meta is that master currently 
can be down for some period of time and the cluster keeps working, though with 
no splits and no merges, and if a machine crashes while master is down, its 
data is offline till master comes back (needs more exercise).  This is less 
the case when master and meta are colocated.

Please pile on, all, with thoughts. We need to put a stake in the ground soon 
for hbase 2.0 cluster topology.  Francis needs something in 0.98 timeframe.  If 
the 0.98 approach is different to what folks want for 2.0, as per Andy let's 
split this issue.

Thoughts-for-the-day:

+ HBase is supposed to be able to scale
+ Single meta came about because way back, we were too lazy to fix issues that 
arose when meta was split (at the time, we didn't need to scale as much).



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-19 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102555#comment-14102555
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
Can I have some pointers on how to read the above. Zk-less AM is better because 
you scan a table – you don't have to ls znodes? What is the 1M znodes vs 1M 
rows about in above?
{quote}
Essentially the APIs are better. E.g. with 1M rows we can iterate over the 
rows instead of doing an ls and getting back a huge chunk of data. Likewise, 
deleting 1M znodes takes too long, whereas deletes could be parallelized 
against an hbase table.
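
To make the API point concrete, here is a minimal Python sketch. It is a toy, not ZooKeeper or HBase client code: a plain dict stands in for the table, and `scan_in_batches` and `parallel_delete` are hypothetical helper names. It only illustrates that row-style access can be consumed in bounded batches and deleted by several workers at once, whereas an ls hands back every child in a single response.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def scan_in_batches(rows, batch_size):
    """Yield successive lists of at most batch_size rows from an iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def parallel_delete(store, keys, workers=4, batch_size=1000):
    """Delete keys from a dict-like store, one batch per worker task.

    Returns the number of keys handed to the workers.
    """
    def delete_batch(batch):
        for key in batch:
            store.pop(key, None)
        return len(batch)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(delete_batch, scan_in_batches(keys, batch_size)))
```

Passing a snapshot of the keys (for example `sorted(store)`) avoids iterating over the dict while workers are removing entries from it.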

For 2.a, response is below. For 2.b, it's mainly a concern whether we'll hit 
other ZK issues when having that many child znodes (1M and beyond). HDFS guys 
are already looking into scaling number of child directories for NN.

Will update doc.

{quote}
Francis Liu Is the above the basis for your "...As our experiments shows 
splitting is a must for scaling."? If split meta, then more read/write 
throughput? 
{quote}
If split meta, then:  1) Less write amplification (ie no large compactions), 
better W throughput. 2) More disks, more R/W throughput. 3) More heap to fit 
meta, better R throughput.

{quote}
Because the meta table could be served by many machines so field more 
reads/writes? The reads/writes are needed at starttime or during cluster 
lifetime in your judgement? Thanks.
{quote}
Yep needed for startup. We need to do experiments for 1 rack and 2 rack failure 
for cluster lifetime case. Though large compactions would creep up on you. So 
splitting would still be motivating for cluster lifetime IMHO.  




[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101827#comment-14101827
 ] 

stack commented on HBASE-11165:
---

On the nice attached doc:

{code}
Thoughts:
Advantage of zk less assignment
1) Better API's (ls on znode vs scan table)
2) Scalability (If META is split)
   a) Read/Write throughput
   b) Storage (1M child znodes vs 1M rows in META)
{code}

Can I have some pointers on how to read the above?  Zk-less AM is better 
because you scan a table -- you don't have to ls znodes?  What is the 1M znodes 
vs 1M rows about in above?

[~toffer] Is the above the basis for your "...As our experiments shows 
splitting is a must for scaling."?  If split meta, then more read/write 
throughput?  Because the meta table could be served by many machines so field 
more reads/writes?   The reads/writes are needed at starttime or during cluster 
lifetime in your judgement?  Thanks.

[~virag] Is the shrink in time from 12 mins to 5 mins due to zk-less assign or 
forceSync=no (or both)?

[~enis]
bq. Do we need to bring root back? Can't we simply host all of root in 
zookeeper? We expect the number of meta regions in the tens / hundreds case 
right?
 
Root access would be different to the access of any other table if we went this 
route. Everyone -- all clients, etc. -- would need to know how to do this new 
access type.  How would you do it in zk anyways?  Would it be a single znode 
with all of root in it? Root would have zk enforced limits. Might be ok for a 
small table that changes infrequently. Not sure about a znode with 1k rows in 
it (history, replicas? etc.). We'd have to test.
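
As a rough way to frame that test: ZooKeeper caps the data size of a single znode via `jute.maxbuffer`, which defaults to just under 1 MiB. The Python sketch below checks whether a given number of rows would fit in one znode. The per-row byte sizes are made-up assumptions; the real serialized size of a root row (with history, replicas, etc.) would need to be measured.

```python
# ZooKeeper's default jute.maxbuffer value (0xfffff bytes, just under 1 MiB).
ZK_DEFAULT_MAX_ZNODE_BYTES = 0xFFFFF


def fits_in_one_znode(num_rows, bytes_per_row, limit=ZK_DEFAULT_MAX_ZNODE_BYTES):
    """Return True if num_rows rows of bytes_per_row bytes fit under the limit."""
    return num_rows * bytes_per_row <= limit


# Assumed sizes, for illustration only.
print(fits_in_one_znode(1000, 500))   # 1k rows at an assumed 500 bytes each
print(fits_in_one_znode(1000, 2000))  # 1k rows at an assumed 2000 bytes each
```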

[~lhofhansl]
bq. We need a bootstrapping mechanism to find root anyway...

Yeah. There is zk now. Elsewhere, a quorum of masters has been proposed; you'd 
go to the master to figure where everything is.  That'd be a big change. 2.0.x.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-18 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101717#comment-14101717
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
Do we need to split this conversation into what to do on master and what to do 
with 0.98? We could for example file two separate subtasks that approach the 
meta scaling problem in different ways for the respective branches. They are 
divergent enough so that would be a good idea IMHO.
{quote}
It sounds to me like we've at least come to an agreement that splitting META is 
reasonable. Off-hand any implementation in trunk should be backportable to 
0.98. The parallel discussion was about multi-master and the merits of 
collocation, which can be a separate issue. Correct me if I'm wrong 
[~mbertozzi] and [~jxiang].  



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-18 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101710#comment-14101710
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
I agree with Matteo on this. One more benefit of having meta and master 
together is that meta/master recovery will be much simpler (I mean there won't 
be a scenario where the master is recovering while the meta regionserver may be 
down).
{quote}
Seemed to miss this. What is the other benefit?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-18 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101707#comment-14101707
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
meta is only modified by the master, and the master is mainly doing operations 
on meta, so if meta is down you have a master up which is not of much use. 
{quote}
Hmm, so in this context things are neither really better nor worse with either 
option?

{quote}
Also since all writes to meta are controlled by the master having meta anywhere 
else will not improve things since the master is the one processing requests 
and you also have to go to another server to perform the operation. 
{quote}
For writes you already have to go to a bunch of other servers (datanodes) to 
perform the write operation, and for reads the worst case is a remote dfs read. 
Also, as our experiments showed, that overhead is small (if any at all) and 
overshadowed by the gains you get from horizontal scalability through SoC. 

{quote}
if you are talking about the read side, I can understand why you want to split 
it, but isn't having multiple copies even as cache simpler and you'll get the 
same results?
{quote}
I'm talking about both reads and writes. Being able to split it means being 
able to have read/write throughput proportional to the number of machines 
hosting the table. It also means less write amplification (which is already an 
issue), as well as being able to scale horizontally so there is enough memory 
to hold meta in the block cache. 







[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-18 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101556#comment-14101556
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
Came here to say exactly that. Enis beat me to it. Can we please not bring back 
root? We need a bootstrapping mechanism to find root anyway, and if we can do 
that we should be able to bootstrap a few dozen meta regions from there. 
Currently we'd use ZK for that, but we don't have to.
{quote}

[~enis] My main intent is to be able to split meta to avoid having such a large 
region. The root approach is well understood and AFAIK has no real downsides? 
Having it in ZK is one way; as long as we can have around a thousand entries in 
there, that should be fine for our use case. I thought the trend was to move a 
lot of heavy lifting out of ZK, which sounds reasonable to me. 
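
For readers less familiar with the root approach, the two-level lookup can be sketched as below. This is a simplified Python illustration, not HBase code: `RangeIndex`, the region names and the server names are all hypothetical, row keys are plain strings, and the sketch assumes meta region boundaries line up with user region start keys. A small root index maps a row key to a meta region, and that meta region maps the row key to the server hosting the user region.

```python
import bisect


class RangeIndex:
    """Maps a key to the owner of the last range whose start key is <= key."""

    def __init__(self, entries):
        # entries: iterable of (start_key, owner) pairs
        self._entries = sorted(entries)
        self._starts = [start for start, _ in self._entries]

    def locate(self, key):
        i = bisect.bisect_right(self._starts, key) - 1
        if i < 0:
            raise KeyError(key)
        return self._entries[i][1]


# Hypothetical layout: root points at two meta regions, each of which
# points at the servers hosting user regions.
root = RangeIndex([("", "meta-0"), ("m", "meta-1")])
meta = {
    "meta-0": RangeIndex([("", "rs1"), ("f", "rs2")]),
    "meta-1": RangeIndex([("m", "rs3"), ("t", "rs1")]),
}


def locate_region_server(row_key):
    """Resolve a row key via root, then via the owning meta region."""
    return meta[root.locate(row_key)].locate(row_key)
```

With around a thousand meta regions the root index stays tiny, so it can be cached by clients wherever it ends up being stored.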

[~lhofhansl] where do you propose we put it? 



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100953#comment-14100953
 ] 

Andrew Purtell commented on HBASE-11165:


bq. I agree with Matteo on this. One more benefit to have meta and master 
together is the meta/master recovery will be much simpler

Do we need to split this conversation into what to do on master and what to do 
with 0.98? We could for example file two separate subtasks that approach the 
meta scaling problem in different ways for the respective branches. They are 
divergent enough so that would be a good idea IMHO.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-18 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100778#comment-14100778
 ] 

Jimmy Xiang commented on HBASE-11165:
-

I agree with Matteo on this. One more benefit of having meta and master 
together is that meta/master recovery will be much simpler (I mean there won't 
be a scenario where the master is recovering while the meta regionserver may be 
down).



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099529#comment-14099529
 ] 

Matteo Bertozzi commented on HBASE-11165:
-

{quote}I don't see why collocation is required to have the features you 
mentioned. IMHO all you need is the mutations/locking/etc to be centrally 
managed (ie all updates to meta are sent to the master), the meta itself can be 
anywhere. So I'd argue that central management is needed, collocation is not 
required. Please correct me if I'm missing something here.{quote}
Meta is only modified by the master, and the master is mainly doing operations 
on meta, so if meta is down you have a master up which is not of much use. 
Also, since all writes to meta are controlled by the master, having meta 
anywhere else will not improve things: the master is the one processing 
requests and you also have to go to another server to perform the operation.
If you are talking about the read side, I can understand why you want to split 
it, but isn't having multiple copies, even as a cache, simpler, and won't you 
get the same results?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099505#comment-14099505
 ] 

Lars Hofhansl commented on HBASE-11165:
---

bq. Do we need to bring root back? Can't we simply host all of root in 
zookeeper?

Came here to say exactly that. Enis beat me to it. Can we please not bring back 
root? We need a bootstrapping mechanism to find root anyway, and if we can do 
that we should be able to bootstrap a few dozen meta regions from there. 
Currently we'd use ZK for that, but we don't have to.



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099401#comment-14099401
 ] 

Enis Soztutar commented on HBASE-11165:
---

bq. That's great, as a first step will bring back root.
Do we need to bring root back? Can't we simply host all of root in zookeeper? 
We expect the number of meta regions in the tens / hundreds case right? 



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099287#comment-14099287
 ] 

Andrew Purtell commented on HBASE-11165:


bq. For trunk it should be fine to have it enabled by default or not even have 
the switch

In my opinion for trunk it's not necessary to have a switch. 



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099285#comment-14099285
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
Yes, but older clients must be able to run against any given latest 0.98.x 
server side version, so these changes should hide behind a default-disabled 
configuration toggle.
{quote}
Yep will do this for 0.98. For trunk it should be fine to have it enabled by 
default or not even have the switch?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099283#comment-14099283
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
I haven't read the full discussion, so sorry if I missed this piece.
Does splitting meta mean having multiple masters, each one handling its own 
meta and its own set of RS?
{quote}
It does not have to be. So far we can see there are gains to be made in 
scalability by splitting meta so that it is served by multiple RS. We will do 
the work to motivate multi-master and justify the complexity if need be.

{quote}
Otherwise we are back where we were before: the idea of having meta colocated 
with the master was to have operations like assignment, disable/enable, and 
create/delete interact with "local data" and avoid complexity in handling 
failure when interacting with other machines.
{quote}
I don't see why collocation is required to have the features you mentioned. 
IMHO all you need is the mutations/locking/etc to be centrally managed (ie all 
updates to meta are sent to the master), the meta itself can be anywhere. So 
I'd argue that central management is needed, collocation is not required. 
Please correct me if I'm missing something here.




[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099168#comment-14099168
 ] 

Andrew Purtell commented on HBASE-11165:


{quote}
bq. As long as lower scale deploys don't run into potential new issues with a 
split meta in 0.98, we could be good, i.e. disabled by default in 0.98, enabled 
by default in later versions.
That's great, as a first step will bring back root. And the next patch would 
do split meta. Does it sound reasonable?
{quote}
Yes, but older clients must be able to run against any given latest 0.98.x 
server side version, so these changes should hide behind a default-disabled 
configuration toggle.
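
A default-disabled toggle of the kind described above can be sketched as follows. This is a Python illustration only; the key name `example.split.meta.enabled` is hypothetical and is not an actual HBase configuration property. The point is that an absent or unrecognized value leaves the feature off, so a deployment that never sets the key keeps the old behavior.

```python
# Hypothetical key name, used only for this illustration.
SPLIT_META_KEY = "example.split.meta.enabled"


def split_meta_enabled(conf):
    """Return True only if the toggle is explicitly set to 'true'.

    conf is any dict-like mapping of configuration keys to values.
    A missing key, or any value other than 'true', means disabled.
    """
    value = conf.get(SPLIT_META_KEY, "false")
    return str(value).strip().lower() == "true"
```

On trunk, where no switch is wanted, the check would simply go away rather than flip its default.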



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099118#comment-14099118
 ] 

Matteo Bertozzi commented on HBASE-11165:
-

{quote}That's great, as a first step will bring back root. And the next patch 
would do split meta. Does it sound reasonable?{quote}
I haven't read the full discussion, so sorry if I missed this piece.
Does splitting meta mean having multiple masters, each one handling its own 
meta and its own set of RS?
Otherwise we are back where we were before: the idea of having meta colocated 
with the master was to have operations like assignment, disable/enable, and 
create/delete interact with "local data" and avoid complexity in handling 
failure when interacting with other machines.





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Virag Kothari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099107#comment-14099107
 ] 

Virag Kothari commented on HBASE-11165:
---

[~jxiang] [~mantonov] I had hackily fixed HBASE-11290, HBASE-11758 and 
HBASE-11759 and saw that bulk assignment time reduced to ~5 mins from 12 mins. 
Will be working on a proper fix for HBASE-11290 soon. Your thoughts would be 
helpful. 




[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099097#comment-14099097
 ] 

Francis Liu commented on HBASE-11165:
-

[~mantonov] Have yet to read replicated master jira as well. 



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099096#comment-14099096
 ] 

Francis Liu commented on HBASE-11165:
-

[~mantonov] I think we can split meta and later build on or extend it to
support a partitioned master if need be, though as Andy mentioned the master
part probably won't go into 0.98. We'll be doing more investigation to motivate
multi-master.

[~andrew.purt...@gmail.com] That's great. As a first step we will bring back
root, and the next patch would split meta. Does that sound reasonable?



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099005#comment-14099005
 ] 

Mikhail Antonov commented on HBASE-11165:
-

bq. That kind of change could not be backported
Right - that's why I brought this question up :)



[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-15 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098949#comment-14098949
 ] 

Andrew Purtell commented on HBASE-11165:


bq. do you think split meta implies partitioned masters, each responsible for 
its piece

That kind of change could not be backported


