[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-22 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937551#comment-15937551
 ] 

Anoop Sam John commented on HBASE-17739:


Agreed.
BTW, by default we won't keep compressed blocks in the cache; we keep only uncompressed data. When blocks are DBE'd (data block encoded), though, we keep them in encoded form, and for data like time series the encoded size will be much smaller.

> BucketCache is inefficient/wasteful/dumb in its bucket allocations
> --
>
> Key: HBASE-17739
> URL: https://issues.apache.org/jira/browse/HBASE-17739
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>
> By default we allocate 14 buckets with sizes from 5K to 513K. If lots of heap is 
> given over to the bucketcache and, say, no allocations are made for a particular 
> bucket size, this means we have a bunch of the bucketcache that just goes 
> idle/unused.
> For example, say the heap is 100G. We'll divide it up among the sizes. If we 
> only ever do 5k records, then most of the cache will go unused while the 
> allocation for 5k objects sees churn.
> Here is an old note of [~anoop.hbase]'s from a conversation on bucket cache 
> we had offlist that describes the issue:
> "By default we have those 14 buckets with a size range of 5K to 513K.
> All sizes will have one bucket (with size 513*4) each except the
> last size, i.e. there will be many 513K-sized buckets. If we keep on
> writing only same-sized blocks, we may lose all the in-between sized buckets.
> Say we write only 4K-sized blocks. We will first fill the bucket of the 5K
> size. There is only one such bucket. Once this is filled, we will try
> to grab a complete free bucket from the other sizes. But we cannot take
> it from the 9K ... 385K sizes, as there is only ONE bucket for each of these
> sizes. We will take only from the 513 size; there are many of those.
> So we will eventually take all the buckets from 513 except the last
> one, since it has to keep at least one of every size. So we will
> lose that much space; those buckets are of no use."
> We should set the size type on the fly as the records come in.
> Or better, we should choose the record size on the fly. Here is another comment 
> from [~anoop.hbase]:
> "The second is the biggest contributor. Suppose instead of 4K-sized
> blocks, the user has 2K-sized blocks. When we write a block to a
> bucket slot, we reserve size equal to the allocated size for that block.
> So when we write 2K-sized blocks (maybe the actual size is a bit more than
> 2K) we will take 5K with each of the blocks. So you can see that we are
> losing ~3K with every block. That means we are losing more than half."
> He goes on: "If I am 100% sure that all my tables have a 2K HFile block size, I 
> need to give this config a value of 3 * 1024 (if I give exactly 2K there may
> again be a problem! That is another story; we need to see how we can give
> more of a guarantee for the block size restriction, HBASE-15248). So here also we 
> lose ~1K for every 2K, so something like a 30% loss!!! :-("
> So, we should figure out the record sizes ourselves on the fly.
> Anything less has us wasting loads of cache space, never mind the inefficiencies 
> incurred by how we serialize base types to cache.





[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-22 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936970#comment-15936970
 ] 

Vladimir Rodionov commented on HBASE-17739:
---

Do not forget the 8-byte alignment padding for small data types. A reference is 4 bytes only if 
the heap is less than 28 GB (or 32?), and the effective block size is much smaller if you consider 
compression, especially for time-series data, where the value is much smaller than the row key. 
Again, the overhead depends on the application data, compression and block size, and can easily 
be at least 1% of the cache size.



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-22 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936813#comment-15936813
 ] 

Anoop Sam John commented on HBASE-17739:


Thanks. I missed some parts, like the object header size as such and the CSLM entry size.
My math comes out this way:
BlockCacheKey
---
String hfileName  -  Ref  -  4
long offset  -  8
BlockType blockType  -  Ref  -  4
boolean isPrimaryReplicaBlock  -  1
Total  =  12 (Object) + 17  =  29

BucketEntry
---
int offsetBase  -  4
int length  -  4
byte offset1  -  1
byte deserialiserIndex  -  1
long accessCounter  -  8
BlockPriority priority  -  Ref  -  4
volatile boolean markedForEvict  -  1
AtomicInteger refCount  -  16 + 4
long cachedTime  -  8
Total  =  12 (Object) + 51  =  63

ConcurrentHashMap Map.Entry  -  40
blocksByHFile ConcurrentSkipListSet Entry  -  40

Total  =  29 + 63 + 80  =  172

Maybe you are considering 8 bytes as the reference size; I am following what ClassSize 
returns. Please refer to UnsafeLayout.

Yes, I agree it is big. For 10 million entries we will end up with about 1.6 GB of heap. 
(Considering a 64 KB block size, 10 million blocks means roughly 600 GB of cache.)
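
To make the arithmetic above easy to replay, here is a minimal back-of-envelope sketch (illustrative only, not HBase code; the per-field constants are simply the estimates from this comment, assuming compressed 4-byte references and a 12-byte object header as ClassSize reports):

{code:java}
// Back-of-envelope estimate of BucketCache metadata heap usage per cached block.
// All constants are the per-field estimates from the comment above (compressed
// oops, 12-byte object header); they are illustrative, not authoritative.
public class BucketCacheMetaOverhead {

  static final long OBJECT_HEADER = 12;

  // BlockCacheKey: String ref (4) + long offset (8) + BlockType ref (4) + boolean (1)
  static final long BLOCK_CACHE_KEY = OBJECT_HEADER + 4 + 8 + 4 + 1;                     // 29

  // BucketEntry: 2 ints, 2 bytes, 2 longs, 1 ref, 1 boolean, AtomicInteger (16 + ref 4)
  static final long BUCKET_ENTRY =
      OBJECT_HEADER + 4 + 4 + 1 + 1 + 8 + 4 + 1 + (16 + 4) + 8;                          // 63

  static final long CHM_ENTRY = 40;    // backingMap ConcurrentHashMap Map.Entry
  static final long CSLM_ENTRY = 40;   // blocksByHFile ConcurrentSkipListSet entry

  static final long PER_BLOCK = BLOCK_CACHE_KEY + BUCKET_ENTRY + CHM_ENTRY + CSLM_ENTRY; // 172

  public static void main(String[] args) {
    long blocks = 10_000_000L;       // 10M cached blocks
    long blockSize = 64 * 1024L;     // 64 KB of data per block
    double heapGb = blocks * PER_BLOCK / (1024.0 * 1024 * 1024);
    double cacheGb = blocks * blockSize / (1024.0 * 1024 * 1024);
    // prints: per-block meta 172 bytes, ~1.6 GB of heap for ~610 GB of cached data
    System.out.printf("per-block meta %d bytes, %.1f GB heap for %.0f GB of cache%n",
        PER_BLOCK, heapGb, cacheGb);
  }
}
{code}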

I went through this some time back as well, looking at how far the heap grows when the 
BucketCache gets really big, but some of my calculations were off and the number did not 
come out this large.
Let me raise an issue. I can see some possibilities to reduce this, and any reduction 
here will definitely help.

Thanks Vladimir for raising the concern.



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-21 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935224#comment-15935224
 ] 

Vladimir Rodionov commented on HBASE-17739:
---

{quote}
What was ur math V?
{quote}

BucketEntry is actually 80 bytes and BucketCacheKey 48. Besides that, you need to add the 
ConcurrentHashMap Map.Entry overhead for every block in the backing map, which is 48 bytes more =>

For the backingMap alone we have 168 bytes of overhead.

blocksByHFile, which is a ConcurrentSkipListSet, adds another 48 bytes (for its Map.Entry).

That is 216 bytes in total already. It is not the 500 I posted above (I miscalculated some 
overhead in IdReadWriteLock), but it is nevertheless quite substantial.

 



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934137#comment-15934137
 ] 

ramkrishna.s.vasudevan commented on HBASE-17739:


bq.The implications are two-fold: decreased SSD life span due to huge write 
amplification and decreased write performance when SSD is close to full
I can check on this in more detail. I am currently doing some work related to SSD 
performance and will get back on it.



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-20 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934111#comment-15934111
 ] 

Anoop Sam John commented on HBASE-17739:


bq.Heap overhead is close to 500 bytes per block. 
You mean the backing map entry per block, right?
As I see it, BucketEntry has an overhead of 47 bytes per object and BucketCacheKey 17 bytes 
per object, so the per-entry size is below 100 bytes? What was your math, Vladimir?



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-20 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933266#comment-15933266
 ] 

Vladimir Rodionov commented on HBASE-17739:
---

BucketCache has two major design/implementation issues:

# It keeps all metadata (block locations and the block list per file) on the Java heap.
# It is not write friendly for SSDs: all write operations are random. The 
implications are two-fold: decreased SSD life span due to huge write 
amplification, and decreased write performance when the SSD is close to full.

Both are very serious problems that prevent BucketCache from going big.

The heap overhead is close to 500 bytes per block. Do the math: 10M blocks require 5 GB. 
With a compressed block size of 20 KB that translates to a 200 GB cache, so for a 1 TB 
cache you will need 25 GB of Java heap just to keep block metadata.


  



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-15 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15926136#comment-15926136
 ] 

Anoop Sam John commented on HBASE-17739:


Yes. Once we use up the 64 KB bucket slots, we will convert the 512 KB buckets into 64 KB 
ones (the bucket will then have more slots, each of a smaller size) and use those; the #2 
case is described with this behaviour in mind. The point is that when we have blocks of, 
say, 2.1 KB and the slot size is 4 KB, each block we add wastes almost 50% of the slot. 
The HP use case that Stack originally described also involved a small block size. The 
impact of #2 is bigger when the actual block size itself is bigger. You can see that for 
every block size bucket we give some extra room to accommodate the block header and so on, 
and the block size boundary is not a strict one. The brainstorming point here is whether 
there is any way we can avoid or reduce this. Hope the intent is clear now.
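
To picture the conversion, here is a tiny illustrative sketch (not BucketAllocator code; the 513*4 bucket capacity is taken from the issue description, and the 64+1 KB slot size follows the "extra room for the block header" convention mentioned above, so both are assumptions):

{code:java}
// Illustrative only: when a whole free bucket is re-assigned to a smaller size
// class, the same capacity simply holds more, smaller slots.
public class BucketReassignment {

  static void describe(long bucketCapacity, long slotSize) {
    long slots = bucketCapacity / slotSize;       // how many slots fit in the bucket
    long leftover = bucketCapacity % slotSize;    // capacity that cannot form a full slot
    System.out.printf("capacity %d KB as %d KB slots -> %d slots, %d KB unused%n",
        bucketCapacity / 1024, slotSize / 1024, slots, leftover / 1024);
  }

  public static void main(String[] args) {
    long capacity = 4 * 513 * 1024L;   // one bucket: 4 x largest item size (per description)
    describe(capacity, 513 * 1024L);   // as 513 KB slots -> 4 slots
    describe(capacity, 65 * 1024L);    // re-assigned to the (64+1) KB class -> 31 slots
  }
}
{code}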



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-15 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925985#comment-15925985
 ] 

ramkrishna.s.vasudevan commented on HBASE-17739:


After reading the code, I think for #1, yes, the in-between buckets are not used efficiently 
when we constantly get 64K blocks, because all the intermediate bucket sizes up to 513K have 
only one bucket each, so we will not use them.
But once the 513K buckets start getting used up, I think we won't waste them; instead we 
will convert those 513K buckets into 64K-sized buckets, right?
So for #2, once we start getting blocks of some average size X KB, I think the cache will 
from then on grow in a fairly uniform way and the wastage should get normalized. In fact, 
on the freeing side as well: say all the 513K buckets have been converted into 64K buckets, 
we will keep freeing those blocks and adding them back to the 64K bucket count only?
I may be wrong; please do correct me if I am missing something here.
I think this problem is more prominent when our block sizes keep varying overall, and I 
think that is why, for encoded blocks, they try to come up with a standardisation so that 
the size ratio is maintained across blocks even after encoding.




[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-13 Thread Janos Gub (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15922507#comment-15922507
 ] 

Janos Gub commented on HBASE-17739:
---

If there is no bucket of that size, we will end up evicting a bucket and initializing one 
of the correct size, right? (That is how we avoid ending up with almost only 513K-sized 
buckets in the current implementation.)



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-13 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907743#comment-15907743
 ] 

Anoop Sam John commented on HBASE-17739:


One bucket size (say 8 KB) is there as per the config, and we don't know when there will 
be a need to cache a block of this size or near it. If there is not even one bucket of 
this size, we will end up with no well-fitting bucket to store that block. Say the block 
is 7.5 KB and the next size after the 8 KB bucket is 12 KB: the selection algorithm then 
has to pick the 12 KB-sized bucket and the wastage will be larger. Maybe we will need more 
intelligence in the system so that it can auto-tune these sizes. That is also an aim of 
this jira, if I am not wrong. Let's take it to the next level.
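
Roughly what that fallback looks like, as a minimal sketch (the size ladder and availability flags here are hypothetical, chosen to mirror the 7.5 KB / 8 KB / 12 KB example above; this is not the real BucketAllocator logic):

{code:java}
import java.util.Arrays;

// Illustrative only: pick the smallest available size class that fits a block
// and report the wasted share of the chosen slot.
public class SlotSelectionWaste {

  static long pickSlot(long blockSize, long[] ladder, boolean[] available) {
    for (int i = 0; i < ladder.length; i++) {
      if (available[i] && ladder[i] >= blockSize) {
        return ladder[i];      // smallest available class that fits
      }
    }
    return -1;                 // nothing fits: the caller would have to evict/reassign
  }

  public static void main(String[] args) {
    long[] ladder = {4 * 1024, 8 * 1024, 12 * 1024, 16 * 1024};   // hypothetical sizes
    boolean[] allFree = {true, true, true, true};
    boolean[] no8k = {true, false, true, true};                   // 8 KB class exhausted

    long block = 7_680;                                           // a 7.5 KB block
    for (boolean[] avail : Arrays.asList(allFree, no8k)) {
      long slot = pickSlot(block, ladder, avail);
      double waste = 1.0 - (double) block / slot;
      // 8 KB slot -> ~6% wasted; falling through to 12 KB -> ~38% wasted
      System.out.printf("slot %d KB -> %.0f%% of the slot wasted%n",
          slot / 1024, waste * 100);
    }
  }
}
{code}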



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-13 Thread Janos Gub (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907736#comment-15907736
 ] 

Janos Gub commented on HBASE-17739:
---

Why is it needed to hold at least one bucket for each size? 



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-13 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907717#comment-15907717
 ] 

Anoop Sam John commented on HBASE-17739:


It might help with issue #2 as mentioned in the description, but would it badly affect 
issue #1? We would then have more bucket sizes, and every size is supposed to keep at 
least one bucket, so there is a chance of many sizes going to waste. Otherwise the sizes 
would have to be very well tuned and selected.



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-13 Thread Janos Gub (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907611#comment-15907611
 ] 

Janos Gub commented on HBASE-17739:
---

What if, instead of the fixed number of buckets, we set up a bucket size distribution 
between 1K and the max size in a way that gives a controlled cache loss? E.g., if we want 
to ensure we lose at most 10% of our cache, we would have a 1K bucket, then 1.1K, 1.21K, 
etc. This way we would have more size types (for 513K we would have 66), and we could 
forget the rule that there must be at least one bucket for each size.
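
A minimal sketch of that idea (illustrative only, not existing HBase code), using the 1.1 growth factor from the comment above so the per-block loss stays a bit under 10%:

{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the proposal above: build bucket size classes as a
// geometric series so that the per-block loss is bounded (a 1.1 ratio caps the
// waste at 1 - 1/1.1, i.e. a bit under 10%).
public class GeometricBucketSizes {

  static List<Long> buildSizes(long minSize, long maxSize, double ratio) {
    List<Long> sizes = new ArrayList<>();
    double s = minSize;
    while (s < maxSize) {
      sizes.add((long) Math.ceil(s));
      s *= ratio;
    }
    sizes.add(maxSize);   // always include the largest class so the biggest blocks fit
    return sizes;
  }

  public static void main(String[] args) {
    List<Long> sizes = buildSizes(1024L, 513L * 1024, 1.1);
    // prints roughly the 66-odd size classes between 1K and 513K mentioned above
    System.out.println(sizes.size() + " size classes: " + sizes);
  }
}
{code}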



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-07 Thread Allan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900828#comment-15900828
 ] 

Allan Yang commented on HBASE-17739:


Here is the issue:
https://issues.apache.org/jira/browse/HBASE-17757



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-07 Thread Allan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900589#comment-15900589
 ] 

Allan Yang commented on HBASE-17739:


Right away, Sir! [~zjushch] [~stack]



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-07 Thread chunhui shen (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900583#comment-15900583 ]

chunhui shen commented on HBASE-17739:
--

It's a feature completed by my teammate [~allan163]; he will open a new issue
to talk about it.
Thanks, sir.



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-06 Thread stack (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898786#comment-15898786 ]

stack commented on HBASE-17739:
---

bq. How about making the encoded blocks have similar size?

Tell us more [~zjushch]?

(Thanks for all the help on bucket cache)



[jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

2017-03-06 Thread chunhui shen (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898603#comment-15898603 ]

chunhui shen commented on HBASE-17739:
--

How about making the encoded blocks have a similar size?
In fact, we did this to improve the utilization ratio of the bucket cache space.
If anyone is interested or agrees, I could open a new issue and upload the patch.
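As a rough, hypothetical sketch of that idea (not the actual patch): have the
writer finish a data block once its *encoded* size reaches the target block
size, so the encoded blocks that end up in the bucket cache all cluster around
one bucket size. All names below are made up for illustration.

{code:java}
// Hypothetical sketch: size blocks by their encoded length so that cached
// encoded blocks are close to uniform in size. Not the actual patch.
final class EncodedSizeLimiter {
  private final int targetEncodedSize;  // e.g. the configured HFile block size
  private int encodedBytesSoFar;

  EncodedSizeLimiter(int targetEncodedSize) {
    this.targetEncodedSize = targetEncodedSize;
  }

  // Call after each cell has been appended through the data block encoder.
  void onCellEncoded(int encodedCellLength) {
    encodedBytesSoFar += encodedCellLength;
  }

  // The writer would consult this (instead of the unencoded size) when
  // deciding whether to close the current block.
  boolean shouldFinishBlock() {
    return encodedBytesSoFar >= targetEncodedSize;
  }

  void resetForNewBlock() {
    encodedBytesSoFar = 0;
  }
}
{code}

Blocks sized this way should mostly land in a single bucket size instead of
being spread across the 14 defaults.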
