>There are some new regions that are just a few KBytes! Why are they so
small? When does HBase decide to split? It started to split two
hours after the table was created.

When HBase splits a region, it doesn't actually split anything at the
disk/file level. It's just a metadata operation that creates new regions
containing reference files, which still point to the old HFiles. That is the
reason you find KB-sized regions. The data is only rewritten into the new
regions later, when they are compacted.
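
To make the point above concrete, here is a toy Python model of a split that
only writes small reference files into the daughter regions. This is not HBase
code: the class names, the function names, and the 64-byte reference size are
made-up assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HFile:
    name: str
    size_bytes: int


@dataclass(frozen=True)
class Reference:
    # Points at one half of a parent HFile; holds no row data itself.
    parent: HFile
    half: str               # "bottom" or "top"
    size_bytes: int = 64    # assumed size of the tiny marker file


def split_region(parent_files):
    """Return the files of the two daughter regions right after a split."""
    bottom = [Reference(f, "bottom") for f in parent_files]
    top = [Reference(f, "top") for f in parent_files]
    return bottom, top


def region_size(files):
    # What a du-style listing would report for the region directory.
    return sum(f.size_bytes for f in files)


parent = [HFile("hfile-1", 3 * 1024**3)]   # one 3 GB HFile in the parent
daughter_a, daughter_b = split_region(parent)
print(region_size(parent), region_size(daughter_a), region_size(daughter_b))
```

In this model the daughters stay tiny until something rewrites the referenced
data into files of their own.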

>I thought a major compaction happens just once a day and compacts many files
per region. The data is always the same here; I don't insert new data.

IIRC, minor compactions sometimes get promoted to major compactions based on
some criteria, but I'll leave that for others to answer!
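
As a rough illustration of one such criterion (a compaction that ends up
selecting every file in the store gets reported as major), here is a small
Python sketch. The function name is hypothetical, and this is a simplification,
not the actual HBase selection logic.

```python
def compaction_kind(store_files, selected_files):
    """Classify a compaction as "major" if it covers every file in the store."""
    if not selected_files:
        return "none"
    if set(selected_files) == set(store_files):
        return "major"
    return "minor"


# A store that holds a single file: any compaction covers all of its files.
print(compaction_kind(["f1"], ["f1"]))

# A store with three files where only two are picked.
print(compaction_kind(["f1", "f2", "f3"], ["f1", "f2"]))
```

With a single file in the store, any compaction necessarily covers all files,
which would be consistent with the "Completed major compaction of 1 file(s)"
line in the quoted log.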



On Tue, Apr 15, 2014 at 3:15 PM, Guillermo Ortiz <konstt2...@gmail.com> wrote:

> I have a table in HBase that is around 96 GB.
>
> I generated 4 regions of 30 GB. After some time, the table starts to split
> because the max size for a region is 1 GB (I just realized that; I'm going
> to change it or create more pre-splits).
>
> There are two things that I don't understand. How is it creating the splits?
> Right now I have 130 regions and growing. The problem is the size of the
> new regions:
>
> 1.7 M    /hbase/filters/4ddbc34a2242e44c03121ae4608788a2
> 1.6 G    /hbase/filters/548bdcec79cfe9a99fa57cb18f801be2
> 3.1 G    /hbase/filters/58b50df089bd9d4d1f079f53238e060d
> 2.5 M    /hbase/filters/5a0d6d5b3b8faf67889ac5f5c2947c4f
> 1.9 G    /hbase/filters/5b0a35b5735a473b7e804c4b045ce374
> 883.4 M  /hbase/filters/5b49c68e305b90d87b3c64a0eee60b8c
> 1.7 M    /hbase/filters/5d43fd7ea9808ab7d2f2134e80fbfae7
> 632.4 M  /hbase/filters/5f04c7cd450d144f88fb4c7cff0796a2
>
> There are some new regions that are just a few KBytes! Why are they so
> small? When does HBase decide to split? It started to split two
> hours after the table was created.
>
> Once I create the table and insert the data, I don't insert new data or
> modify it.
>
>
> Another interesting point is why there are major compactions:
> 2014-04-15 11:33:47,400 INFO org.apache.hadoop.hbase.regionserver.Store:
> Renaming compacted file at
>
> hdfs://m01.cluster:8020/hbase/filters/ef994715505054299ede8c48c600cea4/.tmp/df90c260cb4e4256a153dd178244f04c
> to
>
> hdfs://m01.cluster:8020/hbase/filters/ef994715505054299ede8c48c600cea4/d/df90c260cb4e4256a153dd178244f04c
> 2014-04-15 11:33:47,407 INFO
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader: Loaded ROWCOL
> (CompoundBloomFilter) metadata for df90c260cb4e4256a153dd178244f04c
> 2014-04-15 11:33:47,416 INFO org.apache.hadoop.hbase.regionserver.Store:*
> Completed major compaction of 1 file*(s) in d of
> filters,51,1397554175140.ef994715505054299ede8c48c600cea4. into
> df90c260cb4e4256a153dd178244f04c, size=789.1 M; total size for store is
> 789.1 M
> 2014-04-15 11:33:47,416 INFO
> org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest:
> completed compaction:
> regionName=filters,51,1397554175140.ef994715505054299ede8c48c600cea4.,
> storeName=d, fileCount=1, fileSize=1.5 G, priority=6, time=414761474510060;
> duration=7sec
>
> I thought a major compaction happens just once a day and compacts many
> files per region. The data is always the same here; I don't insert new data.
>
>
> I'm working with 0.94.6 CDH44. I'm going to change the size of the regions,
> but I would like to understand why these things happen.
>
> Thank you.
>



-- 
Bharath Vissapragada
<http://www.cloudera.com>
