bq. HBase doesn't do well with more than 2-3 column families

The above is out of date: HBase now has per-column-family flush, which
reduces the number of small HFiles.
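
To make the small-file point concrete, here is a toy model in plain Java. It
is not HBase code and calls no HBase API. The class FlushSketch, its two
methods, the family names, the memstore sizes and the 16 MB lower bound are
all made up for illustration. The fallback to flushing everything when no
family exceeds the bound is likewise an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: plain Java, no HBase API calls. All names are hypothetical.
class FlushSketch {

    // Region-wide flush: every family with buffered data writes one file.
    static List<String> flushAll(Map<String, Long> memstoreBytes) {
        List<String> flushed = new ArrayList<>();
        for (Map.Entry<String, Long> e : memstoreBytes.entrySet()) {
            if (e.getValue() > 0) {
                flushed.add(e.getKey());
            }
        }
        return flushed;
    }

    // Per-family flush: only families buffering more than lowerBound bytes
    // write a file. Assumption of this sketch: if no family qualifies,
    // fall back to flushing all of them so memory is still released.
    static List<String> flushLargeOnly(Map<String, Long> memstoreBytes, long lowerBound) {
        List<String> flushed = new ArrayList<>();
        for (Map.Entry<String, Long> e : memstoreBytes.entrySet()) {
            if (e.getValue() > lowerBound) {
                flushed.add(e.getKey());
            }
        }
        return flushed.isEmpty() ? flushAll(memstoreBytes) : flushed;
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        // One busy family and two nearly idle ones (made-up sizes).
        Map<String, Long> memstoreBytes = new LinkedHashMap<>();
        memstoreBytes.put("d", 120 * MB);
        memstoreBytes.put("m", 2 * MB);
        memstoreBytes.put("t", 1 * MB);

        System.out.println("region-wide flush writes: " + flushAll(memstoreBytes));
        System.out.println("per-family flush writes:  " + flushLargeOnly(memstoreBytes, 16 * MB));
    }
}
```

With these sample numbers the region-wide flush writes three files, two of
them tiny, while the per-family flush writes a single file for "d".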

bq. Why can't we just create several tables instead?

Currently HBase doesn't provide transactions across region boundaries. This
means that with more than one table, the burden is on application code to
achieve transactional behavior where needed.
Also, since the multiple tables tend to have the same row key design, as you
mentioned, region servers would carry more regions, increasing the load on
the assignment manager, balancer, etc.

Cheers

On Thu, Jun 22, 2017 at 5:44 AM, Alexander Ilyin <alexan...@weborama.com>
wrote:

> Hi,
>
> A general question regarding column families. It is said in the doc that
> HBase doesn't do well with more than 2-3 column families because flushing
> and compactions are done on a per region basis which should be addressed in
> the future: http://hbase.apache.org/book.html#number.of.cfs
>
> Is it still the case in new versions of HBase or there were some
> improvements on this?
>
> I also don't understand why using several column families might be useful
> even if data access is column scoped. Why can't we just create several
> tables instead? Row key is stored with every cell anyway and it's possible
> to filter by column when querying.
>
> In general, I don't see when it might make sense to have more than one
> column family in a table with current limitations.
>
> Thanks in advance.
>
