RE: HBase number of columns

2016-06-21 Thread Siddharth Ubale
Thanks Saad!



Re: HBase number of columns

2016-06-16 Thread Saad Mufti
HBase has no real column schema beyond the column family definition. Each
write to a column stores a cell containing the column name plus the value,
so in theory the number of columns doesn't really matter. What matters is
how much data you read and write.
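
To put rough numbers on that, here is a small Python sketch that estimates
the unencoded size of a cell. It assumes the usual KeyValue layout as I
understand it (4-byte key length, 4-byte value length, 2-byte row length,
1-byte family length, 8-byte timestamp, 1-byte type) and ignores tags, block
encoding and compression. The row key, family, and column names are made up
for illustration.

```python
# Approximate size of one unencoded HBase cell (KeyValue).
# Assumed layout: key length (4) + value length (4) + row length (2)
# + family length (1) + timestamp (8) + key type (1) = 20 fixed bytes,
# plus the row key, family, qualifier and value bytes themselves.
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def cell_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Bytes taken by a single cell before any encoding or compression."""
    return FIXED_OVERHEAD + len(row) + len(family) + len(qualifier) + len(value)


def row_size(row: bytes, family: bytes, columns: dict) -> int:
    """Total bytes for one row; every cell repeats the row key and family."""
    return sum(cell_size(row, family, q, v) for q, v in columns.items())


# Hypothetical rows holding the same 1000 bytes of payload:
# 100 small columns versus a single large column.
wide = {f"col{i:03d}".encode(): b"x" * 10 for i in range(100)}
narrow = {b"blob": b"x" * 1000}

wide_bytes = row_size(b"user-0001", b"d", wide)
narrow_bytes = row_size(b"user-0001", b"d", narrow)
print("100 columns x 10 bytes:", wide_bytes)
print("1 column x 1000 bytes: ", narrow_bytes)
```

The column count itself is not limited anywhere, but each extra cell repeats
the row key, family and qualifier, so short names and fewer, larger cells
mean less data to read and write.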

That said, the column family schema has a DATA_BLOCK_ENCODING setting that
affects how much space each column/cell actually takes. FAST_DIFF is a
decent choice for limiting the redundancy of writing the same column name
over and over again when lots of rows have the same column name. There are
also compression settings, of course.
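
As a rough illustration of why an encoding helps, here is a toy prefix
encoder in Python. It is not FAST_DIFF's actual algorithm, only the general
idea that neighbouring sorted keys share bytes which need not be stored
again. The flattened "row/family:qualifier" key format and the 1-byte
prefix length are assumptions made for the example.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_encode(keys):
    """Encode sorted keys as (shared_prefix_length, remaining_suffix) pairs."""
    encoded, prev = [], b""
    for key in keys:
        n = common_prefix_len(prev, key)
        encoded.append((n, key[n:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the original keys from the (length, suffix) pairs."""
    keys, prev = [], b""
    for n, suffix in encoded:
        key = prev[:n] + suffix
        keys.append(key)
        prev = key
    return keys


# Hypothetical keys: three rows, each with the same two column names.
keys = sorted(
    f"user-{r:04d}/d:{q}".encode() for r in range(3) for q in ("email", "name")
)
encoded = prefix_encode(keys)

raw_bytes = sum(len(k) for k in keys)
# Assume one byte to store each shared-prefix length.
encoded_bytes = sum(1 + len(suffix) for _, suffix in encoded)
print("raw key bytes:    ", raw_bytes)
print("encoded key bytes:", encoded_bytes)
```

The saving grows with the number of cells per row and with longer row keys,
which is where an encoding matters most for wide rows.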

Hope that helps.


Saad




HBase number of columns

2016-06-15 Thread Siddharth Ubale
Hi,

The official HBase documentation says that a typical schema should contain 
1 to 3 column families per table 
(https://hbase.apache.org/book.html#table_schema_rules_of_thumb).
However, it does not mention how many column qualifiers a row should contain 
in each column family for good read and write performance.
Could anybody share their input on how many columns per row, or how many 
column qualifiers per column family, are desirable in HBase?
Thanks,
Siddharth Ubale,