On Fri, Nov 13, 2009 at 7:19 AM, Cedric McDougal <[email protected]> wrote:
> Hi,
>
> I'm using HBase for a project in which I have very few columns in each
> table with greatly varying lengths. For example, in one table I might
> have one column with 1 million rows of data and one column with 100. In
> other words, there will be a lot of null cells in each table.
>
> What I'm wondering is how these null cells are treated when the table is
> read into memory using the scan operation? I'm assuming they are read
> into a buffer, found to be null, then discarded, but I'm not really sure
> what is happening within the system during the scan. Will a large number
> of null cells noticeably slow down the scan or are they handled very
> quickly? Would it be too expensive to have a single table with a lot of
> nulls vs. having multiple tables with very few?

Nulls do not cost. There is no 'null' signifier stored per row to mark an
absence. If you have a table with a couple of columns where one column has
an entry in each of 1M rows and the other has only 10 entries across the
same 1M rows, only the ten values of the second column are stored.

St.Ack
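To make the point concrete, here is a toy sketch in Python (not actual HBase code; the structure and names are illustrative only) of the storage model described above: cells are stored only where they exist, so a sparse column adds nothing for its missing rows, and a scan never touches a "null" cell.

```python
# Toy model of sparse cell storage: a table is {row: {column: value}}.
# Absent cells simply are not present in the map, so they occupy no space.

def build_sparse_table(dense_rows, sparse_rows):
    """Hypothetical helper: column 'a' has a value in every row,
    column 'b' has values in only the first few rows."""
    table = {}
    for r in range(dense_rows):
        table.setdefault(r, {})["a"] = "value-a-%d" % r
    for r in range(sparse_rows):
        table[r]["b"] = "value-b-%d" % r
    return table

def scan(table, column):
    """A scan yields only stored cells; there are no null cells to
    read, inspect, and discard."""
    for row in sorted(table):
        if column in table[row]:
            yield row, table[row][column]

table = build_sparse_table(dense_rows=1000, sparse_rows=10)
cells_stored = sum(len(cols) for cols in table.values())
print(cells_stored)                 # 1010 cells, not 2000: nulls cost nothing
print(len(list(scan(table, "b"))))  # 10 results when scanning column 'b'
```

Under this model, whether the sparse column lives in the same table or a separate one makes no difference to storage: only the ten actual values exist either way.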
