Sorry, 1000 columns, each ~2K, so each row is ~2MB. I guess HBase will keep 
a single KV (i.e., a column rather than a whole row) in a block, so a row 
will span multiple blocks?

My scan pattern is: I will do a range scan, find the matching row keys, and 
fetch the whole row for each row that matches my criteria.
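
A rough sketch of what I mean (the table name "mytable", family "cf", and 
the key range are just placeholders, and it assumes the older HTable-based 
Java client API):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class WideRowScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");
        try {
            // Range scan over row keys; each matching row comes back in full.
            Scan scan = new Scan(Bytes.toBytes("startKey"), Bytes.toBytes("stopKey"));
            scan.addFamily(Bytes.toBytes("cf"));
            ResultScanner scanner = table.getScanner(scan);
            try {
                for (Result row : scanner) {
                    // row holds the whole ~2MB row: ~1000 KVs of ~2KB each.
                    System.out.println(Bytes.toString(row.getRow())
                            + " -> " + row.size() + " cells");
                }
            } finally {
                scanner.close();
            }
        } finally {
            table.close();
        }
    }
}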

Best regards,
Wei

---------------------------------
Wei Tan, PhD
Research Staff Member
IBM T. J. Watson Research Center
http://researcher.ibm.com/person/us-wtan



From:   lars hofhansl <la...@apache.org>
To:     "user@hbase.apache.org" <user@hbase.apache.org>, 
Date:   01/29/2014 03:49 PM
Subject:        Re: larger HFile block size for very wide row?



You have 1000 columns? Not 1000k = 1M columns, I assume.
So you'll have 2MB KVs. That's a bit on the large side.

HBase will "grow" the block to fit the KV into it. It means you have 
basically one block per KV.
I guess you address these rows via point gets (GET), and do not typically 
scan through them, right?
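
(By "point gets" I mean something like the sketch below -- table, family, 
and row key are made up, and it assumes the older HTable-based client:)

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class WideRowGet {
    public static void main(String[] args) throws Exception {
        HTable table = new HTable(HBaseConfiguration.create(), "mytable");
        Get get = new Get(Bytes.toBytes("someRowKey"));
        get.addFamily(Bytes.toBytes("cf"));   // all columns of the family
        Result row = table.get(get);          // one row fetched by exact key
        System.out.println(row.size() + " cells");
        table.close();
    }
}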

Do you see any performance issues?

-- Lars



________________________________
 From: Wei Tan <w...@us.ibm.com>
To: user@hbase.apache.org 
Sent: Wednesday, January 29, 2014 12:35 PM
Subject: larger HFile block size for very wide row?
 

Hi, I have an HBase table where each row has ~1000k columns, ~2K each. My 
table scan pattern is to use a row key filter, but I need to fetch the 
whole row (~1000k columns) back.

Shall I set HFile block size to be larger than the default 64K?
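
Something like the following is what I have in mind (block size is a 
per-column-family attribute; the table/family names and the 256K value 
below are purely illustrative, using the Java admin client):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class CreateWideRowTable {
    public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
        HTableDescriptor desc = new HTableDescriptor("mytable");
        HColumnDescriptor cf = new HColumnDescriptor(Bytes.toBytes("cf"));
        cf.setBlocksize(256 * 1024);   // larger than the 64K default
        desc.addFamily(cf);
        admin.createTable(desc);
        admin.close();
    }
}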
Thanks,
Wei

---------------------------------
Wei Tan, PhD
Research Staff Member
IBM T. J. Watson Research Center
http://researcher.ibm.com/person/us-wtan
