Sorry, 1000 columns, each 2K, so each row is 2M. I guess HBase will keep a single KV (i.e., a column rather than a row) in a block, so a row will span multiple blocks?
My scan pattern is: I will do a range scan, find the matching row keys, and fetch the whole row for each row that matches my criteria.

Best regards,
Wei

---------------------------------
Wei Tan, PhD
Research Staff Member
IBM T. J. Watson Research Center
http://researcher.ibm.com/person/us-wtan


From:    lars hofhansl <la...@apache.org>
To:      "user@hbase.apache.org" <user@hbase.apache.org>
Date:    01/29/2014 03:49 PM
Subject: Re: larger HFile block size for very wide row?

You have 1000 columns? Not 1000k = 1M columns, I assume. So you'll have 2MB KVs. That's a bit on the large side.
HBase will "grow" the block to fit the KV into it. It means you have basically one block per KV.

I guess you address these rows via point gets (GET), and do not typically scan through them, right?
Do you see any performance issues?

-- Lars

________________________________
From: Wei Tan <w...@us.ibm.com>
To: user@hbase.apache.org
Sent: Wednesday, January 29, 2014 12:35 PM
Subject: larger HFile block size for very wide row?

Hi, I have an HBase table where each row has ~1000k columns, ~2K each. My scan pattern is to use a row key filter, but I need to fetch the whole row (~1000k columns) back.
Shall I set the HFile block size to be larger than the default 64K?

Thanks,
Wei

---------------------------------
Wei Tan, PhD
Research Staff Member
IBM T. J. Watson Research Center
http://researcher.ibm.com/person/us-wtan
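
[Editor's addition] Since the thread leaves the block-size question open, here is a minimal sketch of how the HFile block size can be raised for the column family that holds these wide rows. It uses the current HBase 2.x Java admin API (in the 0.94/0.98-era client of this thread the equivalent setter is HColumnDescriptor.setBlocksize(int)); the table name "widetable", family name "f", and the 256 KB value are illustrative assumptions, not taken from the thread.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
    import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RaiseBlockSize {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
          TableName tn = TableName.valueOf("widetable");           // illustrative name
          byte[] family = Bytes.toBytes("f");                      // illustrative name
          // Start from the existing family schema so other settings are preserved,
          // then raise the HFile block size from the default 64 KB to 256 KB
          // (illustrative value; tune against your own read pattern).
          ColumnFamilyDescriptor existing = admin.getDescriptor(tn).getColumnFamily(family);
          ColumnFamilyDescriptor updated = ColumnFamilyDescriptorBuilder
              .newBuilder(existing)
              .setBlocksize(256 * 1024)
              .build();
          admin.modifyColumnFamily(tn, updated);
        }
      }
    }

The usual guidance in the HBase documentation is smaller blocks for random point gets and larger blocks for mostly sequential reads, so a moderate increase over 64 KB is a plausible starting point for this fetch-the-whole-row pattern, not a guaranteed win.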
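[Editor's addition] And a sketch of the access pattern Wei describes: a range scan over row keys that brings back every column of each matching row. Again this uses the 2.x client; the start/stop keys, the family name, and the batch size of 200 cells per Result are assumptions for illustration. With ~1000 cells of ~2 KB per row, batching keeps a single ~2 MB row from coming back in one piece.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class WideRowRangeScan {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("widetable"))) {
          Scan scan = new Scan();
          scan.withStartRow(Bytes.toBytes("row-000100"));  // range of interest (illustrative)
          scan.withStopRow(Bytes.toBytes("row-000200"));   // stop row is exclusive
          scan.addFamily(Bytes.toBytes("f"));
          // A row here is ~1000 cells x ~2 KB = ~2 MB; cap the cells returned
          // per Result so one logical row is streamed back in smaller pieces.
          scan.setBatch(200);
          try (ResultScanner scanner = table.getScanner(scan)) {
            for (Result chunk : scanner) {
              // With setBatch(), several consecutive Results can share the same
              // row key; reassemble client-side if the full row is needed at once.
              System.out.println(Bytes.toString(chunk.getRow()) + " cells=" + chunk.size());
            }
          }
        }
      }
    }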