Hmm, I don't mean querying the bloom filter directly. I mean that the StoreFileScanner will query the ROWCOL bloom filter to decide whether it needs a seek or not. And I guess this check would be performed on every row, without the need to specify row keys?
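To make the question concrete, here is a toy sketch in plain Java. It is not HBase code: the class name RowColBloomSketch, the hashing scheme, and the sizes are all made up for illustration, and HBase's real Bloom filter implementation differs. It only shows that a row+column Bloom check takes both a row key and a qualifier as input, so a scanner can ask the question only for a row it is already positioned on.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

/** Toy row+column Bloom filter; a hypothetical sketch, not HBase's implementation. */
public class RowColBloomSketch {
    private static final int NUM_BITS = 1 << 16; // assumed size, for illustration only
    private static final int NUM_HASHES = 3;     // assumed number of probes

    private final BitSet bits = new BitSet(NUM_BITS);

    /** The Bloom key is built from BOTH the row key and the qualifier. */
    private static byte[] bloomKey(String row, String qualifier) {
        return (row + '\0' + qualifier).getBytes(StandardCharsets.UTF_8);
    }

    /** Position of the i-th probe, derived by simple double hashing. */
    private static int probe(byte[] key, int i) {
        int h1 = Arrays.hashCode(key);
        int h2 = Integer.rotateLeft(h1, 15) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, NUM_BITS);
    }

    /** Record that this (row, qualifier) pair exists in the file. */
    public void add(String row, String qualifier) {
        byte[] key = bloomKey(row, qualifier);
        for (int i = 0; i < NUM_HASHES; i++) {
            bits.set(probe(key, i));
        }
    }

    /** False means definitely absent; true means possibly present. */
    public boolean mightContain(String row, String qualifier) {
        byte[] key = bloomKey(row, qualifier);
        for (int i = 0; i < NUM_HASHES; i++) {
            if (!bits.get(probe(key, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        RowColBloomSketch fileBloom = new RowColBloomSketch();
        fileBloom.add("row-0042", "x");

        // The check needs a concrete row key each time it is asked.
        for (String row : new String[] {"row-0001", "row-0042"}) {
            System.out.println(row + "/x might be in file: "
                    + fileBloom.mightContain(row, "x"));
        }
    }
}
```

In this model, a per-row check still has to step through every row to obtain the row keys. At best it saves seeks inside a file; it does not avoid visiting the rows.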
> ROWCOL bloom says whether for a given row (rowkey) a given column (qualifier)
> is present in an HFile or not. But the user doesn't know the rowkeys. He
> wants all the rows with column 'x'
>
> -Anoop-
>
> ________________________________________
> From: Liu, Raymond [raymond....@intel.com]
> Sent: Monday, March 11, 2013 7:43 AM
> To: user@hbase.apache.org
> Subject: RE: How HBase perform per-column scan?
>
> Just curious, won't ROWCOL bloom filter work for this case?
>
> Best Regards,
> Raymond Liu
>
> >
> > As per the above said, you will need a full table scan on that CF.
> > As Ted said, consider having a look at your schema design.
> >
> > -Anoop-
> >
> >
> > On Sun, Mar 10, 2013 at 8:10 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > > bq. physically column family should be able to perform efficiently
> > > (storage layer
> > >
> > > When you scan a row, data for different column families would be
> > > brought into memory (if you don't utilize HBASE-5416). Take a look at:
> > >
> > > https://issues.apache.org/jira/browse/HBASE-5416?focusedCommentId=13541258&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13541258
> > >
> > > which was based on the settings described in:
> > >
> > > https://issues.apache.org/jira/browse/HBASE-5416?focusedCommentId=13541191&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13541191
> > >
> > > This boils down to your schema design. If possible, consider
> > > extracting column C into its own column family.
> > >
> > > Cheers
> > >
> > > On Sun, Mar 10, 2013 at 7:14 AM, PG <pengyunm...@gmail.com> wrote:
> > >
> > > > Hi, Ted and Anoop, thanks for your notes.
> > > > I am talking about column rather than column family, since
> > > > physically column family should be able to perform efficiently
> > > > (storage layer, CF's are stored separately). But columns of the
> > > > same column family may be mixed physically, and that makes
> > > > filtering on column values hard... So I want to know if there is
> > > > any mechanism in HBase that works on this...
> > > > Regards,
> > > > Yun
> > > >
> > > > On Mar 10, 2013, at 10:01 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > >
> > > > > Hi, Yun:
> > > > > Take a look at HBASE-5416 (Improve performance of scans with
> > > > > some kind of filters) which is in 0.94.5 release.
> > > > >
> > > > > In your case, you can use a filter which specifies column C as
> > > > > the essential family.
> > > > > Here I interpret column C as column family.
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Sat, Mar 9, 2013 at 11:11 AM, yun peng
> > > > > <pengyunm...@gmail.com> wrote:
> > > > >
> > > > >> Hi, All,
> > > > >> I want to find all existing values for a given column in
> > > > >> HBase; would that result in a full-table scan in HBase? For
> > > > >> example, given a column C, the table has a very large number
> > > > >> of rows, of which only a few rows (say only 1 row) have
> > > > >> non-empty values for column C. Would HBase still use a full
> > > > >> table scan to find this row? Or does HBase have any
> > > > >> optimization for this kind of query?
> > > > >> Thanks...
> > > > >> Regards
> > > > >> Yun