On Wed, Dec 2, 2015 at 10:01 PM, Jerry He <jerry...@gmail.com> wrote:
> Thanks for the response. You got my question correctly.
> If we are scanning the rows one by one and we have the requested column in
> the column tracker, we have the row+column to look up in the bloom filter,
> don't we? We may not be able to filter out the file scanners upfront. But
> maybe at a later time and at a lower level we could skip something?
>
>
<I've not looked at the code>You are right. If more than one explicit column is specified, we could do a bloom check for the second and so on, since we'd have the current row to hand. It could make for a nice speedup for scans of many explicit columns traversing a dataset that is sparsely populated.</I've not looked at the code>.

St.Ack

> Jerry
>
> On Mon, Nov 30, 2015 at 10:55 PM, Stack <st...@duboce.net> wrote:
>
> > On Mon, Nov 30, 2015 at 9:56 AM, Jerry He <jerry...@gmail.com> wrote:
> >
> > > Hi, experts
> > >
> > > HBase supports the ROWCOL bloom filter. ROW+COL would be the bloom key.
> > > Most of the documentation says only GET would benefit, multi-column
> > > GETs included.
> > >
> > > If I do a scan with StartRow and EndRow, and also specify columns,
> > > would the ROWCOL bloom filter provide any benefit in any way?
> > >
> >
> > If I understand your question properly, the answer is no. While we might
> > have a set of columns to check in the bloom, we'd not know the set of rows
> > between start and end row and so would not be able to formulate a query
> > against the ROW+COL bloom filter.
> >
> > St.Ack
> >
> > > Thank you.
> > >
> > > Jerry
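
To make the idea in this thread concrete, here is a minimal Python sketch of a per-row ROW+COL bloom check during a scan. It is not HBase code: `ToyBloom`, `rowcol_key`, `build_file`, and `scan` are hypothetical names, the key encoding and filter sizes are arbitrary assumptions, and a store file is modeled as a plain dict. It only illustrates that the bloom cannot be queried before the rows are known, but can be queried once the scan has the current row in hand.

```python
import hashlib


class ToyBloom:
    """Tiny Bloom filter; the sizes are arbitrary and for illustration only."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray(bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive `hashes` bit positions by salting SHA-256 with an index.
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for pos in self._positions(key):
            self.array[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.array[pos] for pos in self._positions(key))


def rowcol_key(row, col):
    # Hypothetical key encoding: row bytes, a NUL separator, column bytes.
    return row.encode() + b"\x00" + col.encode()


def build_file(cells):
    """Model one store file as its (row, col) cells plus a ROW+COL bloom."""
    bloom = ToyBloom()
    for row, col in cells:
        bloom.add(rowcol_key(row, col))
    return {"cells": sorted(cells), "bloom": bloom}


def scan(store_file, start_row, stop_row, columns):
    """Scan [start_row, stop_row) for explicit columns.

    Returns (matching cells, number of column lookups skipped via the bloom).
    The rows are only discovered while scanning, so the bloom is consulted
    per row, not upfront.
    """
    present = set(store_file["cells"])
    rows = sorted({r for r, _ in present if start_row <= r < stop_row})
    hits, skipped = [], 0
    for row in rows:
        for col in columns:
            if not store_file["bloom"].might_contain(rowcol_key(row, col)):
                skipped += 1  # bloom says absent: no need to seek to this column
                continue
            if (row, col) in present:  # stands in for the actual seek and read
                hits.append((row, col))
    return hits, skipped
```

On a sparsely populated dataset most row+column pairs are absent, so most lookups in the inner loop would be answered by the bloom alone, which is where the speedup suggested above would come from.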