Ugh. I am afraid not. The two changes that I am advocating (that could break someone else, which is of course problematic) are:
1) SingleColumnValueFilter.filterKeyValue(KeyValue keyValue)
When the column name does not match, the return value should be NEXT_ROW,
rather than INCLUDE. As mentioned earlier, when called by FilterList, the
INCLUDE return value discontinues further filter evaluation for a given
KeyValue in FilterList. That is problematic because matchedColumn is later
checked in filterRow() and will always be false for unevaluated filters.

2) FilterList.filterKeyValue(KeyValue v)
This returns SKIP and I do not know why. In the case of MUST_PASS_ALL, a
filter not returning an INCLUDE should result in a NEXT_ROW (not SKIP) being
returned, and at the bottom of the method, an INCLUDE should always be
returned (rather than a SKIP).

Here is a dumb question. A while ago, I tried to add my own filter to the
server, but I could not get it going without adding an entry in
HbaseObjectWritable.addToMap(). Should I be able to add a filter without
this step? If so, I am content to have my own version of the
SingleColumnValueFilter and FilterList and not risk breaking others (though
I do think the code is incorrect).

On Dec 17, 2009, at 10:27 AM, stack wrote:

> On Tue, Dec 15, 2009 at 10:42 PM, Paul Ambrose <pambr...@mac.com> wrote:
>
>> Hey Michael,
>>
>> If hbase-2037 will make it into 0.20.3, I am fine.
>>
>
> Grand.
>
> Will hbase-2037 fix both issues you describe? (Have you tried it I wonder?)
>
> St.Ack
>
>> If not, I would greatly appreciate you breaking it out for 0.20.3.
>>
>> Thanks,
>> Paul
>>
>> On Dec 15, 2009, at 10:28 PM, stack wrote:
>>
>>> Paul:
>>>
>>> I can apply the fix from hbase-2037... I can break it out of the posted
>>> patch that's up there. Just say the word.
>>>
>>> St.Ack
>>>
>>> On Tue, Dec 15, 2009 at 4:17 PM, Ram Kulbak <ram.kul...@gmail.com> wrote:
>>>
>>>> Hi Paul,
>>>>
>>>> I've encountered the same problem. I think it's fixed as part of
>>>> https://issues.apache.org/jira/browse/HBASE-2037
>>>>
>>>> Regards,
>>>> Yoram
>>>>
>>>> On Wed, Dec 16, 2009 at 10:45 AM, Paul Ambrose <pambr...@mac.com> wrote:
>>>>
>>>>> I ran into some problems with FilterList and SingleColumnValueFilter.
>>>>>
>>>>> I created a FilterList with MUST_PASS_ONE and two
>>>>> SingleColumnValueFilters (each testing equality on a different
>>>>> column) and queried some trivial data:
>>>>>
>>>>> http://pastie.org/744890
>>>>>
>>>>> The problems that I encountered were two-fold:
>>>>>
>>>>> SingleColumnValueFilter.filterKeyValue() returns ReturnCode.INCLUDE
>>>>> if the column names do not match. If FilterList is employed, then
>>>>> when the first Filter returns INCLUDE (because the column names do
>>>>> not match), no more filters for that KeyValue are evaluated. That is
>>>>> problematic because when filterRow() is finally called for those
>>>>> filters, matchedColumn is never found to be true because they were
>>>>> not invoked (due to FilterList exiting from the filter iteration
>>>>> when the name-mismatch INCLUDE was returned).
>>>>> The fix (at least for this scenario) is for
>>>>> SingleColumnValueFilter.filterKeyValue() to
>>>>> return ReturnCode.NEXT_ROW (rather than INCLUDE).
>>>>>
>>>>> The second problem is at the bottom of FilterList.filterKeyValue()
>>>>> where ReturnCode.SKIP is returned if MUST_PASS_ONE is the operator,
>>>>> rather than always returning ReturnCode.INCLUDE and then leaving the
>>>>> final filter decision to be made by the call to filterRow(). I am
>>>>> sure there is a good reason for returning SKIP in other scenarios,
>>>>> but it is problematic in mine.
>>>>>
>>>>> Feedback would be much appreciated.
>>>>>
>>>>> Paul
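
To make the first problem (the name-mismatch INCLUDE cutting FilterList
evaluation short) concrete, here is a minimal sketch in plain Java. It is a
toy model under stated assumptions, not the HBase source: FilterListSketch,
Cell, ColumnValueFilter, PassOneList and rowIsKept are all hypothetical names
made up for illustration, and the logic only imitates the behavior described
in this thread (a MUST_PASS_ONE list that stops at the first INCLUDE, and a
column filter whose filterRow() depends on matchedColumn). The driver also
ignores the per-KeyValue return code and looks only at the row-level decision.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: these classes imitate the behavior described in this
// thread and are NOT the real org.apache.hadoop.hbase.filter classes.
class FilterListSketch {

    enum ReturnCode { INCLUDE, SKIP, NEXT_ROW }

    // Hypothetical stand-in for a KeyValue: a column name and a value.
    static final class Cell {
        final String column;
        final String value;
        Cell(String column, String value) {
            this.column = column;
            this.value = value;
        }
    }

    // Stand-in for SingleColumnValueFilter doing an equality test.
    static final class ColumnValueFilter {
        final String column;
        final String expected;
        // INCLUDE models the reported behavior, NEXT_ROW the proposed one.
        final ReturnCode onColumnMismatch;
        boolean matchedColumn = false;

        ColumnValueFilter(String column, String expected, ReturnCode onColumnMismatch) {
            this.column = column;
            this.expected = expected;
            this.onColumnMismatch = onColumnMismatch;
        }

        ReturnCode filterKeyValue(Cell cell) {
            if (!cell.column.equals(column)) {
                return onColumnMismatch;   // not this filter's column
            }
            if (!cell.value.equals(expected)) {
                return ReturnCode.NEXT_ROW; // right column, wrong value
            }
            matchedColumn = true;
            return ReturnCode.INCLUDE;
        }

        // true means "filter this row out".
        boolean filterRow() {
            return !matchedColumn;
        }
    }

    // Stand-in for a FilterList using MUST_PASS_ONE.
    static final class PassOneList {
        final List<ColumnValueFilter> filters;
        PassOneList(List<ColumnValueFilter> filters) {
            this.filters = filters;
        }

        ReturnCode filterKeyValue(Cell cell) {
            for (ColumnValueFilter f : filters) {
                if (f.filterKeyValue(cell) == ReturnCode.INCLUDE) {
                    return ReturnCode.INCLUDE; // later filters never see this cell
                }
            }
            return ReturnCode.SKIP;
        }

        // The row is filtered out only if every filter wants it out.
        boolean filterRow() {
            for (ColumnValueFilter f : filters) {
                if (!f.filterRow()) {
                    return false;
                }
            }
            return true;
        }
    }

    // Runs one row {a=1, b=2} through filters (a == "x") OR (b == "2").
    static boolean rowIsKept(ReturnCode onColumnMismatch) {
        PassOneList list = new PassOneList(Arrays.asList(
                new ColumnValueFilter("a", "x", onColumnMismatch),
                new ColumnValueFilter("b", "2", onColumnMismatch)));
        List<Cell> row = Arrays.asList(new Cell("a", "1"), new Cell("b", "2"));
        for (Cell cell : row) {
            list.filterKeyValue(cell); // per-cell result ignored in this toy
        }
        return !list.filterRow();
    }

    public static void main(String[] args) {
        System.out.println("mismatch -> INCLUDE:  row kept = " + rowIsKept(ReturnCode.INCLUDE));
        System.out.println("mismatch -> NEXT_ROW: row kept = " + rowIsKept(ReturnCode.NEXT_ROW));
    }
}
```

In this toy, the row should pass because b equals "2". With INCLUDE on a
column mismatch, the first filter answers INCLUDE for the b cell, the second
filter is never consulted, and the row is dropped. With NEXT_ROW on a
mismatch, the second filter gets to see its column and the row is kept.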