Maybe you two smart fellas can between you make a recommendation and a patch?
Thanks lads,
St.Ack

On Fri, Dec 18, 2009 at 11:44 AM, bmdevelopment <bmdevelopm...@gmail.com> wrote:

> Hi,
> FYI, I came across similar issues when working on HBASE-1975.
> The return values did not seem to be correct to me either, but when I began
> changing them it seemed to lead to quite involved changes in the SCVF and
> Filter unit tests - something I wanted to avoid.
> In the end, I tried to keep the changes to SCVF as simple as possible.
> At one point, I did also attempt my own version of SCVF and ran into the
> same issue of having to use HbaseObjectWritable.addToMap().
>
> Now I am beginning to use MUST_PASS_ALL and MUST_PASS_ONE FilterList of
> SCVFs - maybe similar to what Paul is doing in his original mail. So, if it
> is not working as expected, I will probably need this in the near future as
> well.
>
> Thanks
> Jeremiah
>
>
> Paul Ambrose wrote:
>
>> Ugh. I am afraid not.
>> The two changes that I am advocating (that could break someone else, which
>> is of course problematic) are:
>>
>> 1) SingleColumnValueFilter.filterKeyValue(KeyValue keyValue)
>> When the column name does not match, the return value should be NEXT_ROW,
>> rather than INCLUDE. As mentioned earlier, when called by FilterList,
>> the INCLUDE return value discontinues further filter evaluation for a
>> given KeyValue in FilterList. That is problematic because matchedColumn is
>> later checked in filterRow and will always be false for unevaluated filters.
>>
>> 2) FilterList.filterKeyValue(KeyValue v) returns SKIP and I do not know why.
>> In the case of MUST_PASS_ALL, a filter not returning an INCLUDE
>> should result in a NEXT_ROW (not SKIP) being returned, and at the bottom,
>> an INCLUDE should always be returned (rather than a SKIP).
>>
>> Here is a dumb question. A while ago, I tried to add my own filter to the
>> server, but I could not get it going without adding an entry in
>> HbaseObjectWritable.addToMap(). Should I be able to add a filter without
>> this step?
>> If so, I am content to have my own version of the
>> SingleColumnValueFilter and FilterList and not risk breaking others
>> (though I do think the code is incorrect).
>>
>>
>> On Dec 17, 2009, at 10:27 AM, stack wrote:
>>
>>> On Tue, Dec 15, 2009 at 10:42 PM, Paul Ambrose <pambr...@mac.com> wrote:
>>>
>>>> Hey Michael,
>>>>
>>>> If hbase-2037 will make it into 0.20.3, I am fine.
>>>
>>> Grand.
>>>
>>> Will hbase-2037 fix both issues you describe? (Have you tried it I
>>> wonder?)
>>>
>>> St.Ack
>>>
>>>> If not, I would greatly appreciate you breaking it out for 0.20.3.
>>>>
>>>> Thanks,
>>>> Paul
>>>>
>>>>
>>>> On Dec 15, 2009, at 10:28 PM, stack wrote:
>>>>
>>>>> Paul:
>>>>>
>>>>> I can apply the fix from hbase-2037... I can break it out of the posted
>>>>> patch that's up there. Just say the word.
>>>>>
>>>>> St.Ack
>>>>>
>>>>>
>>>>> On Tue, Dec 15, 2009 at 4:17 PM, Ram Kulbak <ram.kul...@gmail.com> wrote:
>>>>>
>>>>>> Hi Paul,
>>>>>>
>>>>>> I've encountered the same problem. I think it's fixed as part of
>>>>>> https://issues.apache.org/jira/browse/HBASE-2037
>>>>>>
>>>>>> Regards,
>>>>>> Yoram
>>>>>>
>>>>>>
>>>>>> On Wed, Dec 16, 2009 at 10:45 AM, Paul Ambrose <pambr...@mac.com> wrote:
>>>>>>
>>>>>>> I ran into some problems with FilterList and SingleColumnValueFilter.
>>>>>>>
>>>>>>> I created a FilterList with MUST_PASS_ONE and two
>>>>>>> SingleColumnValueFilters (each testing equality on a different column)
>>>>>>> and queried some trivial data:
>>>>>>>
>>>>>>> http://pastie.org/744890
>>>>>>>
>>>>>>> The problems that I encountered were twofold:
>>>>>>>
>>>>>>> SingleColumnValueFilter.filterKeyValue() returns ReturnCode.INCLUDE
>>>>>>> if the column names do not match.
>>>>>>> If FilterList is employed, then when the
>>>>>>> first Filter returns INCLUDE (because the column names do not match), no
>>>>>>> more filters for that KeyValue are evaluated. That is problematic because
>>>>>>> when filterRow() is finally called for those filters, matchedColumn is
>>>>>>> never found to be true because they were not invoked (due to FilterList
>>>>>>> exiting from the filterList iteration when the name-mismatch INCLUDE was
>>>>>>> returned). The fix (at least for this scenario) is for
>>>>>>> SingleColumnValueFilter.filterKeyValue() to
>>>>>>> return ReturnCode.NEXT_ROW (rather than INCLUDE).
>>>>>>>
>>>>>>> The second problem is at the bottom of FilterList.filterKeyValue()
>>>>>>> where ReturnCode.SKIP is returned if MUST_PASS_ONE is the operator,
>>>>>>> rather than always returning ReturnCode.INCLUDE and then leaving the
>>>>>>> final filter decision to be made by the call to filterRow(). I am sure
>>>>>>> there is a good reason for returning SKIP in other scenarios, but it is
>>>>>>> problematic in mine.
>>>>>>>
>>>>>>> Feedback would be much appreciated.
>>>>>>>
>>>>>>> Paul
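
The short-circuit described in the original mail can be reproduced with a toy model. The Java sketch below does not use the HBase classes: `Cell`, `ColumnEqualsFilter`, and `mustPassOne` are hypothetical stand-ins that only mimic the behaviour described in this thread (a MUST_PASS_ONE list stops at the first INCLUDE and returns SKIP at the bottom). Treat it as an illustration under those assumptions, not as the HBase source or the HBASE-2037 patch.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterShortCircuitSketch {

    // Toy copy of the three return codes named in the thread.
    enum ReturnCode { INCLUDE, SKIP, NEXT_ROW }

    // Hypothetical stand-in for a KeyValue: one column name and one value.
    static final class Cell {
        final String column;
        final String value;

        Cell(String column, String value) {
            this.column = column;
            this.value = value;
        }
    }

    // Hypothetical single-column equality filter (not the real SCVF).
    static final class ColumnEqualsFilter {
        final String column;
        final String expected;
        final ReturnCode onOtherColumn; // returned when the column name does not match
        boolean sawColumn = false;      // set once the filter is shown its own column
        boolean matched = false;        // set when the value equals the expected one

        ColumnEqualsFilter(String column, String expected, ReturnCode onOtherColumn) {
            this.column = column;
            this.expected = expected;
            this.onOtherColumn = onOtherColumn;
        }

        ReturnCode filterKeyValue(Cell cell) {
            if (!cell.column.equals(column)) {
                return onOtherColumn;
            }
            sawColumn = true;
            matched = cell.value.equals(expected);
            return matched ? ReturnCode.INCLUDE : ReturnCode.NEXT_ROW;
        }
    }

    // Toy MUST_PASS_ONE list: the first INCLUDE ends evaluation for this cell.
    static ReturnCode mustPassOne(List<ColumnEqualsFilter> filters, Cell cell) {
        for (ColumnEqualsFilter filter : filters) {
            if (filter.filterKeyValue(cell) == ReturnCode.INCLUDE) {
                return ReturnCode.INCLUDE;
            }
        }
        return ReturnCode.SKIP;
    }

    // Feeds one row (a=0, b=2) through two filters (a==1, b==2) and reports
    // whether the second filter was ever shown its own column.
    static boolean secondFilterSawItsColumn(ReturnCode onOtherColumn) {
        List<ColumnEqualsFilter> filters = new ArrayList<>();
        filters.add(new ColumnEqualsFilter("a", "1", onOtherColumn));
        filters.add(new ColumnEqualsFilter("b", "2", onOtherColumn));
        mustPassOne(filters, new Cell("a", "0"));
        mustPassOne(filters, new Cell("b", "2"));
        return filters.get(1).sawColumn;
    }

    public static void main(String[] args) {
        // INCLUDE on a name mismatch: the first filter ends evaluation early.
        System.out.println(secondFilterSawItsColumn(ReturnCode.INCLUDE));
        // NEXT_ROW on a name mismatch: the second filter gets evaluated.
        System.out.println(secondFilterSawItsColumn(ReturnCode.NEXT_ROW));
    }
}
```

Running `main` prints `false` and then `true`: with INCLUDE on a name mismatch the second filter never sees column `b`, which is why its row-level check has nothing to go on, while with NEXT_ROW it does. The toy list treats NEXT_ROW like SKIP, so it says nothing about the second issue (what the list should return at the bottom).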