My earlier suggestion of having SCVF.filterKeyValue() not return INCLUDE
on column name mismatches was incorrect, because INCLUDE is appropriate when
SCVF is used without FilterList (in the case of MUST_PASS_ONE).  I think the
fix is to have FilterList evaluate all the filters and not bail early when an
INCLUDE is found.  I will continue to play with it.
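
To make the idea concrete, here is a rough, self-contained model of the
behavior. It does not use the HBase classes: FilterListSketch, Cell,
ColumnValueFilter, MustPassOneList and rowPasses are hypothetical stand-ins,
and the filter assumes a row is dropped when its column was never matched
(filterRow() returns !matchedColumn). Treat it as a sketch of the reasoning,
not as the actual FilterList or SCVF code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT the HBase classes.
final class FilterListSketch {
    enum ReturnCode { INCLUDE, SKIP, NEXT_ROW }

    /** Hypothetical cell: a column name plus a value. */
    static final class Cell {
        final String column;
        final String value;
        Cell(String column, String value) {
            this.column = column;
            this.value = value;
        }
    }

    /** Models the SCVF behavior described in this thread (assumed, simplified). */
    static final class ColumnValueFilter {
        private final String column;
        private final String expected;
        private boolean matchedColumn = false;

        ColumnValueFilter(String column, String expected) {
            this.column = column;
            this.expected = expected;
        }

        ReturnCode filterKeyValue(Cell c) {
            if (!c.column.equals(column)) {
                return ReturnCode.INCLUDE;      // column name mismatch
            }
            matchedColumn = true;
            return c.value.equals(expected) ? ReturnCode.INCLUDE : ReturnCode.NEXT_ROW;
        }

        /** Returns true when the row should be filtered out. */
        boolean filterRow() {
            return !matchedColumn;
        }
    }

    /** Models a MUST_PASS_ONE list, with or without the early exit. */
    static final class MustPassOneList {
        private final List<ColumnValueFilter> filters;
        private final boolean evaluateAll;

        MustPassOneList(List<ColumnValueFilter> filters, boolean evaluateAll) {
            this.filters = filters;
            this.evaluateAll = evaluateAll;
        }

        ReturnCode filterKeyValue(Cell c) {
            ReturnCode result = ReturnCode.SKIP;
            for (ColumnValueFilter f : filters) {
                if (f.filterKeyValue(c) == ReturnCode.INCLUDE) {
                    result = ReturnCode.INCLUDE;
                    if (!evaluateAll) {
                        return result;          // early exit: later filters never see the cell
                    }
                }
            }
            return result;
        }

        /** Returns true when every filter wants the row dropped. */
        boolean filterRow() {
            for (ColumnValueFilter f : filters) {
                if (!f.filterRow()) {
                    return false;               // one passing filter keeps the row
                }
            }
            return true;
        }
    }

    /** Runs two equality filters (a == "1", b == "1") over one row of cells. */
    static boolean rowPasses(List<Cell> row, boolean evaluateAll) {
        List<ColumnValueFilter> filters = new ArrayList<>();
        filters.add(new ColumnValueFilter("a", "1"));
        filters.add(new ColumnValueFilter("b", "1"));
        MustPassOneList list = new MustPassOneList(filters, evaluateAll);
        for (Cell c : row) {
            list.filterKeyValue(c);             // per-cell codes ignored; only the row decision is shown
        }
        return !list.filterRow();
    }

    public static void main(String[] args) {
        List<Cell> row = Arrays.asList(new Cell("b", "1"));
        System.out.println(rowPasses(row, false)); // false: filter on "b" was never invoked
        System.out.println(rowPasses(row, true));  // true: every filter saw the cell
    }
}
```

With the early exit, the first filter's name-mismatch INCLUDE hides the cell
from the second filter, so its matchedColumn stays false and the row is
dropped. Evaluating every filter lets the second one record its match.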

On Dec 18, 2009, at 1:45 PM, bmdevelopment wrote:

> Hi,
> Yes, I'll be doing testing on FilterLists in my program in the next few 
> weeks, so will come back with my results afterwards and my recommendations as 
> well. :)
> Thanks, enjoy the weekend.
> 
> stack wrote:
>> Maybe you two smart fellas can between you make a recommendation and a
>> patch?
>> Thanks lads,
>> St.Ack
>> On Fri, Dec 18, 2009 at 11:44 AM, bmdevelopment 
>> <bmdevelopm...@gmail.com>wrote:
>>> Hi,
>>> Fyi, I came across similar issues when working on HBASE-1975.
>>> The return values did not seem to be correct to me either, but when I began
>>> changing them it seemed to lead to quite involved changes in the SCVF and
>>> Filter unit tests - something I wanted to avoid.
>>> In the end, I tried to keep the changes to SCVF as simple as possible.
>>> At one point, I did also attempt my own version of SCVF and ran into the
>>> same issue of having to use HbaseObjectWritable.addToMap().
>>> 
>>> Now I am beginning to use MUST_PASS_ALL and MUST_PASS_ONE FilterLists of
>>> SCVFs - maybe similar to what Paul is doing in his original mail. So, if it
>>> is not working as expected, I will probably need this in the near future as
>>> well.
>>> 
>>> Thanks
>>> Jeremiah
>>> 
>>> 
>>> Paul Ambrose wrote:
>>> 
>>>> Ugh.  I am afraid not.
>>>> The two changes that I am advocating (that could break someone else, which
>>>> is of course problematic) are:
>>>>
>>>> 1)  SingleColumnValueFilter.filterKeyValue(KeyValue keyValue)
>>>> When the column name does not match, the return value should be NEXT_ROW,
>>>> rather than INCLUDE.  As mentioned earlier, when called by FilterList,
>>>> the INCLUDE return value discontinues further filter evaluation for a
>>>> given KeyValue in FilterList. That is problematic because matchedColumn is
>>>> later checked in filterRow and will always be false for unevaluated filters.
>>>>
>>>> 2) FilterList.filterKeyValue(KeyValue v) returns SKIP and I do not know why.
>>>> In the case of MUST_PASS_ALL, a filter not returning an INCLUDE
>>>> should result in a NEXT_ROW (not SKIP) being returned, and at the bottom,
>>>> an INCLUDE should always be returned (rather than a SKIP).
>>>> 
>>>> Here is a dumb question.  A while ago, I tried to add my own filter to the
>>>> server, but I could not get it going without adding an entry in
>>>> HbaseObjectWritable.addToMap().  Should I be able to add a filter without
>>>> this step?  If so, I am content to have my own version of the
>>>> SingleColumnValueFilter and FilterList and not risk breaking others (though
>>>> I do think the code is incorrect).
>>>> 
>>>> 
>>>> 
>>>> On Dec 17, 2009, at 10:27 AM, stack wrote:
>>>> 
>>>> On Tue, Dec 15, 2009 at 10:42 PM, Paul Ambrose <pambr...@mac.com> wrote:
>>>>> Hey Michael,
>>>>>> If hbase-2037 will make it into 0.20.3, I am fine.
>>>>>> 
>>>>>> Grand.
>>>>> Will hbase-2037 fix both issues you describe? (Have you tried it I
>>>>> wonder?)
>>>>> 
>>>>> St.Ack
>>>>> 
>>>>> 
>>>>> 
>>>>> If not, I would greatly appreciate you breaking it out for 0.20.3.
>>>>>> 
>>>>>> 
>>>>> 
>>>>> Thanks,
>>>>>> Paul
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Dec 15, 2009, at 10:28 PM, stack wrote:
>>>>>> 
>>>>>> Paul:
>>>>>>> I can apply the fix from hbase-2037... I can break it out of the posted
>>>>>>> patch that's up there.  Just say the word.
>>>>>>> 
>>>>>>> St.Ack
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Dec 15, 2009 at 4:17 PM, Ram Kulbak <ram.kul...@gmail.com>
>>>>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi Paul,
>>>>>>>> I've encountered the same problem. I think it's fixed as part of
>>>>>>>> https://issues.apache.org/jira/browse/HBASE-2037
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Yoram
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Wed, Dec 16, 2009 at 10:45 AM, Paul Ambrose <pambr...@mac.com>
>>>>>>>> 
>>>>>>> wrote:
>>>>>>>>> I ran into some problems with FilterList and SingleColumnValueFilter.
>>>>>>>>> I created a FilterList with MUST_PASS_ONE and two
>>>>>>>>> SingleColumnValueFilters (each testing equality on a different column)
>>>>>>>>> and queried some trivial data:
>>>>>>>>>
>>>>>>>>> http://pastie.org/744890
>>>>>>>>>
>>>>>>>>> The problems that I encountered were two-fold:
>>>>>>>>>
>>>>>>>>> SingleColumnValueFilter.filterKeyValue() returns ReturnCode.INCLUDE
>>>>>>>>> if the column names do not match. If FilterList is employed, then
>>>>>>>>> when the first Filter returns INCLUDE (because the column names do
>>>>>>>>> not match), no more filters for that KeyValue are evaluated.  That is
>>>>>>>>> problematic because when filterRow() is finally called for those
>>>>>>>>> filters, matchedColumn is never found to be true because they were
>>>>>>>>> not invoked (due to FilterList exiting from the filter list iteration
>>>>>>>>> when the name-mismatch INCLUDE was returned).
>>>>>>>>> The fix (at least for this scenario) is for
>>>>>>>>> SingleColumnValueFilter.filterKeyValue() to
>>>>>>>>> return ReturnCode.NEXT_ROW (rather than INCLUDE).
>>>>>>>>>
>>>>>>>>> The second problem is at the bottom of FilterList.filterKeyValue()
>>>>>>>>> where ReturnCode.SKIP is returned if MUST_PASS_ONE is the operator,
>>>>>>>>> rather than always returning ReturnCode.INCLUDE and then leaving the
>>>>>>>>> final filter decision to be made by the call to filterRow().  I am
>>>>>>>>> sure there is a good reason for returning SKIP in other scenarios,
>>>>>>>>> but it is problematic in mine.
>>>>>>>>> 
>>>>>>>>> Feedback would be much appreciated.
>>>>>>>>> 
>>>>>>>>> Paul
>>>>>>>>> 
> 
