If it's the same class and it's not a patch, then the first class loaded wins.

So if you have a class Foo and HBase has a class Foo with the same fully
qualified name, and HBase's copy is loaded first, your code will never see
the light of day.

Perhaps I'm stating the obvious, but it's something to think about when working
with Hadoop.
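
One quick way to check which copy won is to ask the JVM where it loaded the
class from. This is only a rough sketch and not HBase-specific: WhichJar and
locationOf are names I made up, and com.example.Foo below stands in for your
own class. It only tells you something if you run it with the same classpath,
in the same order, as the process you care about.

```java
import java.security.CodeSource;

public class WhichJar {
    // Returns the jar or directory a class was loaded from, or a marker
    // string when the JVM exposes no code source (e.g. bootstrap classes).
    public static String locationOf(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(no code source)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) {
        // Default to this class so the program also runs without arguments.
        String name = args.length > 0 ? args[0] : "WhichJar";
        try {
            // initialize=false: locate the class without running its static initializers.
            Class<?> c = Class.forName(name, false, WhichJar.class.getClassLoader());
            System.out.println(name + " loaded from " + locationOf(c));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Run it as `java -cp <classpath> WhichJar com.example.Foo`. If the location it
prints is not your jar, another copy of the class shadowed yours.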

On Jan 19, 2013, at 3:36 AM, Eugeny Morozov <emoro...@griddynamics.com> wrote:

> Ted,
> 
> that is correct.
> It is HBase 0.92.x, and we use part of the patch from HBASE-6509.
> 
> I use the filter as a custom filter; it lives in a separate jar file that
> goes on HBase's classpath. I did not patch HBase.
> Moreover, I do not use the protobuf descriptions that come with the filter
> in the patch. I have only two classes: FuzzyRowFilter itself and its test
> class.
> 
> And it works perfectly on a small dataset of about 100 rows (1 region). But
> when my dataset is more than 10 million rows (260 regions), it somehow starts
> losing rows. I'm not sure, but it seems to me it is not the filter's fault.
> 
> 
> On Sat, Jan 19, 2013 at 3:56 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> 
>> To my knowledge CDH-4.1.2 is based on HBase 0.92.x
>> 
>> Looks like you were using the patch from HBASE-6509, which was integrated
>> into trunk only.
>> Please confirm.
>> 
>> Copying Alex who wrote the patch.
>> 
>> Cheers
>> 
>> On Fri, Jan 18, 2013 at 3:28 PM, Eugeny Morozov
>> <emoro...@griddynamics.com> wrote:
>> 
>>> Hi, folks!
>>> 
>>> HBase, Hadoop, etc version is CDH-4.1.2
>>> 
>>> I'm using a custom FuzzyRowFilter, which I got from
>>> http://blog.sematext.com/2012/08/09/consider-using-fuzzyrowfilter-when-in-need-for-secondary-indexes-in-hbase/
>>> and suddenly, after quite some time, we found that it started losing data.
>>> 
>>> Basically, the idea of FuzzyRowFilter is that it tries to find the key that
>>> has been provided, and if there is no such key, but more keys exist in the
>>> table, it returns SEEK_NEXT_USING_HINT. Then in getNextKeyHint(...) it
>>> builds the required key. As I understand it, HBase uses this hint to
>>> fast-forward to the required key; it should be similar to, or the same as,
>>> a Scan with setStartRow.
>>> 
>>> I'm trying to find the key F7dt8QWPSIDw. It is definitely in HBase: I'm
>>> able to get it using Scan.setStartRow.
>>> For FuzzyFilter I'm using an empty Scan; I didn't specify a start row, stop
>>> row or anything related.
>>> This is what happens:
>>> 
>>> Fzzy: AAAA1Q7iQ9JA
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: AQAAnA96rxTg
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: AgAADQWPSIDw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: AwAA-Q33Zb9Q
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: BAAAOg8oyu7A
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: BQAA9gqVQrTw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: BgABZQ7iQ9JA
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: BwAAbgrpAojg
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: CAAAUQWPSIDw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: CQABVgqVQrTw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: CgAAOQ7iQ9JA
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: CwAALwqVQrTw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: DAAAMwWPSIDw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: DQAADgjqzsIQ
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: DgAAOgCcWv9g
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: DwAAKg7iQ9JA
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: EAAAugqVQrTw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: EQAAJAqVQrTw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: EgAABgIOMBgg
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: EwAAEwqVQrTw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: FAAACQqVQrTw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: FQAAIAqVQrTw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: FgAAeAWPSIDw
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: FwAAAw33Zb9Q
>>> Next fzzy: F7dtxwqVQ_Pw
>>> Fzzy: F7dt8QWPSIDw
>>> 
>>> It's obvious that my FuzzyRowFilter knows what to search for, and every
>>> time it repeats the same hint.
>>> The very first key, I suppose, is just the first key of the region where my
>>> key is located.
>>> The very last key is the key that is already bigger than what I'm trying
>>> to find; that's the reason why FuzzyFilter stopped there.
>>> 
>>> Do you know of any issue with SEEK_NEXT_USING_HINT? I've searched, but
>>> without success.
>>> Do you have any idea how to explain so many attempts?
>>> 
>>> Thanks in advance.
>>> --
>>> Evgeny Morozov
>>> Developer Grid Dynamics
>>> Skype: morozov.evgeny
>>> www.griddynamics.com
>>> emoro...@griddynamics.com
>>> 
>> 
> 
> 
> 
> -- 
> Evgeny Morozov
> Developer Grid Dynamics
> Skype: morozov.evgeny
> www.griddynamics.com
> emoro...@griddynamics.com
