So are you saying that MUST_PASS_ALL might be flawed (for a FilterList of two QualifierFilters)? If so, I can dig into the source and see if I can find anything.
Or are you saying that my data profile is wrong? If so, can you (or someone else) suggest one that works? I tried this:

hbase(main):032:0> scan 'testTable3'
ROW                          COLUMN+CELL
 row1                        column=col1:qualifier-1, timestamp=1264554774915, value=some_col1_qual1_value
 row1                        column=col1:qualifier-2, timestamp=1264554866041, value=some_col1_qual2_value

And that doesn't work with the two QualifierFilters.

I haven't actually run the JUnit tests because I haven't dealt with JUnit before, but if you suggest I run those I can do that as well.

I was hoping someone could submit a working implementation of MUST_PASS_ALL with two QualifierFilters. I thought that might have been a pretty common operation.

On Tue, Jan 26, 2010 at 8:01 PM, Stack <[email protected]> wrote:

> On Tue, Jan 26, 2010 at 4:51 PM, Chris Bates
> <[email protected]> wrote:
> >
>
> Must pass all "works" because there's a unit test that asserts so?
> I'm not sure what it is about your data profile that is messing with
> this functionality. Its something involved where my guess is the only
> way to figure it is to set up some kinda harness and step through the
> debugger. Any chance of your having a go at that Chris?
>
> Thanks,
> St.Ack
>
> > Second, I'm still not able to get the AND operation working.
> >
> > To illustrate:
> >
> > hbase(main):010:0> scan 'testTable', {COLUMNS=>["user:theme", "user:REMOTE_ADDR"]}
> > ROW        COLUMN+CELL
> >  row1      column=user:REMOTE_ADDR, timestamp=1264464021672, value=172.16.1.3
> >  row1      column=user:theme, timestamp=1264464041857, value=Frost
> >  row2      column=user:theme, timestamp=1264464058064, value=Sunshine
> >  row3      column=user:REMOTE_ADDR, timestamp=1264464083332, value=172.16.0.06
> >
> > With MUST_PASS_ALL enabled...
> >
> > If I comment out the REMOTE_ADDR filter, I get:
> > IP: null Theme: Frost
> > IP: null Theme: Sunshine
> >
> > If I comment out the theme filter, I get the reverse.
> > IP: 172.16.1.3 Theme: null
> > IP: 172.16.0.06 Theme: null
> >
> > If I leave both in, I get __nothing__, when I want:
> > IP: 172.16.1.3 Theme: Frost
> >
> > I thought this might be due to HBase not being able to do an AND operation
> > on Qualifiers of the same column, so I created another testTable2 with two
> > different columns:
> >
> > hbase(main):024:0> scan 'testTable2'
> > ROW        COLUMN+CELL
> >  row1      column=addr:REMOTE_ADDR, timestamp=1264552425218, value=172.16.1.3
> >  row1      column=user:theme, timestamp=1264552375737, value=Frost
> >  row2      column=user:theme, timestamp=1264552505491, value=Sunshine
> >  row3      column=addr:REMOTE_ADDR, timestamp=1264552538651, value=172.16.0.36
> >
> > But nothing changed.
> >
> > Any other thoughts? The only solution I can see to get this done is to
> > implement a row counter for each column+qualifier and then store the results
> > that meet criteria that I expect, but I was hoping a native filter would do
> > the job.
> >
> > On Mon, Jan 25, 2010 at 8:43 PM, Stack <[email protected]> wrote:
> >
> >> See the TestFilterList under unit tests, src/test. Can you mess
> >> around with it using your data and see if it tells you anything?
> >> There's a testMPALL in there. Might give you a clue (Your code looks
> >> fine)
> >>
> >> St.Ack
> >>
> >> On Mon, Jan 25, 2010 at 4:25 PM, Chris Bates
> >> <[email protected]> wrote:
> >> > thanks stack. i upgraded to the RC3 0.20.3.
> >> >
> >> > I was still getting the hanging, so I decided to create a real simple table
> >> > to try to see if I can get the logic working:
> >> >
> >> > hbase(main):031:0> scan 'testTable'
> >> > ROW        COLUMN+CELL
> >> >  row1      column=user:REMOTE_ADDR, timestamp=1264464021672, value=172.16.1.3
> >> >  row1      column=user:theme, timestamp=1264464041857, value=Frost
> >> >  row2      column=user:theme, timestamp=1264464058064, value=Sunshine
> >> >  row3      column=user:REMOTE_ADDR, timestamp=1264464083332, value=172.16.0.06
> >> >
> >> > Without the filter (http://pastebin.com/m20ba0d2d) this is my output
> >> > client-side:
> >> > IP: 172.16.1.3
> >> > Theme: Frost
> >> > IP: null
> >> > Theme: Sunshine
> >> > IP: 172.16.0.06
> >> > Theme: null
> >> >
> >> > If I uncomment the setFilter, I get nothing. I'm expecting to get the first
> >> > two lines (row1). Thus I don't believe my filters are setup correctly, but
> >> > I'm unsure where the error would be.
> >> >
> >> > Does anyone have any thoughts or examples?
> >> >
> >> > Thanks!
> >> >
> >> > On Mon, Jan 25, 2010 at 1:45 PM, Stack <[email protected]> wrote:
> >> >
> >> >> Check out the CHANGES in 0.20.2 and even in 0.20.3RC3:
> >> >> http://svn.apache.org/viewvc/hadoop/hbase/branches/0.20/CHANGES.txt?view=log
> >> >> .
> >> >> I believe what your issue fixed.
> >> >> St.Ack
> >> >>
> >> >> On Mon, Jan 25, 2010 at 10:36 AM, Chris Bates
> >> >> <[email protected]> wrote:
> >> >> > 0.20.1
> >> >> >
> >> >> > On Mon, Jan 25, 2010 at 1:31 PM, Stack <[email protected]> wrote:
> >> >> >
> >> >> >> What version of HBase?
> >> >> >> St.Ack
> >> >> >>
> >> >> >> On Sat, Jan 23, 2010 at 7:49 PM, Chris Bates
> >> >> >> <[email protected]> wrote:
> >> >> >> > Hi all,
> >> >> >> >
> >> >> >> > I'm trying to do an AND operation and I'm not sure if I did the filtering
> >> >> >> > correctly because HBase is hanging on me.
> >> >> >> >
> >> >> >> > What I want is this:
> >> >> >> >
> >> >> >> > I have two qualifiers, theme and IP, to my column user. I'd like to print
> >> >> >> > out all matches (or maybe just 10) where the row has both of them in it. My
> >> >> >> > impression is that this is what HBase would excel at, because the dataset is
> >> >> >> > VERY sparse, meaning that out of 1000-10,000 rows, maybe just 1 or 2 will
> >> >> >> > have BOTH an IP and a theme in it. Most of the time its just one or the
> >> >> >> > other.
> >> >> >> >
> >> >> >> > So this is my code to make that query, but as I said, its hanging.
> >> >> >> > http://pastebin.com/m7fcef49
> >> >> >> >
> >> >> >> > If I comment out the filters, the query runs just fine and will print null
> >> >> >> > wherever the value is not present.
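
One possible reading of the empty MUST_PASS_ALL result in this thread, stated as an assumption and not confirmed against the HBase source: if each QualifierFilter is checked against one cell at a time, then an AND of two "qualifier equals X" checks asks a single cell to carry both qualifier names, which no cell can. The sketch below is plain Java that only models that idea on the testTable data; it does not call HBase, and Cell, sampleCells, perCellAnd and rowsWithAll are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java model only: none of these types or methods are HBase API.
class QualifierAndSketch {

    // Hypothetical stand-in for one stored cell (row key, qualifier, value).
    static final class Cell {
        final String row;
        final String qualifier;
        final String value;

        Cell(String row, String qualifier, String value) {
            this.row = row;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // The four cells of 'testTable' as shown in the thread.
    static List<Cell> sampleCells() {
        return Arrays.asList(
            new Cell("row1", "REMOTE_ADDR", "172.16.1.3"),
            new Cell("row1", "theme", "Frost"),
            new Cell("row2", "theme", "Sunshine"),
            new Cell("row3", "REMOTE_ADDR", "172.16.0.06"));
    }

    // ASSUMPTION: models an AND of "qualifier equals X" checks applied to each
    // cell on its own. A cell is kept only if its qualifier equals every wanted name.
    static List<Cell> perCellAnd(List<Cell> cells, List<String> wanted) {
        List<Cell> kept = new ArrayList<>();
        for (Cell c : cells) {
            boolean passesAll = true;
            for (String q : wanted) {
                passesAll &= c.qualifier.equals(q);
            }
            if (passesAll) {
                kept.add(c);
            }
        }
        return kept;
    }

    // Row-level alternative: collect the qualifiers seen per row, then keep
    // the rows that contain every wanted qualifier.
    static List<String> rowsWithAll(List<Cell> cells, List<String> wanted) {
        Map<String, Set<String>> byRow = new LinkedHashMap<>();
        for (Cell c : cells) {
            byRow.computeIfAbsent(c.row, k -> new LinkedHashSet<>()).add(c.qualifier);
        }
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : byRow.entrySet()) {
            if (e.getValue().containsAll(wanted)) {
                rows.add(e.getKey());
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> wanted = Arrays.asList("theme", "REMOTE_ADDR");
        System.out.println("per-cell AND kept: " + perCellAnd(sampleCells(), wanted).size());
        System.out.println("rows with both:    " + rowsWithAll(sampleCells(), wanted));
    }
}
```

In this model the per-cell AND keeps no cells, while the row-level check returns only row1, which matches the output wanted above. If the per-cell assumption does hold for HBase, the row-level test would have to happen somewhere that sees the whole row, for example client-side by scanning both columns and skipping rows where either value comes back null.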
