Could the prefix filter lead to a full table scan? In other words, is PrefixFilter applied after fetching the rows?

Another question: say I have row keys abc and abd and I search for row "abc". Is it always guaranteed to be the first key returned in the scan results? If so, I can always put a condition in the client app.

On Thu, Mar 28, 2013 at 9:15 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Take a look at the following in
> hbase-server/src/main/ruby/shell/commands/scan.rb
> (trunk)
>
> hbase> scan 't1', {FILTER => "(PrefixFilter ('row2') AND (QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))"}
>
> Cheers
>
> On Thu, Mar 28, 2013 at 9:02 AM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>
> > I see then I misunderstood the behaviour. My keys are id + timestamp so
> > that I can do a range type search. So what I really want is to return a row
> > where id matches the prefix. Is there a way to do this without having to
> > scan large amounts of data?
> >
> > On Thu, Mar 28, 2013 at 8:26 AM, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
> >
> > > Hi Mohit,
> > >
> > > "+" ascii code is 43
> > > "9" ascii code is 57.
> > >
> > > So "+9" is coming after "++". If you don't have any row with the exact
> > > key "+++++", HBase will look for the first one after this one. And in
> > > your case, it's +9hC\xFC\x82s\xABL3\xB3B\xC0\xF9\x87\x03\x7F\xFF\xF.
> > >
> > > JM
> > >
> > > 2013/3/28 Mohit Anchlia <mohitanch...@gmail.com>:
> > > > My understanding is that the row key would start with +++++ for instance.
> > > >
> > > > On Thu, Mar 28, 2013 at 7:53 AM, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
> > > >
> > > >> Hi Mohit,
> > > >>
> > > >> I see nothing wrong with the results below. What would I have expected?
> > > >>
> > > >> JM
> > > >>
> > > >> 2013/3/28 Mohit Anchlia <mohitanch...@gmail.com>:
> > > >> > I am running 92.1 version and this is what happens.
> > > >> >
> > > >> > hbase(main):003:0> scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW => 'sdw0'}
> > > >> > ROW COLUMN+CELL
> > > >> > s\xC1\xEAR\xDF\xEA&\x89\x91\xFF\x1A^\xB6d\xF0\xEC\x column=SID_T_MTX:\x00\x00Rc, timestamp=1363056261106, value=PAGE\x09\x091363056252990\x09\x09/
> > > >> > 7F\xFF\xFE\xC2\xA3\x84Z\x7F
> > > >> > 1 row(s) in 0.0450 seconds
> > > >> > hbase(main):004:0> scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW => '------'}
> > > >> > ROW COLUMN+CELL
> > > >> > -\xA1\xAF>r\xBD\xE2L\x00\xCD*\xD7\xE8\xD6\x1Dk\x7F\ column=SID_T_MTX:\x00\x00hF, timestamp=1363384706714, value=PAGE\x09239923973\x091363384698919\x09/
> > > >> > xFF\xFE\xC2\x8F\xF0\xC1\xBF
> > > >> > row(s) in 0.0500 seconds
> > > >> > hbase(main):005:0> scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW => '++++'}
> > > >> > ROW COLUMN+CELL
> > > >> > +9hC\xFC\x82s\xABL3\xB3B\xC0\xF9\x87\x03\x7F\xFF\xF column=SID_T_MTX:\x00\x00<2, timestamp=1364404155426, value=PAGE\x09\x091364404145275\x09 \x09/
> > > >> > E\xC2S-\x08\x1F
> > > >> > 1 row(s) in 0.0640 seconds
> > > >> > hbase(main):006:0>
> > > >> >
> > > >> > On Wed, Mar 27, 2013 at 9:23 PM, ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com> wrote:
> > > >> >
> > > >> >> Same question, same time :)
> > > >> >>
> > > >> >> Regards
> > > >> >> Ram
> > > >> >>
> > > >> >> On Thu, Mar 28, 2013 at 9:53 AM, ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com> wrote:
> > > >> >>
> > > >> >> > Could you give us some more insights on this?
> > > >> >> > So you mean when you set the row key as 'azzzaaa', though this row does
> > > >> >> > not exist, the scanner returns some other row? Or it is giving you a row
> > > >> >> > that does not exist?
> > > >> >> >
> > > >> >> > Or you mean it is doing a full table scan?
> > > >> >> >
> > > >> >> > Which version of HBase and what type of filters are you using?
> > > >> >> > Regards
> > > >> >> > Ram
> > > >> >> >
> > > >> >> > On Thu, Mar 28, 2013 at 9:45 AM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
> > > >> >> >
> > > >> >> >> I have key in the form of "hashedid + timestamp" but when I run scan I get
> > > >> >> >> rows for almost every value. For instance if I run scan for 'azzzaaa' that
> > > >> >> >> doesn't even exist even then I get the results.
> > > >> >> >>
> > > >> >> >> Could someone help me understand what might be going on here?
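
For anyone reading this later, here is a minimal sketch of the ordering JM describes in the quoted thread. It assumes row keys are compared byte by byte and that a scan with STARTROW begins at the first key at or after the start row. This is a plain-Python simulation over a sorted list, not the HBase client API; the sample keys are shortened stand-ins for the ones in the shell output, and the helper names first_row_at_or_after and first_row_with_prefix are made up for illustration.

```python
from bisect import bisect_left

# Hypothetical, shortened row keys kept in byte-wise sorted order,
# standing in for how a table orders its rows.
ROW_KEYS = sorted([b"+9hC", b"-\xa1", b"abc1", b"abd1", b"s\xc1"])


def first_row_at_or_after(sorted_keys, start_row):
    """Return the first key >= start_row, or None if the start is past the end."""
    index = bisect_left(sorted_keys, start_row)
    return sorted_keys[index] if index < len(sorted_keys) else None


def first_row_with_prefix(sorted_keys, prefix):
    """Seek to the prefix, then apply the client-side check from the question."""
    candidate = first_row_at_or_after(sorted_keys, prefix)
    if candidate is not None and candidate.startswith(prefix):
        return candidate
    return None  # the seek landed on a row that does not share the prefix


if __name__ == "__main__":
    # "+" (43) sorts before "9" (57), so "++++" lands on the "+9..." row.
    print(first_row_at_or_after(ROW_KEYS, b"++++"))
    # A start row that matches nothing still returns the next row in order.
    print(first_row_at_or_after(ROW_KEYS, b"azzzaaa"))
    # With the prefix check, the non-matching seek is rejected.
    print(first_row_with_prefix(ROW_KEYS, b"azzzaaa"))
    print(first_row_with_prefix(ROW_KEYS, b"abc"))
```

Under these assumptions, if any key starts with the prefix, the first key at or after the prefix is one of them, so a client-side startswith check on the first returned row is enough to tell a real match from a neighbouring row.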