Mkay, I will look into the latter some more. But the limit still confuses me: limit == batch, and on the client side that is the number of rows, not the number of columns. Does that mean that if I had 100 columns and set batch to 10, it would return only 10 rows with 10 columns each, rather than what I would have expected, i.e. 10 rows with all columns? Does this implicitly mean batch is also the intra-row batch size?
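To make the question concrete, here is a toy sketch of how I read "give me a row in chunks". It does not touch the HBase API at all: `BatchSketch`, `makeRow` and `chunkRow` are names made up for illustration, cells are plain strings, and treating batch <= 0 as "no limit" is only my assumption based on the `limit > 0` check in the quoted loop. The real scanner may well behave differently.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {

    /** Builds a fake row: one string per column, e.g. "row1/col0". Hypothetical helper. */
    public static List<String> makeRow(String rowKey, int columns) {
        List<String> cells = new ArrayList<>();
        for (int i = 0; i < columns; i++) {
            cells.add(rowKey + "/col" + i);
        }
        return cells;
    }

    /**
     * Splits the cells of ONE row into chunks of at most `batch` cells.
     * Assumption: batch <= 0 means "no limit", so the whole row is one chunk.
     */
    public static List<List<String>> chunkRow(List<String> cells, int batch) {
        List<List<String>> chunks = new ArrayList<>();
        if (cells.isEmpty()) {
            return chunks; // nothing to return for an empty row
        }
        if (batch <= 0) {
            chunks.add(new ArrayList<>(cells));
            return chunks;
        }
        for (int start = 0; start < cells.size(); start += batch) {
            int end = Math.min(start + batch, cells.size());
            chunks.add(new ArrayList<>(cells.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> row = makeRow("row1", 100);
        List<List<String>> chunks = chunkRow(row, 10);
        System.out.println(chunks.size() + " chunks, first has "
                + chunks.get(0).size() + " cells");
    }
}
```

If that reading is right, 100 columns with batch 10 would come back as 10 partial results for the same row, each holding 10 columns, and not as 10 complete rows.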
Lars

On Nov 25, 2010, at 21:53, Ryan Rawson <ryano...@gmail.com> wrote:

> limit is for retrieving partial results of a row. Ie: give me a row
> in chunks. Filters that want to operate on the entire row cannot be
> used with this mode. i forget why it's in the loop but there was a
> good reason at the time.
>
> -ryan
>
> On Thu, Nov 25, 2010 at 10:51 AM, Lars George <lars.geo...@gmail.com> wrote:
>> Does hbase-dev still get forwarded? Did you see the below message?
>>
>> ---------- Forwarded message ----------
>> From: Lars George <lars.geo...@gmail.com>
>> Date: Tue, Nov 23, 2010 at 4:25 PM
>> Subject: HRegion.RegionScanner.nextInternal()
>> To: hbase-...@hadoop.apache.org
>>
>> Hi,
>>
>> I am officially confused:
>>
>>   byte [] nextRow;
>>   do {
>>     this.storeHeap.next(results, limit - results.size());
>>     if (limit > 0 && results.size() == limit) {
>>       if (this.filter != null && filter.hasFilterRow()) throw new IncompatibleFilterException(
>>           "Filter with filterRow(List<KeyValue>) incompatible with scan with limit!");
>>       return true; // we are expecting more yes, but also limited to how many we can return.
>>     }
>>   } while (Bytes.equals(currentRow, nextRow = peekRow()));
>>
>> This is from the nextInternal() call. Questions:
>>
>> a) Why is that check for the filter and limit both being set inside the loop?
>>
>> b) if "limit" is the batch size (which for a Get is "-1", not "1" as I
>> would have thought) then what does that "limit - results.size()"
>> achieve?
>>
>> I mean, this loops gets all columns for a given row, so batch/limit
>> should not be handled here, right? what if limit were set to "1" by
>> the client? Then even if the Get had 3 columns to retrieve it would
>> not be able to since this limit makes it bail out. So there would be
>> multiple calls to nextInternal() to complete what could be done in one
>> loop?
>>
>> Eh?
>>
>> Lars