I remember someone objecting to the scheme. :)

On Wed, Apr 23, 2008 at 6:46 AM, Clint Morgan <[EMAIL PROTECTED]> wrote:
> The way I picture it, we only need to check the 100 or 1000 regions to
> get the very first row of a order-by scan. Afterwards, we just need to
> get the next row from the region that we took the last row from,
> compare with the existing rows we have, and return the lowest.
>
> So if we have R regions to grab from, the first call to next() will
> have to fetch all R rows from the regions and do R log R comparisons
> to do the sort. Then each call to next() will cost 1 row fetch from
> region plus log R comparisons to put in a sorted tree.
>
>
>
> On Tue, Apr 22, 2008 at 1:10 PM, Bryan Duxbury <[EMAIL PROTECTED]> wrote:
> > If you just want the first 10 by a certain prefix ordered by a column, then
> > you will definitely be better off scanning them by row and ordering them
> > clientside.
> >
> >  Surely your idea of maintaining a separate SortedMap in each region would
> > work, but I don't think you should discount the cost associated with having
> > to talk to a bunch of different regions every time you want to know what the
> > next row is. You'll do a lot of extra work to get that merged view of the
> > index, and potentially the approach won't scale up to queries that might
> > cover more than a "few" regions - can you imagine having to check 100 or
> > 1000 regions for the next result every time you needed to iterate?
> >
> >
> >
> >  On Apr 22, 2008, at 12:58 PM, Clint Morgan wrote:
> >
> >
> > > Yeah, that would be an easy approach. We would need HBASE-82.
> > >
> > > The main problem I see here is that we cannot take (as much) advantage
> > > of our row key prefix in weeding out rows.
> > >
> > > Say I want the first 10 rows that start with XXX ordered by A:amount,
> > > then I would have scan through column values from rows everywhere in
> > > the table until I hit 10 with my prefix. Could be costly if table is
> > > large compared to the number of rows that start with XXX.
> > >
> > > Whereas if we have one SortedMap per Region, then I can quickly narrow
> > > down to (hopefully) a few regions based on key prefix.
> > >
> > > Though other usage / table loading patterns would surely benefit from
> > > this approach...
> > >
> > > On Tue, Apr 22, 2008 at 12:31 PM, Bryan Duxbury <[EMAIL PROTECTED]> wrote:
> > >
> > > > This doesn't have to be all that complicated.
> > > >
> > > >  Why not keep another HBase table as the index? The keys would be the
> > column
> > > > values. There'd be a single matchingRows column family, and the
> > qualifiers
> > > > and values would be the rows that match that column. Then, when you want
> > to
> > > > scan in column order instead of row order, you scan the index table,
> > find
> > > > the list of rows that match each column, and then do a random read to
> > grab
> > > > those individually. It'll for sure be slower than scanning a table
> > ordered
> > > > by rows, but it'll get you what you want. It'll also handle the case
> > where
> > > > the column values aren't unique.
> > > >
> > > >  If you need custom sorting for those values, then HBASE-82 would solve
> > that
> > > > problem.
> > > >
> > > >  -Bryan
> > > >
> > > >
> > > >
> > > >  On Apr 22, 2008, at 11:58 AM, stack wrote:
> > > >
> > > >
> > > >
> > > > > Some questions interlaced below:
> > > > >
> > > > > Clint Morgan wrote:
> > > > >
> > > > >
> > > > > > All,
> > > > > >
> > > > > > We want to put secondary indexes into hbase. The motivation is that
> > we
> > > > > > are storing data in hbase that we want to serve to users. We would
> > > > > > like to be able to serve rows sorted by column values. Our queries
> > > > > > will be over rows with a given key prefix, so we should not be
> > hitting
> > > > > > to many regions.
> > > > > >  I was thinking it would work roughly like this:
> > > > > >
> > > > > > - At table creation time, individual columns can be declared as
> > > > > > indexed. By default we could sort the column values
> > lexicographically,
> > > > > > or we can provide a WritableComparatorFactory<T> which has the
> > ability
> > > > > > to make values of type T from a byte [], as well as providing a
> > > > > > Comparator<T>. (Better than providing a Comparator<byte[]> as it
> > only
> > > > > > costs once per row insert for deserialization, rather that twice on
> > > > > > each comparison).
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > I don't follow what the Factory adds.
> > > > >
> > > > > We're talking about getting HBASE-82 into 0.2.  Does that interfere
> > with
> > > > >
> > > > this proposal?  (I'm thinking that along w/ rows becoming byte arrays
> > rather
> > > > Text with a client-supplied Comparator, column qualifiers would shift to
> > be
> > > > byte arrays also -- though yeah, implies that if your sort is not
> > > > byte-lexicographical, yes, the compares can be costly involving two
> > > > deserializations per compare).
> > > >
> > > > >
> > > > >
> > > > > > - We catch all writes/deletes and maintain a SortedMap<T, HStoreKey>
> > > > > > which keeps the column values in order, and maps them back to row
> > > > > > keys. First cut may just keep all this in memory, but it should be
> > > > > > backed with MapFile(s).
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > Would be sweet if you could leverage the HBase memcache code and
> > flusher
> > > > >
> > > > to do the above.
> > > >
> > > > >
> > > > > This Map would be global for the table?  Or per Region?
> > > > >
> > > > > A lucene index wouldn't work for you because you want ordering?
> > > > >
> > > > >
> > > > >
> > > > > > - Add to the hregion the ability to scan through keys in column
> > order.
> > > > > > Just iterate through the SortedMap, run a filter on the key, and if
> > it
> > > > > > passes do a get on the row.
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > You'd be random reading rows.  You're OK w/ current performance?  (For
> > > > >
> > > > sure it will only improve but....).
> > > >
> > > > >
> > > > >
> > > > >
> > > > > > - Add a ColumnOrderedClientScanner which will open column order
> > > > > > scanners to all applicable hregions, and continuously pick row with
> > > > > > the lowest column value from each of the client scanners.
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > This scanner would have a significant client-side component to do the
> > > > >
> > > > arbitrage between all regions to figure the lowest column value?  If you
> > had
> > > > a new type of 'region' -- one denoted by lowest and upper column then
> > the
> > > > client-side logic would fade away and your scanner would look like
> > current
> > > > scanners.
> > > >
> > > > >
> > > > >
> > > > > > - Region splits should be easy enough, just a scan through the
> > > > > > SortedMap to partition.
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > Splits would not be row-based and run as they currently do, but rather
> > > > >
> > > > sorted-column based?
> > > >
> > > > >
> > > > >
> > > > >
> > > > > > Of course, the index could also be used for more efficient querying
> > on
> > > > > > the indexed column's values.
> > > > > >
> > > > > > Do other users have a need for this functionality?
> > > > > >
> > > > > > What do developers think about this? I know hbase is more intended
> > for
> > > > > > back-end batch style processing, but we have this need.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > How are you thinking of adding in this new functionality?  Subclassing
> > > > >
> > > > HRegionServer?
> > > >
> > > > >
> > > > > St.Ack
> > > > >
> > > > >
> > > > >
> > > > > > Cheers,
> > > > > > -clint
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > >
> >
> >
>



-- 
B. Regards,
Edward J. Yoon

Reply via email to