Hi Paul, I was wondering what is done "under the covers", or rather, how you set [startKey, endKey] for a Scan object. Please explain more, thanks a lot.
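
To make the question concrete, here is a rough sketch of what I imagine might be happening. It is only my guess and not HBql's actual code: row keys are treated as plain longs instead of byte[] values, fakeScan is a stand-in for a real HBase Scan over [startKey, endKey), and the names KeyRangeSplitter, split, fakeScan and scanAll are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names come from HBql or HBase.
final class KeyRangeSplitter {

    /** Splits [startKey, endKey) into at most n contiguous, non-overlapping sub-ranges. */
    static List<long[]> split(long startKey, long endKey, int n) {
        if (n <= 0 || endKey < startKey) {
            throw new IllegalArgumentException("need n > 0 and startKey <= endKey");
        }
        List<long[]> ranges = new ArrayList<>();
        long total = endKey - startKey;
        long base = total / n;        // minimum number of keys per sub-range
        long remainder = total % n;   // the first 'remainder' sub-ranges get one extra key
        long lo = startKey;
        for (int i = 0; i < n && lo < endKey; i++) {
            long hi = lo + base + (i < remainder ? 1 : 0);
            ranges.add(new long[] {lo, hi});
            lo = hi;                  // the next sub-range starts where this one ends
        }
        return ranges;
    }

    /** Stand-in for one Scan over [lo, hi): it only counts the keys in the range. */
    static long fakeScan(long lo, long hi) {
        return hi - lo;
    }

    /** Runs one fake scan per sub-range on a fixed-size thread pool and sums the counts. */
    static long scanAll(long startKey, long endKey, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (long[] r : split(startKey, endKey, threads)) {
                final long lo = r[0];
                final long hi = r[1];
                Callable<Long> task = () -> fakeScan(lo, hi);
                futures.add(pool.submit(task));
            }
            long rows = 0;
            for (Future<Long> f : futures) {
                rows += f.get();      // blocks until that sub-range has finished
            }
            return rows;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        for (long[] r : split(0, 10, 3)) {
            System.out.println("[" + r[0] + ", " + r[1] + ")");
        }
        System.out.println(scanAll(0, 1000, 20));
    }
}
```

With real row keys the split would have to work on byte[] values. I believe the Bytes utility class in org.apache.hadoop.hbase.util has a split helper for that, but I do not know whether HBql does anything like this, which is why I am asking.
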
As a comparison, the Map-Reduce support for HBase splits a whole-table scan into several sub-tasks, and each task handles a region. You can check this implementation in the HBase source code, package org.apache.hadoop.hbase.mapreduce.

The advantage of Map-Reduce is that the tasks can run on more machines, and I think that is critical for scaling out.

On Wed, Jan 6, 2010 at 1:52 PM, Paul Ambrose <[email protected]> wrote:

> You create a Query Executor and assign it to a connection.
> Subsequent queries on that connection will use the query executor such that
> each key or key range of a query will be executed in the thread pool
> of the executor (as a Get or Scan under the covers). So if you create
> a query executor with 20 threads and perform a query with 1000 keys,
> HBql will execute 20 HBase gets at a time (1000 times) and return the
> results accordingly. The allocation of the gets and scans to the executor
> thread pool is done under the covers. The user simply assigns the executor,
> issues the query, and iterates through the results.
>
> HBql does not presently expose async queries in the client API. It is async
> under the covers, so surfacing it would not be hard. I think async results would
> be nice, so I will add it to the list.
>
> HBql also does not leverage Map-Reduce. I am all ears on suggestions.
>
> Cheers,
> Paul
>
>
> On Jan 5, 2010, at 9:27 PM, Ken Yang wrote:
>
> > Thanks, Paul.
> >
> > But about the Query Executor in HBql, I just found limited info from
> > http://www.hbql.com/statements/create-executor-pool.html
> > And I'm not sure how does it finish a scan, I mean, how does each thread get
> > a [startKey, endKey]?
> > Highly appreciated if there're more docs about the details :)
> >
> > And, about the operations need whole table scan, is there
> > 1) asynchronous operations supported?
> > 2) Map-Reduce leveraged?
> >
> >
> > Best Regards
> > Ken Yang [Yang ZhiYong]
> >
> > On Wed, Jan 6, 2010 at 12:58 PM, Paul Ambrose <[email protected]> wrote:
> >
> >> Thanks Ken.
> >>
> >> The laws of HBase physics still apply. If you execute a single scan
> >> on all the rows of a table, it is likely to be expensive. HBql allows you
> >> to easily fire off N concurrent queries (as a series of N contiguous key
> >> ranges in HBql) which might give better performance. You can configure a
> >> Query Executor in HBql to have the appropriate number of threads. I know
> >> very little about HBase internals, so I am not sure what the expected
> >> improvement of such concurrency might be.
> >>
> >> Cheers,
> >> Paul
> >>
> >>
> >> On Jan 5, 2010, at 7:24 PM, Ken Yang wrote:
> >>
> >>> Hi Paul,
> >>>
> >>> Very impressive Job!
> >>> And I have a question, how do you handle a query which need to scan a whole
> >>> table? e.g. query on a Non-indexed column.
> >>>
> >>> Will the operation cost too much time?
> >>>
> >>>
> >>> On Wed, Jan 6, 2010 at 4:49 AM, Paul Ambrose <[email protected]> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I have been working on an abstraction layer for HBase that I hope
> >>>> HBase users will find helpful.
> >>>>
> >>>> Highlights include:
> >>>> * A dialect of SQL for HBase (usable in the console, scripts, and code)
> >>>> * JDBC bindings
> >>>> * JDBC-like bindings that support annotated objects and generics
> >>>> * Query Executors that make threaded result reading simple
> >>>> * Simplified filter writing for server and/or client
> >>>> * Index support
> >>>>
> >>>> HBql is a work in progress and I am open to feedback and suggestions.
> >>>> I am still working on the docs, so the examples and javadocs are pretty
> >>>> lame.
> >>>>
> >>>> Have a look at: http://www.hbql.com
> >>>>
> >>>> Cheers,
> >>>> Paul
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best Regards
> >>> Ken Yang [Yang ZhiYong]
> >>
> >>
> >
> >
> >
> > --
> > Best Regards
> > Ken Yang [Yang ZhiYong]
> >

--
Best Regards
Ken Yang [Yang ZhiYong]
