Thanks Luke, I'm starting out on the Thrift interface now.

-Sanjit

On Sun, Jan 16, 2011 at 7:51 PM, Luke Lu <[email protected]> wrote:

> The design looks reasonable, but "callback" is not an appropriate
> term for async results returned from the Thrift broker, as they will
> not be called back by anyone. The standard term for the result of an
> async computation is "future" (as implemented in the
> java.util.concurrent package). I'd suggest that you take a look at
> the Java Future API design. IMO, Future cancellation needs to be
> implemented for long scans.
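To make the Future suggestion concrete, here is a minimal C++ sketch of a
future-like handle with cancellation. CancellableFuture and all of its
members are hypothetical names, not part of Hypertable or the ThriftBroker;
the sketch only mirrors the general shape of java.util.concurrent.Future
(get, cancel, is-cancelled, is-done) and assumes a default-constructible
result type.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <utility>

// Hypothetical future-like handle for one async result (e.g. a scan).
// T is assumed to be default-constructible and copyable.
template <typename T>
class CancellableFuture {
 public:
  CancellableFuture() : m_has_value(false), m_cancelled(false) {}

  // Producer side: publish the result. Returns false if the result was
  // dropped because the consumer cancelled or a value was already set.
  bool set_value(T value) {
    std::lock_guard<std::mutex> lock(m_mutex);
    if (m_cancelled || m_has_value) return false;
    m_value = std::move(value);
    m_has_value = true;
    m_cond.notify_all();
    return true;
  }

  // Consumer side: request cancellation. Returns false if the future has
  // already completed or was already cancelled.
  bool cancel() {
    std::lock_guard<std::mutex> lock(m_mutex);
    if (m_cancelled || m_has_value) return false;
    m_cancelled = true;
    m_cond.notify_all();
    return true;
  }

  bool is_cancelled() const {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_cancelled;
  }

  // True once a value has been set or the future has been cancelled.
  bool is_done() const {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_has_value || m_cancelled;
  }

  // Blocks until a value arrives or the future is cancelled.
  // Returns true and fills 'out' on success, false on cancellation.
  bool get(T &out) {
    std::unique_lock<std::mutex> lock(m_mutex);
    m_cond.wait(lock, [this] { return m_has_value || m_cancelled; });
    if (!m_has_value) return false;
    out = m_value;
    return true;
  }

 private:
  mutable std::mutex m_mutex;
  std::condition_variable m_cond;
  T m_value;
  bool m_has_value;
  bool m_cancelled;
};
```

For a long scan, the producer would presumably check is_cancelled() between
ScanBlocks and stop fetching once it returns true; that part is left out
here because it depends on scanner internals.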
>
> __Luke
>
> On Tue, Jan 11, 2011 at 11:15 AM, Sanjit Jhala <[email protected]> wrote:
> > Currently the Hypertable client exposes only a synchronous API. This means
> > that if an application wants to read from or write to multiple tables, it
> > has to issue the calls sequentially and block until each call completes.
> > This is also true for application-managed secondary index tables. Being
> > able to issue asynchronous scans and updates should greatly reduce overall
> > application latency for these cases.
> > I'd like to propose the following design for such an asynchronous API,
> > which will cover both the C++ client and the Thrift interface.
> > 1. C++ Client Library
> > The C++ client library will provide an abstract callback interface.
> > Applications will implement their own callbacks to deal with the results
> > from async reads/writes.
> > For scans (reads) the callback will get called by the Scanner whenever it
> > receives a new ScanBlock from the RangeServers. For updates (writes) the
> > callback will get called whenever an update operation completes.
> > Also, auto flushing (when per-RangeServer mutator buffers fill up) will be
> > disabled for asynchronous mutators.
> > The interface will look like:
> >
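> > class ResultCallbackInterface {
> >   public:
> >     virtual ~ResultCallbackInterface() {}
> >     // Scan (read) notifications, one call per ScanBlock or failure
> >     virtual void scan_error(TableScannerPtr &scanner, int32 error,
> >                             const String &error_msg)=0;
> >     virtual void scan_ok(TableScannerPtr &scanner, vector<Cells> &cells)=0;
> >     // Update (write) notifications, one call per completed update
> >     virtual void update_ok(TableMutatorPtr &mutator,
> >                            FailedMutations failed_mutations)=0;
> >     virtual void update_error(TableMutatorPtr &mutator, int32 error,
> >                               String error_msg)=0;
> > };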
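As an illustration of how an application might implement this interface,
here is a small self-contained sketch. The interface is restated with
explicit void return types and a virtual destructor so that it compiles.
The type definitions at the top (Cell, Cells, FailedMutations, TableScanner,
TableMutator and the Ptr typedefs) are placeholder stand-ins, not the real
Hypertable types, and CountingCallback is a hypothetical example class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Placeholder stand-ins so the sketch compiles on its own.
// The real Hypertable definitions will differ.
typedef std::string String;
typedef int32_t int32;
struct Cell { String row; String value; };
typedef std::vector<Cell> Cells;
typedef std::vector<Cell> FailedMutations;
struct TableScanner {};
struct TableMutator {};
typedef std::shared_ptr<TableScanner> TableScannerPtr;
typedef std::shared_ptr<TableMutator> TableMutatorPtr;

// The proposed interface, with return types filled in as void.
class ResultCallbackInterface {
 public:
  virtual ~ResultCallbackInterface() {}
  virtual void scan_error(TableScannerPtr &scanner, int32 error,
                          const String &error_msg) = 0;
  virtual void scan_ok(TableScannerPtr &scanner,
                       std::vector<Cells> &cells) = 0;
  virtual void update_ok(TableMutatorPtr &mutator,
                         FailedMutations failed_mutations) = 0;
  virtual void update_error(TableMutatorPtr &mutator, int32 error,
                            String error_msg) = 0;
};

// Hypothetical application callback that only tallies what it is told.
class CountingCallback : public ResultCallbackInterface {
 public:
  CountingCallback()
      : cells_seen(0), scan_errors(0), failed_mutations(0), update_errors(0) {}

  void scan_error(TableScannerPtr &, int32, const String &) override {
    ++scan_errors;
  }
  void scan_ok(TableScannerPtr &, std::vector<Cells> &cells) override {
    // Count every cell in every block handed to us in this call.
    for (std::size_t i = 0; i < cells.size(); ++i)
      cells_seen += cells[i].size();
  }
  void update_ok(TableMutatorPtr &, FailedMutations failed) override {
    failed_mutations += failed.size();
  }
  void update_error(TableMutatorPtr &, int32, String) override {
    ++update_errors;
  }

  std::size_t cells_seen;
  std::size_t scan_errors;
  std::size_t failed_mutations;
  std::size_t update_errors;
};
```

Since the callbacks would be invoked from the client library's threads, a
real implementation would likely need locking around its counters; that is
omitted here for brevity.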
> > 2. Thrift interface
> > The ThriftBroker will implement a Callback class which will use a queue to
> > transform asynchronous API calls into synchronous ThriftBroker calls. There
> > will also be a new Result object which will encapsulate the operation type
> > (scan/update), results/acknowledgment and errors (if any).
> > class ThriftResultCallback {
> >   public:
> >     // Synchronous method: returns true with the next result as it
> >     // arrives, or false once all results have arrived
> >     bool get_result(Result &result);
> >     // Convenience method which blocks until all updates complete
> >     bool wait_for_updates_to_complete(FailedUpdates &failed_updates);
> >
> >     // These methods enqueue results as they arrive; the results are
> >     // later served to the application via get_result() calls
> >     void scan_error(TableScannerPtr &scanner, int32 error,
> >                     const String &error_msg);
> >     void scan_ok(TableScannerPtr &scanner, vector<Cells> &cells);
> >     void update_ok(TableMutatorPtr &mutator,
> >                    FailedMutations failed_mutations);
> >     void update_error(TableMutatorPtr &mutator, int32 error,
> >                       String error_msg);
> >   private:
> >     ResultQueue m_results;
> > };
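The queue-based bridging from asynchronous callbacks to a synchronous
get_result() call could look roughly like the sketch below. It is only an
illustration under stated assumptions: Result is reduced to a type tag plus
a string payload, and ResultQueueSketch, operation_started() and the 'last'
flag on enqueue() are hypothetical names invented here to track outstanding
scans/updates. None of them come from the Hypertable code base.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>

// Simplified stand-in for the proposed Result object.
struct Result {
  enum Type { SCAN, UPDATE } type;
  std::string payload;
};

class ResultQueueSketch {
 public:
  ResultQueueSketch() : m_outstanding(0) {}

  // Called when an async scanner is created or a mutator is flushed.
  void operation_started() {
    std::lock_guard<std::mutex> lock(m_mutex);
    ++m_outstanding;
  }

  // Called from the scan_ok/update_ok style callbacks. 'last' marks the
  // final result of an operation, so the outstanding count can drop.
  void enqueue(const Result &result, bool last) {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_results.push_back(result);
    if (last && m_outstanding > 0) --m_outstanding;
    m_cond.notify_all();
  }

  // Blocks until a result is available. Returns false once the queue is
  // empty and no scans/updates are outstanding.
  bool get_result(Result &out) {
    std::unique_lock<std::mutex> lock(m_mutex);
    m_cond.wait(lock, [this] {
      return !m_results.empty() || m_outstanding == 0;
    });
    if (m_results.empty()) return false;
    out = m_results.front();
    m_results.pop_front();
    return true;
  }

 private:
  std::mutex m_mutex;
  std::condition_variable m_cond;
  std::deque<Result> m_results;
  std::size_t m_outstanding;
};
```

Error results would go through the same enqueue() path with 'last' set, so
that a failed operation also releases a waiting get_result() caller.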
> > Pseudocode for a sample Thrift application:
> > rc = create_result_callback();
> > // create some asynchronous scanners and mutators
> > m1 = create_async_mutator(…, rc);
> > m2 = create_async_mutator(…, rc);
> > // kick off scans
> > s1 = create_async_scanner(…, rc);
> > …
> > …
> > // buffer updates locally
> > m1.set_cells(…);
> > …
> > mn.set_cells(…);
> > ...
> > // issue updates
> > m1.flush();
> > m2.flush();
> > …
> > mn.flush();
> > // deal with write acks and scan results as they appear
> > while (get_results(rc, rr)) {
> >   switch (rr.type) {
> >     case (SCAN):
> >       …
> >     case (UPDATE):
> >       ...
> >   }
> > }
> > // issue a set of writes
> > m1.set_cells(…);
> > m2.set_cells(…);
> > m1.flush();
> > m2.flush();
> > // wait for all writes to complete
> > has_error = wait_for_updates_to_complete(rc);
> > Implementation notes:
> > The ThriftResultCallback object uses m_results to enqueue results for
> > consumption by the application. Each synchronous call to get_result() will
> > pop a result off the queue or return false if there are no outstanding
> > scans/updates. For scans, it will also buffer results by scanner so that
> > the application doesn't have to make too many Thrift calls for scans which
> > return a small set of results from a large set of ScanBlocks.
> > For the case where a slow application is reading a massive amount of data,
> > the callback will need some way to pause the queue and scanners to avoid
> > being overwhelmed while the application catches up.
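One possible way to get the pause behaviour described above is a bounded
queue: the producer (the callback) blocks, or is told to back off, once a
fixed number of results are waiting. The sketch below is an assumption about
how that could be done, not a description of the planned implementation.
BoundedResultQueue and its methods are hypothetical names, and results are
reduced to strings.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>

class BoundedResultQueue {
 public:
  explicit BoundedResultQueue(std::size_t capacity) : m_capacity(capacity) {}

  // Non-blocking push. Returns false when the queue is full, which the
  // caller can treat as a signal to pause the scanner.
  bool try_push(const std::string &block) {
    std::lock_guard<std::mutex> lock(m_mutex);
    if (m_queue.size() >= m_capacity) return false;
    m_queue.push_back(block);
    return true;
  }

  // Blocking push. Waits until the consumer has made room, which
  // throttles a producer that is faster than the application.
  void push(const std::string &block) {
    std::unique_lock<std::mutex> lock(m_mutex);
    m_not_full.wait(lock, [this] { return m_queue.size() < m_capacity; });
    m_queue.push_back(block);
  }

  // Non-blocking pop. Frees a slot and wakes one waiting producer.
  bool try_pop(std::string &out) {
    std::lock_guard<std::mutex> lock(m_mutex);
    if (m_queue.empty()) return false;
    out = m_queue.front();
    m_queue.pop_front();
    m_not_full.notify_one();
    return true;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_queue.size();
  }

 private:
  mutable std::mutex m_mutex;
  std::condition_variable m_not_full;
  std::deque<std::string> m_queue;
  const std::size_t m_capacity;
};
```

Blocking inside the callback would stall whichever client thread delivers
ScanBlocks, so the non-blocking try_push() variant, combined with an
explicit pause/resume on the scanner, may be the safer choice.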
> > Any thoughts?
> > -Sanjit
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Hypertable Development" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/hypertable-dev?hl=en.
> >
>
