Thanks Luke, I'm starting out on the Thrift interface now. -Sanjit
On Sun, Jan 16, 2011 at 7:51 PM, Luke Lu <[email protected]> wrote:
> The design looks reasonable, but "callback" is not an appropriate
> term for async results returned from the Thrift broker, as they will
> not be called back by anyone. The standard term for the result of an
> async computation is "future" (as implemented in the
> java.util.concurrent package). I'd suggest that you take a look at
> the Java Future API design. IMO, Future cancellation needs to be
> implemented for long scans.
>
> __Luke
>
> On Tue, Jan 11, 2011 at 11:15 AM, Sanjit Jhala <[email protected]> wrote:
> > Currently the Hypertable client exposes only a synchronous API. This means
> > that if an application wants to read from or write to multiple tables, it
> > has to issue the calls sequentially and block until each call completes.
> > This is also true for application-managed secondary index tables. Being
> > able to issue asynchronous scans and updates should greatly reduce overall
> > application latency for these cases.
> >
> > I'd like to propose the following design for such an asynchronous API,
> > which will cover both the C++ client and the Thrift interface.
> >
> > 1. C++ Client Library
> >
> > The C++ client library will provide an abstract callback interface.
> > Applications will implement their own callbacks to deal with the results
> > from async reads/writes.
> >
> > For scans (reads) the callback will get called by the Scanner whenever it
> > receives a new ScanBlock from the RangeServers. For updates (writes) the
> > callback will get called whenever an update operation completes.
> >
> > Also, auto flushing (when per-RangeServer mutator buffers fill up) will be
> > disabled for asynchronous mutators.
> > The interface will look like:
> >
> > class ResultCallbackInterface {
> > public:
> >   virtual void scan_error(TableScannerPtr &scanner, int32 error,
> >                           const String &error_msg) = 0;
> >   virtual void scan_ok(TableScannerPtr &scanner, vector<Cells> &cells) = 0;
> >   virtual void update_ok(TableMutatorPtr &mutator, FailedMutations) = 0;
> >   virtual void update_error(TableMutatorPtr &mutator, int32 error,
> >                             String error_msg) = 0;
> > };
> >
> > 2. Thrift interface
> >
> > The ThriftBroker will implement a Callback class which will use a queue to
> > transform asynchronous API calls into synchronous ThriftBroker calls. There
> > will also be a new Result object which will encapsulate the operation type
> > (scan/update), results/acknowledgment and errors (if any).
> >
> > class ThriftResultCallback {
> > public:
> >   // synchronous method which returns results as they arrive, and false
> >   // once all results have arrived
> >   bool get_result(Result &);
> >   // convenience method which blocks until all updates complete
> >   bool wait_for_updates_to_complete(FailedUpdates &);
> >
> >   // These methods enqueue results as they arrive; the results are later
> >   // served to the application via get_results() calls
> >   void scan_error(TableScannerPtr &scanner, int32 error,
> >                   const String &error_msg);
> >   void scan_ok(TableScannerPtr &scanner, vector<Cells> &cells);
> >   void update_ok(TableMutatorPtr &mutator, FailedMutations);
> >   void update_error(TableMutatorPtr &mutator, int32 error, String error_msg);
> > private:
> >   ResultQueue m_results;
> > };
> >
> > Pseudocode for a sample Thrift application:
> >
> > rc = create_result_callback();
> > // create some asynchronous scanners and mutators
> > m1 = create_async_mutator(…, rc);
> > m2 = create_async_mutator(…, rc);
> > // kick off scans
> > s1 = create_async_scanner(…, rc);
> > …
> > …
> > // buffer updates locally
> > m1.set_cells(…);
> > …
> > mn.set_cells(…);
> > ...
> > // issue updates
> > m1.flush();
> > m2.flush();
> > …
> > mn.flush();
> > // deal with write acks and scan results as they appear
> > while (get_results(rc, rr)) {
> >   switch(rr.type) {
> >   case (SCAN):
> >     …
> >   case (UPDATE):
> >     ...
> >   }
> > }
> > // issue a set of writes
> > m1.set_cells(…);
> > m2.set_cells(…);
> > m1.flush();
> > m2.flush();
> > // wait for all writes to complete
> > has_error = wait_for_updates_to_complete(rc);
> >
> > Implementation notes:
> >
> > The ThriftResultCallback object uses m_results to enqueue results for
> > consumption by the application. Each synchronous call to get_result() will
> > pop a result off the queue or return false if there are no outstanding
> > scans/updates. For scans, it will also buffer results by scanner so that
> > the application doesn't have to make too many Thrift calls for scans which
> > produce a small set of results from a large set of ScanBlocks.
> >
> > For the case where a slow application is reading a massive amount of data,
> > the callback will need some way to pause the queue and scanners to avoid
> > being overwhelmed while the application catches up.
> >
> > Any thoughts?
> >
> > -Sanjit
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Hypertable Development" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/hypertable-dev?hl=en.
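
For reference, the queue-based approach described in the quoted proposal (asynchronous completions enqueued, then served through a blocking get_result() that returns false once nothing is outstanding) could be sketched roughly as below. This is only an illustration in standard C++, not the ThriftBroker code: Result, ResultQueueSketch, register_operation(), operation_done() and drain_demo() are hypothetical names, and the cell data is reduced to strings.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the proposed Result object.
struct Result {
  enum class Type { SCAN, UPDATE };
  Type type = Type::SCAN;
  int32_t error = 0;               // 0 means no error
  std::string error_msg;
  std::vector<std::string> cells;  // placeholder for real cell data
};

// Sketch of a queue that turns asynchronous completions into blocking
// get_result() calls. All names here are assumptions for illustration.
class ResultQueueSketch {
public:
  // Called when an async scanner or mutator is started.
  void register_operation() {
    std::lock_guard<std::mutex> lock(m_mutex);
    ++m_outstanding;
  }

  // Called from a background thread for each result (e.g. one scan block).
  void enqueue(Result r) {
    {
      std::lock_guard<std::mutex> lock(m_mutex);
      m_results.push_back(std::move(r));
    }
    m_cond.notify_one();
  }

  // Called once when an operation has delivered its last result.
  void operation_done() {
    {
      std::lock_guard<std::mutex> lock(m_mutex);
      assert(m_outstanding > 0);
      --m_outstanding;
    }
    m_cond.notify_all();
  }

  // Blocks until a result is available. Returns false once the queue is
  // empty and no operations are outstanding.
  bool get_result(Result &out) {
    std::unique_lock<std::mutex> lock(m_mutex);
    m_cond.wait(lock, [this] {
      return !m_results.empty() || m_outstanding == 0;
    });
    if (m_results.empty())
      return false;
    out = std::move(m_results.front());
    m_results.pop_front();
    return true;
  }

private:
  std::mutex m_mutex;
  std::condition_variable m_cond;
  std::deque<Result> m_results;
  int m_outstanding = 0;
};

// Usage: a background thread plays the role of a scanner delivering three
// blocks; the caller drains them synchronously and counts them.
inline int drain_demo() {
  ResultQueueSketch q;
  q.register_operation();
  std::thread producer([&q] {
    for (int i = 0; i < 3; ++i) {
      Result r;
      r.type = Result::Type::SCAN;
      r.cells.push_back("cell-" + std::to_string(i));
      q.enqueue(std::move(r));
    }
    q.operation_done();
  });
  int received = 0;
  Result r;
  while (q.get_result(r))
    ++received;
  producer.join();
  return received;
}
```

Counting outstanding operations is what lets get_result() tell "nothing yet" apart from "all done". Pausing slow consumers, as raised in the implementation notes, would need a bound on the queue size, which this sketch leaves out.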

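
On the cancellation point raised in the quoted reply: java.util.concurrent.Future has cancel(), but std::future in C++ does not, so a C++ client would need some cooperative mechanism of its own. A minimal sketch, assuming a shared atomic flag that the scan loop checks between blocks, is below. CancellableScan and its block-counting loop are hypothetical and only stand in for a real scanner.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <memory>
#include <thread>

// Hypothetical handle pairing a std::future with a cooperative cancel flag.
class CancellableScan {
public:
  // Starts a background "scan" of total_blocks blocks.
  explicit CancellableScan(int total_blocks)
      : m_cancelled(std::make_shared<std::atomic<bool>>(false)) {
    auto flag = m_cancelled;  // shared with the worker
    m_future = std::async(std::launch::async, [flag, total_blocks] {
      int fetched = 0;
      while (fetched < total_blocks && !flag->load()) {
        // Stand-in for fetching one scan block from a range server.
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        ++fetched;
      }
      return fetched;  // number of blocks fetched before finishing or cancel
    });
  }

  // Requests cancellation; the worker stops at its next flag check.
  void cancel() { m_cancelled->store(true); }

  bool cancelled() const { return m_cancelled->load(); }

  // Blocks until the scan finishes or observes the cancel request.
  int get() { return m_future.get(); }

private:
  std::shared_ptr<std::atomic<bool>> m_cancelled;
  std::future<int> m_future;
};
```

A long scan can then be abandoned with cancel() followed by get(), which returns however many blocks were fetched before the flag was seen. Cancellation here is cooperative, so it takes effect only between blocks.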