Re: State of our RPCs

2015-12-01 Thread John R. Frank


It might be worth considering CBOR (http://cbor.io/).

jrf



Re: C++ accumulo client -- native clients for Python, Go, Ruby etc

2014-10-08 Thread John R. Frank



We're running an experiment with multiple local proxies to try to better 
understand the batching and bottlenecking issues.  Will report back.


jrf


Re: C++ accumulo client -- native clients for Python, Go, Ruby etc

2014-10-06 Thread John R. Frank
Two kinds of gains:

1) single client throughput:  the extra RPC hop through the proxy deserializes 
and then reserializes every message.  With the proxy running locally, the extra 
network hop is less of an issue.  This was discussed on the user list (see link 
earlier in this thread), and a 5x slowdown was suggested as a rough (SWAG) 
estimate. 

2) cluster management complexity: it's clearly best to have the proxy local to 
the workers, but if you have a worker on every core of a large box (e.g. 32), 
then a single proxy on each worker machine becomes a bottleneck. Running 
many proxies on a single JVM is the next thing we could try to improve this, 
but a native client seems preferable. 
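
To make the two effects concrete, here is a back-of-envelope sketch in Python. 
Only the 5x factor comes from this thread, and it is a rough guess itself; the 
function name, the rates, and the proxy capacity are made-up placeholders, not 
measurements.

```python
# Toy model of per-worker throughput behind shared thrift proxies.
# All inputs are assumptions for illustration, not measured values.

def per_worker_rate(direct_rate, proxy_slowdown, proxy_capacity,
                    workers, proxies):
    """Estimate the rate (e.g. mutations/sec) each worker sees.

    direct_rate    -- rate one native client could sustain (assumed)
    proxy_slowdown -- factor lost to deserializing and reserializing
    proxy_capacity -- total rate one proxy process can relay (assumed)
    workers        -- worker processes on the box
    proxies        -- proxy processes on the box
    """
    # Effect 1: the extra hop slows a single client down.
    through_proxy = direct_rate / proxy_slowdown
    # Effect 2: workers sharing a proxy split its capacity evenly.
    workers_per_proxy = workers / proxies
    shared = proxy_capacity / workers_per_proxy
    # A worker is limited by whichever constraint is tighter.
    return min(through_proxy, shared)


if __name__ == "__main__":
    for n in (1, 4, 32):
        rate = per_worker_rate(100_000, 5, 80_000, workers=32, proxies=n)
        print(n, "proxies:", rate, "per worker")
```

With these placeholder inputs, one proxy shared by 32 workers limits each 
worker to 2,500/sec, four proxies to 10,000/sec, and one proxy per worker to 
20,000/sec, at which point the 5x hop cost is the remaining limit.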


Comments?

jrf


 On Oct 6, 2014, at 4:15 PM, David Medinets david.medin...@gmail.com wrote:
 
 How far away from the theoretical maximum rate is the thrift protocol?
 What kind of gain is expected from the native C++ approach?
 
 On Sat, Oct 4, 2014 at 12:56 PM, John R. Frank j...@diffeo.com wrote:
 Accumulo Developers,
 
 We're trying to boost throughput of non-Java tools with Accumulo.  It seems 
 that the lowest hanging fruit is to stop using the thrift proxy. Per 
 discussion about Python and thrift proxy in the users list [1], I'm 
 wondering if anyone is interested in helping with a native C++ client?  
 There is a start on one here [2]. We could offer a bounty or maybe make a 
 consulting project depending on who is interested in it.
 
 We also looked at trying to run a separate thrift proxy for every worker 
 thread or process.  With many cores on a box, e.g. 32, it just doesn't seem 
 practical to run that many proxies, even if they all run on a single JVM. 
 We'd be glad to hear ideas on that front too.
 
 A potentially big benefit of making a proper C++ accumulo client is that it 
 is straightforward to expose native interfaces in Python (via pyObject), Go 
 [3], Ruby [4], and other languages.
 
 Thanks for any advice, pointers, interest.
 
 John
 
 
 [1] http://www.mail-archive.com/user@accumulo.apache.org/msg03999.html
 [2] https://github.com/phrocker/apeirogon
 [3] http://golang.org/cmd/cgo/
 [4] https://www.amberbit.com/blog/2014/6/12/calling-c-cpp-from-ruby/
 
 