Proposal to add per call tracing to thrift

John Song Thu, 08 Oct 2009 15:21:01 -0700

Hi,

I am proposing an additional to the thrift implementation that enable per call 
tracing.


Objective:
Currently, a multi thread thrift service has no tracing capability built in.  A 
per call tracing works as following:
Assume that we have a thrift service MyService.  With a php client, it looks 
like

$client = new MyService(...);

// a normal call
$ret = $client->foo();

// enable tracing for the next call
Ttracer::triggerLevel(Ttracer::LEVEL_INFO);

// call again
$ret = $client->foo();

// get trace data
$data = Ttracer::getData();

// dump data
foreach ($data as $d) {
    echo $d[0] . “: “ . $d[1] . “\n”;
}
// where $d[0] is a timestamp of a float and precision in usc, and $d[1] is a 
string.

On the server side, in the service handler, we sprinkle the following calls 
(c++ server):

TTracer::trace(Ttracer::LEVEL_ERROR, “max iteration reached.  Bail out.”);

Why we need it:

 1.  For runtime debugging, profiling.
 2.  For monitoring.

Background:
While each service are free to add additional parameters to encapsulating the 
addtion of tracer flag, and tracer data, The proposed scheme will:

 1.  completely backward compatible.  Old server with new client, new server 
with old client are all interoperable.
 2.  No need to modify existing thrift definition, or change new definition to 
accommodate tracing.
 3.  only change at the protocol level.
 4.  Almost no overhead.

The proposal takes advantages of the following facts:

 1.  Thrift doesn’t defined how to encapsulate a function call, specifically, 
how function parameters and function results are encapsulated.  Thrift 
implementation uses struct for the function argument and function result, i.e. 
All function arguments are grouped together in generated structs: 
$(SERVICE)_$(FUNCNAME)_args, $(SERVICE)_$(FUNCNAME)_pargs for function 
arguments, and $(SERVICE)_$(FUNCNAME)_result for function results.
 2.  a thrift generated struct’s read function skip any field who’s field id it 
doesn’t understand so a field injection is safe with services built with older 
thrift generator.
 3.  Each service request is handled by one thread.

The proposal calls for the creation of a Ttracer class that use Thread Local 
Storage (TLS) to store tracer flag and tracer data.  Tracer flag is injected 
into a function argument if tracing is requested on the client side.  On the 
server side, that trace flag is extracted in struct reader by noticing a well 
known field id, 0x1001.  Server tracing is enabled.  Trace calls are executed 
and data collected.  When writing function result back to client, if tracing is 
enabled, and there is trace data, then trace data is injected into function 
result struct using another well known field id: 0x1002.

Proposal:

 1.  reserve field id larger than 0x1000 for internal use.  Field id is of type 
i16.  It is reasonable to assume that there will be no reasonable function that 
has more than 4096 parameters.
 2.  reserve 0x1001 field id for setting trace level, which is of type i32.
 3.  reserve 0x1002 field id for trace data, which is of type, in c++ parlance, 
vector<pair<double, string> >.
 4.  in struct reader, add a case statement such that when 0x1001 is 
encountered extract trace level on the server side and set TLS storage trace 
level.
 5.  in struct reader, add a case statement such that when 0x1002 is 
encountered extract trace data on the client side and store it in a TLS 
variable.
 6.  in struct writer, if the strut is a function result then injecting trace 
data into the function result with field id 0x1002
 7.  in struct writer, if the struct is a function argument then inject trace 
level into the function parameter with field id 0x1001.

Overhead:
The overhead of this change should be minimum to none when tracing is not 
enabled.  The injection and  extraction only happens when enabled.  Testing for 
whether or not tracing is enabled is an int var comparison.

I have completed coding on cpp parser, generator, client, server 
implementation, php generator and client (there is no php server) and am 
pending reviews.

Please let me know your thoughts.

Thanks,

-john song

Proposal to add per call tracing to thrift

Reply via email to