For #2, I'm of the opinion that this should be handled above the Thrift
level because it adds significant complexity to multiple components of
Thrift, it is not easy to add on a language-by-language basis, and I
don't think it can be done in a way that will be "right" for all users.

I have implemented a very simple client and server in Java that wrap 'CALL' messages in a message specifying the name of the service. The wrapping message has a new message type, SERVICE_SELECTION, and its message name is an arbitrary identifier that the server uses to look up the processor in a map. On the client side I have to wrap an instance of the static inner class 'Client' in a dynamic proxy that wraps the 'CALL' message for each invocation of a Thrift function. The responses are not modified or wrapped.
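
To make the server side concrete, here is a minimal sketch of the map lookup. It is not the attached code: the class name 'ServiceRegistrySketch' is made up, and processors are modelled as plain String functions instead of real Thrift processors. The client-side proxy is not shown.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Sketch only: a processor is modelled as a String -> String function,
// not as a real Thrift processor.
class ServiceRegistrySketch {
    // Same value as the constant added to 'TMessageType'.
    static final byte SERVICE_SELECTION = 4;

    private final Map<String, UnaryOperator<String>> processors = new HashMap<>();

    void register(String serviceName, UnaryOperator<String> processor) {
        processors.put(serviceName, processor);
    }

    // Handles one SERVICE_SELECTION message whose body is the wrapped 'CALL'.
    String dispatch(byte messageType, String serviceName, String wrappedCall) {
        if (messageType != SERVICE_SELECTION) {
            throw new IllegalArgumentException("expected SERVICE_SELECTION, got " + messageType);
        }
        UnaryOperator<String> processor = processors.get(serviceName);
        if (processor == null) {
            // No processor registered under this name.
            return "EXCEPTION unknown service: " + serviceName;
        }
        return processor.apply(wrappedCall);
    }

    public static void main(String[] args) {
        ServiceRegistrySketch registry = new ServiceRegistrySketch();
        registry.register("calculator", call -> "REPLY to " + call);
        System.out.println(registry.dispatch(SERVICE_SELECTION, "calculator", "CALL add"));
        System.out.println(registry.dispatch(SERVICE_SELECTION, "missing", "CALL add"));
    }
}
```

The message name is the only key, so a processor can be registered under any identifier the client and server agree on.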

I can mix synchronous and asynchronous calls to different functions of different services over a single connection. If a request is made for a service that does not exist, an exception is returned and the connection remains usable for subsequent calls, because the server skips the 'CALL' message and the struct holding the parameters.
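
The sketch below illustrates why skipping keeps the connection usable. It assumes a made-up wire format (service name, payload length, payload) so that the skip is a simple byte count; the real implementation skips by reading past the 'CALL' message and the parameter struct. All names here are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

class SkipUnknownServiceSketch {
    // Assumed format: UTF service name, 4-byte payload length, payload bytes.
    static void writeMessage(DataOutputStream out, String service, String call) throws IOException {
        byte[] payload = call.getBytes(StandardCharsets.UTF_8);
        out.writeUTF(service);
        out.writeInt(payload.length);
        out.write(payload);
    }

    static String handleNext(DataInputStream in, Set<String> knownServices) throws IOException {
        String service = in.readUTF();
        int length = in.readInt();
        if (!knownServices.contains(service)) {
            // Consume the wrapped call so the stream is positioned at the next message.
            int remaining = length;
            while (remaining > 0) {
                int skipped = in.skipBytes(remaining);
                if (skipped <= 0) {
                    throw new EOFException("stream ended inside a skipped message");
                }
                remaining -= skipped;
            }
            return "EXCEPTION unknown service: " + service;
        }
        byte[] payload = new byte[length];
        in.readFully(payload);
        return service + " handled " + new String(payload, StandardCharsets.UTF_8);
    }

    // Sends a call to an unknown service followed by a call to a known one.
    static List<String> demo() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            writeMessage(out, "missing", "CALL ping()");
            writeMessage(out, "calculator", "CALL add(1, 2)");

            DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            Set<String> known = Collections.singleton("calculator");
            List<String> replies = new ArrayList<>();
            replies.add(handleNext(in, known));
            replies.add(handleNext(in, known));
            return replies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

If the server returned the exception without consuming the rest of the message, the second read would start in the middle of the first call and the connection would have to be dropped.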

A minor disadvantage of this approach is that the output gets flushed twice: once by class 'Client' to flush the 'CALL' message and once by the proxy to flush the end of the 'SERVICE_SELECTION' message. Depending on the protocol and transport used, this may affect performance by causing two network packets to be sent instead of one.

What do you think of this approach?

For #3, I would recommend just setting a send/recv timeout in the client
transport (C++ and PHP support this for sure).  If the request times out,
an exception will be thrown.

For #4, we depend on the TCP checksum to detect corruption at this point,
though a checksumming framed transport would be a nice feature.
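
For illustration only, a checksummed frame could look roughly like the sketch below. The layout (4-byte length, payload, 4-byte CRC32 of the payload) and the class name are assumptions; nothing like this exists in Thrift as far as this thread says.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

class ChecksummedFrameSketch {
    // Assumed layout: 4-byte payload length, payload, 4-byte CRC32 of the payload.
    static byte[] encode(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length + 4)
                .putInt(payload.length)
                .put(payload)
                .putInt(checksum(payload))
                .array();
    }

    // Returns the payload, or throws if the frame is malformed or corrupted.
    static byte[] decode(byte[] frame) {
        if (frame.length < 8) {
            throw new IllegalArgumentException("frame too short");
        }
        ByteBuffer buffer = ByteBuffer.wrap(frame);
        int length = buffer.getInt();
        if (length != frame.length - 8) {
            throw new IllegalArgumentException("bad frame length");
        }
        byte[] payload = new byte[length];
        buffer.get(payload);
        int expected = buffer.getInt();
        if (checksum(payload) != expected) {
            throw new IllegalStateException("checksum mismatch");
        }
        return payload;
    }

    private static int checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return (int) crc.getValue(); // low 32 bits hold the CRC
    }

    public static void main(String[] args) {
        byte[] frame = encode("CALL add".getBytes());
        System.out.println(new String(decode(frame)));
    }
}
```

A receiver that gets a checksum mismatch still has to decide what to do with the connection, which is the question discussed next.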

I have not implemented any data corruption or timeout handling yet. I am thinking about just dropping the connection on the side that detects the error. This way I do not have to invent and implement a way for the client and the server to negotiate a reset when one of them encounters an error.

Dropping a connection and re-establishing a new one is expensive, but I think a custom reset would probably require a sequence of 'magic' bytes, which in turn requires escaping and unescaping that sequence in the normal data. This would add extra processing to all normal operations for the unlikely event of a communication error. I also think that negotiating a custom reset would probably be at least as expensive as the TCP traffic needed to close the current connection and open a new one.

There is no session or similar state associated with a connection, so when the client reconnects it can simply continue where it left off (after refreshing its state).
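
A rough sketch of the drop-and-reconnect idea on the client side, using plain java.net sockets instead of a Thrift transport. The class and method names are made up, and the "server" is a local socket that never answers, so that the read timeout fires.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class DropAndReconnectSketch {
    // Waits for the first byte of a reply. On timeout or end of stream the
    // connection is dropped and false is returned; no reset is negotiated.
    static boolean awaitReply(Socket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis);
        try {
            if (socket.getInputStream().read() >= 0) {
                return true;
            }
        } catch (SocketTimeoutException e) {
            // Fall through and drop the connection.
        }
        socket.close();
        return false;
    }

    // Returns {replied, firstConnectionDropped, reconnected}.
    static boolean[] demo() {
        // The server socket never accepts or writes, so every read times out.
        try (ServerSocket silentServer = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            InetAddress host = silentServer.getInetAddress();
            int port = silentServer.getLocalPort();

            Socket first = new Socket(host, port);
            boolean replied = awaitReply(first, 200);
            boolean dropped = first.isClosed();

            // No session is tied to the old connection, so a fresh one is enough.
            boolean reconnected;
            try (Socket second = new Socket(host, port)) {
                reconnected = second.isConnected();
            }
            return new boolean[] {replied, dropped, reconnected};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("replied=" + result[0] + " dropped=" + result[1] + " reconnected=" + result[2]);
    }
}
```

The same handling would apply when corruption is detected: close the socket and let the client open a new connection.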

Is this a viable strategy for handling communication errors?


I attached the source files of the implementation. To try them out you have to include the generated JavaBean classes of the tutorial in the class path, and add the following to 'TMessageType':
  public static final byte SERVICE_SELECTION  = 4;

--
Kind regards,

Johan Stuyts
