Re: THRIFT-66 - Bidirectional communication

James E. King III Sat, 18 May 2019 04:44:21 -0700

On Sat, May 18, 2019 at 2:39 AM John Dougrez-Lewis <[email protected]> wrote:
>
> Hi,
>
> > At least PUB/SUB can be implemented solely on the transport level. No IDL 
> > change necessary.
>
> My understanding is that Thrift generally provides only synchronous RPC, 
> with one synchronous return per call. How would multiple subsequent 
> Subscription updates be sent back to the Client?


It may be helpful to separate RPC behavior from client behavior:

Thrift supports "roundtrip" RPC, which means a request generates a reply.
Thrift also supports "oneway" RPC, which means a request will not
generate a reply (although the transport layer may require one, like
HTTP does).

Thrift client runtime implementations typically support blocking calls
for "roundtrip".
Thrift client runtime implementations typically block on "oneway" only
to queue the message outbound (i.e. if the queue is full, the RPC request
blocks until space frees up or other socket activity, like a disconnect,
occurs).
Some thrift client runtime implementations provide non-blocking call
behavior for "roundtrip" meaning they accept responsibility for the
message then return control to you, and call a callback you provided
with the message when a reply comes in.
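
To make the difference concrete, here is a minimal sketch in plain
Python. It does not use the Thrift library at all: NonBlockingClient,
server_fn and the in-process queue are made-up stand-ins for a client
runtime, a remote service and a transport. It only illustrates how a
runtime can pair replies with callbacks by sequence id while a
"oneway" send registers nothing.

```python
import queue
import threading


class NonBlockingClient:
    """Toy client (hypothetical, not a Thrift class)."""

    def __init__(self, server_fn):
        self._server_fn = server_fn   # stands in for the remote service
        self._pending = {}            # seqid -> callback awaiting a reply
        self._lock = threading.Lock()
        self._next_seqid = 0
        self._wire = queue.Queue()    # stands in for the transport
        self._reader = threading.Thread(target=self._read_loop, daemon=True)
        self._reader.start()

    def call(self, request, callback):
        """Roundtrip: register the callback, queue the request, return."""
        with self._lock:
            self._next_seqid += 1
            seqid = self._next_seqid
            self._pending[seqid] = callback
        self._wire.put((seqid, request))
        return seqid

    def send_oneway(self, request):
        """Oneway: queue the request; no callback, no reply expected."""
        self._wire.put((None, request))

    def close(self):
        """Send a stop marker and wait for queued messages to drain."""
        self._wire.put(None)
        self._reader.join()

    def _read_loop(self):
        # In this toy the "server" runs on the reader thread.
        while True:
            item = self._wire.get()
            if item is None:
                return
            seqid, request = item
            reply = self._server_fn(request)
            if seqid is None:
                continue              # oneway: any result is discarded
            with self._lock:
                callback = self._pending.pop(seqid)
            callback(reply)
```

A caller would do something like client.call("ping", on_reply) and
carry on with other work; on_reply runs later on the reader thread.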

To implement pub/sub you could:

1. Implement two transports as we discussed: A connects to B, and in
reaction B connects to A.  A always initiates the sequence.
2. Implement a "subscribe" API, using roundtrip, A subscribes to
something B will provide.
3. Implement a "publish" API going in the other direction, using
oneway.  B publishes updates to A, doesn't care if A gets them,
doesn't care if A processes them.
4. On disconnect you should automatically unsubscribe because you
don't know if the process on A is still alive but unreachable, or the
process on A died.  Same for B.
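
The four steps above can be sketched as follows. This is plain Python
with direct method calls standing in for the two transports; the
Publisher and Subscriber classes and their method names are invented
for illustration and are not generated Thrift code.

```python
class Subscriber:
    """Process A: exposes the "publish" API that B calls, oneway-style."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def on_event(self, topic, payload):
        # Oneway semantics: nothing is returned to the publisher.
        self.received.append((topic, payload))


class Publisher:
    """Process B: exposes the "subscribe" API that A calls, roundtrip."""

    def __init__(self):
        self._subs = {}  # topic -> {subscriber name: subscriber}

    def subscribe(self, topic, subscriber):
        # Roundtrip: the caller gets an acknowledgement back.
        self._subs.setdefault(topic, {})[subscriber.name] = subscriber
        return True

    def on_disconnect(self, name):
        # Step 4: drop every subscription held by the lost peer.
        for subs in self._subs.values():
            subs.pop(name, None)

    def publish(self, topic, payload):
        # B does not care whether A receives or processes the event,
        # so delivery failures are swallowed here.
        for sub in list(self._subs.get(topic, {}).values()):
            try:
                sub.on_event(topic, payload)
            except Exception:
                pass
```

With real transports, on_disconnect would be driven by whatever
connection-closed notification the server implementation offers.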

Events / Subscriptions should never be considered reliable unless
routed through a message bus, and even then they may not be reliable -
depends on the message bus and circumstances.
If you needed it, you could make publish normal (not oneway) and that
would allow you to know whether the client processed the event, but it
isn't a guarantee and will introduce delays into your event delivery.

>
>
> > async was already a keyword long ago. It's now called "oneway".
>
> But 'oneway' provides no guaranteed deterministic mechanism of indicating 
> back to the client any subsequent processing failures of the call on the 
> server side, so it appears very fragile and so would be difficult to justify 
> it being used in any serious professional applications such as Financial 
> Trading Systems.
>

I would not say it is fragile.  A "oneway" message means the sender
does not care whether the receiver even gets it, let alone can handle
it.  It is a decent choice for event delivery in pub/sub.  It is a
foundation of a distributed messaging system to agree to deliver
messages as reliably as needed.  Some systems can provide better
guarantees than others.  If you need a guarantee that a receiver
processed or at least received a message, you can use normal (not
oneway) method calls.  This does not guarantee that the sender always
gets a reply if the receiver processes the message.  Consider what
happens if the receiver processes the message and then queues up a
reply, and then crashes before it gets out.  For reasons like this,
it's better to have the application layer handle delivery/recovery
semantics, as some applications may care about knowing something was
processed, and some may not.  You would use idempotent requests and
keep sending the same request until you got a response.  This is
generally how NFS works, for example, in a simplistic sense (it also
has a "recent request cache" it uses to debounce duplicate requests in
many implementations, providing the idempotency, since deleting
something twice is usually an error on the second call).
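
As a rough illustration of that retry pattern (not NFS code and not
Thrift code), the sketch below has the sender repeat the same request
id until a reply arrives, while the receiver keeps a recent request
cache so a repeated delete is answered from the cache rather than
executed twice. Server, LossyLink and delete_with_retry are
hypothetical names, and the cache here is unbounded for brevity where
a real one would expire old entries.

```python
class Server:
    def __init__(self):
        self.files = {"a.txt"}
        self._recent = {}     # request id -> reply already sent
        self.executions = 0   # how many times a delete really ran

    def delete(self, request_id, name):
        if request_id in self._recent:
            # Duplicate of a request we already handled: replay the reply.
            return self._recent[request_id]
        self.executions += 1
        if name in self.files:
            self.files.remove(name)
            reply = "ok"
        else:
            reply = "error: no such file"
        self._recent[request_id] = reply
        return reply


class LossyLink:
    """Loses the first drop_replies replies AFTER the server processed them."""

    def __init__(self, server, drop_replies):
        self.server = server
        self.drop_replies = drop_replies

    def delete(self, request_id, name):
        reply = self.server.delete(request_id, name)
        if self.drop_replies > 0:
            self.drop_replies -= 1
            raise TimeoutError("reply lost")
        return reply


def delete_with_retry(link, request_id, name, max_attempts=5):
    """Resend the same request id until a reply gets through."""
    for _ in range(max_attempts):
        try:
            return link.delete(request_id, name)
        except TimeoutError:
            continue
    raise TimeoutError("gave up after %d attempts" % max_attempts)
```

Without the cache, the second attempt would have reported an error
even though the first attempt succeeded.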

>
> > Not sure if I can follow. What exactly would that preprocessor do? Generate 
> > IDL from IDL?
>
> Yes.
>
>
> >> [in/out/inout] parameter
>
> > There was some discussion about this a while ago, don't remember the ticket.
> > Technically that should be possible, since the data are transferred in a
> > struct anyway, so multiple out parameters (or even in/out) should really not
> > be that hard to implement.
>
> Yes, even with just a synchronous IDL extension, out parameters can always 
> be rewritten as members of a struct in the returned value's type.
>
>
> ------------------
>
> In addition, Callback methods could be passed as arguments in the A<=>B 
> extended IDL and translated into interface methods on the B=>A async response 
> path, e.g.
>
>
> A<=>B IDL:
>
> [async] methodReturnType Method(arg1, arg2, callbackType Arg)
>
> // define callbackType signature
>
> cbReturnType callbackType(cbArgtype1, cbArgtype2)
>
> =======>
>
> generates:
>
> A=>B IDL
>
> // immediate return to provide handle for context for subsequent async return
> // requires caller to supply callbackHandle to provide context for callback
>
> handleType Method(arg1, arg2, callbackHandle)
>
> B=>A IDL
>
> // async return from method call
> void Method(handleType, methodReturnType)
>
> // call for async callback
> cbReturnType CallbackType(handleType, callbackHandle, cbArgtype1, cbArgtype2)
>

I believe the nodejs client code is "non-blocking" and supports
roundtrip as well as oneway.  The client implementation itself allows
the caller to submit a message outbound with a callback when the reply
comes in.  The other non-blocking clients would already behave the
same way.  So I still don't believe one would need any changes to the
IDL to support asynchronous client behavior.  In this case, the client
runtime for thrift handles the message and replies and deals with the
blocking read from the transport, and frees up the application to
continue and take notifications when replies arrive.

I'm not convinced an out parameter is always going to be possible with
every language.  There may be some languages where basic types are
always passed by value and/or have no concept of references or
pointers.  Those would not be able to handle "out" semantics.
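
For such languages the struct approach mentioned earlier in the thread
still works: the value that would have been an "out" parameter becomes
a field of the returned struct. A small sketch in Python, where
integers cannot be passed by reference; DivideResult and divide are
made-up names, not anything Thrift generates.

```python
from dataclasses import dataclass


@dataclass
class DivideResult:
    quotient: int    # the ordinary return value
    remainder: int   # what an [out] parameter would have carried


def divide(dividend: int, divisor: int) -> DivideResult:
    """Return both results in one struct instead of using an out parameter."""
    quotient, remainder = divmod(dividend, divisor)
    return DivideResult(quotient, remainder)
```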

- Jim

>
> -----------------
>
> Regards,
>
> John
>
>
> -----Original Message-----
> From: Jens Geyer [mailto:[email protected]]
> Sent: 17 May 2019 23:36
> To: [email protected]
> Subject: Re: THRIFT-66 - Bidirectional communication
>
> Hi,
>
> > That gets you to the point where Thrift supports and generates
> > bidirectional, async & pub/sub based on IDL
>
> At least PUB/SUB can be implemented solely on the transport level. No IDL 
> change necessary.
>
>
> > adding attributes [async] method/interface
>
> async was already a keyword long ago. It's now called "oneway".
>
>
> >  pre-processing the extended IDL with a new pre-processor to generate the
> > representations of the service definitions for both sides in terms of the
> > current IDL
>
> Not sure if I can follow. What exactly would that preprocessor do? Generate
> IDL from IDL?
>
> As a general statement, I would recommend strictly separating an
> interface contract (which the IDL is and should be) from a concrete
> implementation or configuration setup (which the IDL should not be).
>
>
> > [in/out/inout] parameter
>
> There was some discussion about this a while ago, don't remember the ticket.
> Technically that should be possible, since the data are transferred in a
> struct anyway, so multiple out parameters (or even in/out) should really not
> be that hard to implement. I have no idea if we run into subtle problems
> with certain languages and how that could be circumvented (by using structs
> maybe) and if in/out is really supported by all of them on a language level,
> we will find out.
>
>
> Have fun,
> JensG
>
>
> -----Original Message-----
> From: John Dougrez-Lewis
> Sent: Thursday, May 16, 2019 6:29 AM
> To: 'James E. King III'
> Cc: [email protected]
> Subject: RE: THRIFT-66 - Bidirectional communication
>
> Yes, my suggestion is for an enhancement to Thrift to make it bidirectional,
> as follows:
>
>
> 1) use 2 connections, A => B, and B => A using what is already in place.
>
> 2) write, by hand, new supporting choreography code to establish this
> double connection, in the first instance for a single language
>
> 3) extend the IDL to support this - adding attributes [async]
> method/interface, [pub/sub] method/interface, [in/out/inout] parameter
>
> 4) extend the IDL generation to implement this:
>
>       i) pre-processing the extended IDL with a new pre-processor to
> generate the representations of the service definitions for both sides in
> terms of the current IDL
>
>       ii) generate the language-dependent code required to hook up each of
> the two ends.
>
> 5) implement across multiple languages
>
>
> That gets you to the point where Thrift supports and generates
> bidirectional, async & pub/sub based on IDL
>
> Then, optionally:
>
> 6) Having achieved a working implementation defined by an extended IDL,
> consider re-implementing in terms of a single bidirectional transport rather
> than 2 existing unidirectional independent transports.
>
>
> The extended IDL defining a single async/pubsub A<=>B interface is used to
> generate 2 interfaces: an outgoing A=>B and an async response return channel
> B=>A, e.g.:
>
>
> A<=>B
>   [async] returntype functionName (arg1, arg2)
> =>
> A=>B
>   Handletype functionName (arg1, arg2)
> B=>A
>   void functionName (Handletype, returntype)
>
> A<=>B
>   [async] returntype functionName (arg1, arg2, [out] arg3)
> =>
> A=>B
>   Handletype functionName (arg1, arg2)
> B=>A
>   void functionName (Handletype, returntype, arg3)
>
>
> A<=>B
>   [async] returntype functionName (arg1, arg2, [inout] arg3)
> =>
> A=>B
>   Handletype functionName (arg1, arg2, arg3in)
> B=>A
>   void functionName (Handletype, returntype, arg3out)
>
>
>
> -----Original Message-----
> From: James E. King III [mailto:[email protected]]
> Sent: 15 May 2019 19:02
> To: [email protected]
> Cc: James E. King III; [email protected]
> Subject: Re: THRIFT-66 - Bidirectional communication
>
> Re: Visio - There are no "endpoint" implementations that allow for
> bi-directional communication over a single transport connection today.
> Is that the Visio you were referring to?  The only implementation of that
> concept is buried in the THRIFT-66 attachments, and only for C#, and it was
> done on a codebase about 8 years past...
>
> With today's thrift code if you want either end to be a client (make
> requests) or a server (reply to requests) you would need to separately
> instantiate a client or server on each end and have them connect to
> each-other, i.e.
>
> A ---> B (A sends requests to a thrift server on B, B replies, on a
>          transport)
> A <--- B (B sends requests to a thrift server on A, A replies, on a
>          transport separate from the last one)
>
> It sounds like what you'd like to get to is:
>
> A <--> B (A and B can function as a client or as a server over the same
> transport)
>
> Thrift cannot do the latter today, so I would recommend the former, using
> one transport for each direction.
>
> Given they will both be on the same system, if your languages support it,
> unix domain sockets are quite fast.
> Otherwise if you want a shared memory solution you need to write your own
> transport for that.
>
> - Jim
>
> On Wed, May 15, 2019 at 1:12 PM John Dougrez-Lewis <[email protected]>
> wrote:
> >
> > Hi Jim,
> >
> > The "oneway" route looks a bit fragile in the face of failures in the
> > subsequent server-side processing which then cannot be signalled back to
> > the client.
> >
> > The 2-way connection would be the way forward for me, particularly since
> > it would work across multiple languages.
> >
> > My primary use case would be a simple language bridge mechanism for a
> > library, allowing processes coded in one language to call (with
> > potentially asynchronous returns, and pub/sub) into another process
> > hosting a library coded in another language, running (in the first
> > instance) on the same box and communicating via IPC, preferably fast
> > shared memory, but failing that sockets would do.
> >
> > You put a Visio diagram up back in 2010. Is the underlying source code for
> > that available now ?
> >
> > Rather than hand-rolling the 2-way connection setup/teardown and
> > supporting code, for each and every language each time, it would be nice
> > if Framework code for that could be generated automatically from an
> > enhanced and extended version of the IDL.
> >
> > Regards,
> >
> > John
> >
> > -----Original Message-----
> > From: James E. King III [mailto:[email protected]]
> > Sent: 15 May 2019 12:05
> > To: [email protected]; [email protected]
> > Subject: Re: THRIFT-66 - Bidirectional communication
> >
> > Hello!
> >
> > Thrift is still a dedicated client/server model environment where clients
> > can request and servers reply.  The easiest way to make it 2-way today is
> > to open a connection both ways.  If you don't have firewalls in the way
> > then you can do this effectively.  The more difficult and more correct way
> > to do it would be to rewrite the transport layer to use endpoints in which
> > each side can be a client and/or server for any number of services (using
> > TMultiplexedProtocol on top of another protocol, like TBinaryProtocol).
> > This design allows one end to be a "listener", one end to be an
> > "initiator" (starts the connection), and after they connect they are equal
> > peers with the ability to request or reply of each-other.
> >
> > You can approximate asynchronous behavior by exclusively using "oneway"
> > requests in your design.  I'd suggest avoiding use of oneway requests with
> > THttpProtocol varieties however as today there are some issues, since Http
> > transport requires a response to be sent, and "oneway" dictates there is
> > no reply, and most languages do not handle it well right now (there are
> > open backlog issues for this).
> >
> > For a matrix of supported languages, protocols, transports, and server
> > types, see the file LANGUAGES.md at the root of the github repository.
> >
> > Another idea I was toying with a while ago was to add a message bus
> > transport to Thrift which would allow for things like reliable delivery
> > and broadcast semantics but that also does not exist today.
> >
> > - Jim
> >
> > On Wed, May 15, 2019 at 1:05 AM John Dougrez-Lewis <[email protected]>
> > wrote:
> > >
> > > Hi,
> > >
> > >
> > >
> > > I was looking for a mechanism to be able to provide
> > > language-agnostic API support to a hobby project I've been working on
> > > for some time.
> > >
> > >
> > >
> > > By following a trail of papers, books and references, I eventually
> > > came across Apache Thrift and have found and started going through
> > > Randy Abernethy's new book.
> > >
> > >
> > >
> > >
> > >
> > > Essentially what I was looking for was support for asynchronous
> > > calls, and by extension, pub/sub and two way communication across
> > > and between multiple languages over some channel, preferably IPC but in
> > > the worst case sockets.
> > >
> > >
> > >
> > >
> > >
> > > Having read the book, I can see that there is support for basic
> > > synchronous RPC between a client and a server over a significant
> > > number of languages and for just a very few languages, such as java,
> > > some element of support for asynchronous callbacks, and otherwise
> > > one-way methods that do not provide indication of subsequent failure.
> > >
> > >
> > >
> > > It appeared to me one way of extending bi-directional asynchronous
> > > support would be to have the client to set itself up as a server for
> > > the server at the other end to connect to, and then it would just be
> > > a question of choreographing the setting up of a pair of RPC channels.
> > >
> > >
> > >
> > > An asynchronous call could be implemented by providing a synchronous
> > > method that simply immediately returns a handle to the caller, and
> > > the server would then continue to process the call request on a
> > > background threadpool thread on the server, and the async result
> > > would then be signalled by a call from the server back to the client
> > > on the 2nd channel with the handle providing a context to lookup the
> > > result.
> > >
> > >
> > >
> > > Pub/sub would just then be multiple calls from the server back to
> > > the client.
> > >
> > >
> > >
> > > The whole thing could sit on top of the existing unidirectional RPC
> > > implementation and provide full asynchronous calls & pub/sub across
> > > *ALL* supported languages at probably very little additional effort,
> > > with no changes to the existing code.
> > >
> > >
> > >
> > > You could then have a framework that extended the existing IDL to
> > > include decoration with attributes for async & pub/sub methods & in/out
> > > parameters.
> > >
> > >
> > >
> > > This extended IDL could then be pre-processed to generate
> > > client-server and server-client service definitions in the existing
> > > base IDL language, together with generating supporting glue code to
> > > compile to provide the support for hooking up the channels between each
> > > side.
> > >
> > >
> > >
> > > I note that THRIFT-66 was raised 10 years ago, but it looks like the
> > > C# code was never made available for release by Dell.
> > >
> > >
> > >
> > > I have some questions:
> > >
> > >
> > >
> > > > 1)      What is the current state of plans for supporting this sort
> > > > of functionality? What issues have been encountered?
> > >
> > >
> > >
> > > > 2)      Is there a document/spreadsheet somewhere showing a matrix of
> > > > what Transports and Protocols are supported for each language?
> > >
> > >
> > >
> > >
> > >
> > > Regards,
> > >
> > >
> > >
> > > John
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
>
>
