[
https://issues.apache.org/jira/browse/THRIFT-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305990#comment-14305990
]
Randy Abernethy commented on THRIFT-2932:
-----------------------------------------
Hey Andrew,
Thanks for the consolidated patch. Several nice improvements here. I am
concerned about the new sequence id modification though. It propagates the
sequencing data into the transport. This may seem reasonable in isolation but
the transport should not be involved at this level of abstraction.
In Apache Thrift, Protocols are responsible for the physical layout of bits (on
the wire, on disk or in memory). This includes where, how and if sequence
numbers are packaged. Transports are responsible for transmission of the data
only. This may involve packetizing, framing, encrypting or what have you, but
the key is that the Protocol and the Transport are isolated. The Transport
receives/returns an opaque block of bits from/to the Protocol.
The current node implementation is not clean in this regards and ultimately
needs some structural refactoring. For example the multiplexing read
implementation is a hack (paying no attention to the service name in the
payload and rather wiring everything to seqid lookups). I think we should try
to migrate things in the direction of the generalized Apache Thrift model. This
is non-trivial due to the pure async approach of JavaScript (a late model
option in the reference CPP and Java implementations).
The heart of the node problem here is that Apache Thrift is a layered system.
Lower layers provide services to upper layers and services do what you tell
them, they don’t call you with demands. The node implementation wires callbacks
from the bottom up (with several call chains back down from proto to trans in
the process). These conflicting approaches are merged with unsatisfactory
results in my opinion. To respect the Apache Thrift layered approach the “data”
event needs to be handled by the client (or by a much more focused connection
helper) and this piece of code needs to then call down the layered stack like
all other languages (client->proto->trans) for reads or writes.
In short, getting the node client side call return in order is a fair size
chunk of work and should fall into a separate issue.
Specific comments (line nos post patch):
----------------------------------------------------------
* lib/nodejs/lib/thrift/http_connection.js LINE 157: This is an exotic case
where the client does not implement a receive handler for the server message.
An extreme edge case that by definition can have no callback, so emit error is
probably the only reasonable out. Also keep in mind that prior to deserializing
at the protocol level we have no idea what the transmitted seqid is. Without a
seqid we cannot know the callback to invoke.
* lib/nodejs/lib/thrift/http_connection.js LINE 184: Here we are communicating
normal application behavior using exceptions, which always smells a bit (this
is not your code but it is making things murky here). If a real exception
arises here it is appropriately scoped to the client rather than the call. It
will be something like “the connection failed”. Call context exceptions should
be transmitted by the protocol, received successfully around line 145 and
presented to the callback in the err position. Anything else should be handled
at a higher level of abstraction. If I am writing a client I don’t want to have
to handle call failure in every callback, that logic should be provided in a
single error event handler across all calls.
* lib/nodejs/lib/thrift/thrift.js LINE 148: This appears to be an in house
utility, also "self.called" never appears to be set anywhere.
If you are fixing a specific problem in this code or if I am missing the point
let me know and maybe give me an example case to look over. In the meantime I’m
going to patch commit the hunks that are clear improvements here (nextTick
additions and the like). You can close this ticket and open new specific
tickets for individual issues not handled here or we can carry on here, up to
you. I will attach the partial patch committed.
Best,
Randy
> Node.js Thrift connection libraries throw Exceptions into event emitter
> -----------------------------------------------------------------------
>
> Key: THRIFT-2932
> URL: https://issues.apache.org/jira/browse/THRIFT-2932
> Project: Thrift
> Issue Type: Bug
> Components: Node.js - Library
> Affects Versions: 0.9.2
> Reporter: Tom Croucher
> Assignee: Randy Abernethy
> Priority: Critical
> Attachments:
> 0001-nodejs-exceptions-to-callbacks-instead-of-throws-int.patch
>
>
> I've been investigating using the Node.js client in a project however it
> seems like there are instances which don't follow Node.js best practices.
> In particular http_connection.js and connection.js throw errors during
> callbacks. This is considered an anti-pattern in Node because it both removes
> the Exception from the context of the callback making it hard to associate
> with a request as well as throwing it in the context of the EventEmitter code
> which can cause inconsistencies in the Node process.
> This means under some error conditions an uncaught exception would be thrown
> or at least an 'error' event on the singleton client (again removing it from
> the request context).
> Both transport receivers share the same copy-pasta code which contains:
> {code:javascript}
> catch (e) {
> if (e instanceof ttransport.InputBufferUnderrunError) {
> transport_with_data.rollbackPosition();
> }
> else {
> throw e;
> }
> }
> {code}
> I'm working on a patch, but I'm curious about some of the history of the
> code. In particular the exception based loop flow control and the using the
> seqid to track the callback which makes it hard to properly associate it with
> exception handling.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)