[
https://issues.apache.org/jira/browse/THRIFT-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333631#comment-16333631
]
Buğra Gedik edited comment on THRIFT-4465 at 1/21/18 7:09 PM:
--------------------------------------------------------------
[~jking3]: While removing the {{workSocket()}} call fixes the problem for
regular sockets, it breaks SSL-based sockets. It turns out the SSL sockets
work around the spurious exceptions with the following code:
{code:java}
} catch (TTransportException& te) {
  // In Nonblocking SSLSocket some operations need to be retried again.
  // Current approach is parsing exception message, but a better solution
  // needs to be investigated.
  if (!strstr(te.what(), "retry")) {
    GlobalOutput.printf("TConnection::workSocket(): %s", te.what());
    close();
    return;
  }
{code}
This looks wrong on so many levels, if you ask me. Handling retries with an
exception is a performance killer. Then there is the issue of searching for a
message string in the exception. In any case, the original code before the SSL
changes assumed that you only consume from a socket once it is ready for
reading, so you never get an exception. By uncommenting the premature
workSocket() call, the SSL PR changed this and handled the resulting exception
with the above code, but only for the SSL setup, and not for regular sockets!
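To illustrate why reading before the socket is ready produces the error, here
is a minimal POSIX-only sketch. It is not Thrift code: {{ReadOutcome}},
{{setNonBlocking}}, {{tryRead}} and {{demoPrematureRead}} are hypothetical
names, and a Unix-domain socket pair stands in for a client connection.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical classification of a single non-blocking read attempt.
enum class ReadOutcome { Data, WouldBlock, Closed, Error };

// Put a descriptor into non-blocking mode; returns false on failure.
bool setNonBlocking(int fd) {
  int flags = fcntl(fd, F_GETFL, 0);
  if (flags == -1) return false;
  return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Attempt one read and classify the result instead of throwing.
ReadOutcome tryRead(int fd, char* buf, size_t len) {
  ssize_t n = recv(fd, buf, len, 0);
  if (n > 0) return ReadOutcome::Data;
  if (n == 0) return ReadOutcome::Closed;
  if (errno == EAGAIN || errno == EWOULDBLOCK) return ReadOutcome::WouldBlock;
  return ReadOutcome::Error;
}

// Read once before the peer has written anything ("premature" read) and
// once after. Returns false if the socket pair could not be set up.
bool demoPrematureRead(ReadOutcome* early, ReadOutcome* late) {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return false;
  bool ok = setNonBlocking(fds[0]);
  char buf[16];
  if (ok) {
    *early = tryRead(fds[0], buf, sizeof(buf));  // nothing has been sent yet
    ok = (send(fds[1], "ping", 4, 0) == 4);
    *late = tryRead(fds[0], buf, sizeof(buf));   // data is now queued
  }
  close(fds[0]);
  close(fds[1]);
  return ok;
}
```

In this sketch the early read reports {{WouldBlock}} (EAGAIN) and the later
one reports {{Data}}, which is the same pattern as reading from a freshly
accepted connection before the event loop has signalled readability.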
I'll see if I can figure out:
1) Why does the SSL server depend on the premature workSocket() call?
2) Can the SSL server be changed to work without having to perform retries with
an exception?
I don't see Divya Thaluru, the original author of the SSL extensions, in the
system.
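As a rough sketch of what question 2 could look like, the read path might
report "retry" as a status value rather than through an exception message.
Everything below is hypothetical ({{IoStatus}}, {{TransportError}},
{{classifyByMessage}}, {{readWithStatus}}, {{shouldClose}}); none of it is an
existing Thrift API, and it only contrasts the two policies.

```cpp
#include <cstring>
#include <stdexcept>

// Hypothetical names throughout; none of these are Thrift APIs.
enum class IoStatus { Ok, Retry, Fatal };

// Hypothetical exception type standing in for a transport error.
struct TransportError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// String-matching policy, mirroring the strstr() check quoted above:
// any message containing "retry" is treated as retryable.
IoStatus classifyByMessage(const std::exception& e) {
  return std::strstr(e.what(), "retry") != nullptr ? IoStatus::Retry
                                                   : IoStatus::Fatal;
}

// Alternative: the read path reports its state as a value, so the common
// "not ready yet" case involves no throw and no message parsing.
IoStatus readWithStatus(bool dataReady, bool connectionBroken) {
  if (connectionBroken) return IoStatus::Fatal;
  return dataReady ? IoStatus::Ok : IoStatus::Retry;
}

// Caller-side decision: close the connection only on a fatal status.
bool shouldClose(IoStatus s) { return s == IoStatus::Fatal; }
```

With the string-matching policy, a message such as "retry limit exceeded,
giving up" would be classified as retryable merely because it contains the
word "retry", whereas the status value cannot be misread that way.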
> TNonblockingServer throwing THRIFT LOGGER: TConnection::workSocket():
> THRIFT_EAGAIN (unavailable resources)
> -----------------------------------------------------------------------------------------------------------
>
> Key: THRIFT-4465
> URL: https://issues.apache.org/jira/browse/THRIFT-4465
> Project: Thrift
> Issue Type: Bug
> Components: C++ - Library
> Affects Versions: 0.11.0
> Reporter: Buğra Gedik
> Priority: Critical
>
> Once I upgraded to 0.11.0, I'm getting the following error occasionally:
> THRIFT LOGGER: TConnection::workSocket(): THRIFT_EAGAIN (unavailable
> resources)
> I tracked this to the following change:
> [https://github.com/apache/thrift/commit/808d143245f4f5c30600fab31cf9db854cbf5b48#diff-fe8fec8ec38ea35df64cfcc305e3ab08]
>
> {code:java}
> // Work the socket right away
> - // workSocket();
> + workSocket();
> {code}
> While adding SSL support, @dthaluru re-activated the above line. From my
> own testing, this causes occasional THRIFT_EAGAIN exceptions. It seems that
> workSocket() is called too early, so the socket gets a read call in
> non-blocking mode before it has any data.