[
https://issues.apache.org/jira/browse/THRIFT-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106191#comment-17106191
]
Max commented on THRIFT-5186:
-----------------------------
Also, I'd like to inspect the setup of the Docker environment, to see whether
the cause of the issue is rooted there.
*TServerIntegrationTest* — which you mention is failing in that environment —
is passing in my local build on 55680af.
{code}
ulidtko@pasocon ~/s/thrift (master)> BUILD/bin/TServerIntegrationTest -r detailed
Running 13 test cases...
Test module "TServerIntegrationTest" has passed with:
13 test cases out of 13 passed
21 assertions out of 21 passed
Test suite "constructors" has passed with:
10 test cases out of 10 passed
10 assertions out of 10 passed
Test case "constructors/test_simple_factory" has passed with:
1 assertion out of 1 passed
Test case "constructors/test_simple" has passed with:
1 assertion out of 1 passed
Test case "constructors/test_threaded_factory" has passed with:
1 assertion out of 1 passed
Test case "constructors/test_threaded" has passed with:
1 assertion out of 1 passed
Test case "constructors/test_threaded_bound" has passed with:
1 assertion out of 1 passed
Test case "constructors/test_threaded_stress" has passed with:
1 assertion out of 1 passed
Test case "constructors/test_threadpool_factory" has passed with:
1 assertion out of 1 passed
Test case "constructors/test_threadpool" has passed with:
1 assertion out of 1 passed
Test case "constructors/test_threadpool_bound" has passed with:
1 assertion out of 1 passed
Test case "constructors/test_threadpool_stress" has passed with:
1 assertion out of 1 passed
Test suite "TServerIntegrationTest" has passed with:
3 test cases out of 3 passed
11 assertions out of 11 passed
Test case "TServerIntegrationTest/test_stop_with_interruptable_clients_connected" has passed with:
2 assertions out of 2 passed
Test case "TServerIntegrationTest/test_stop_with_uninterruptable_clients_connected" has passed with:
1 assertion out of 1 passed
Test case "TServerIntegrationTest/test_concurrent_client_limit" has passed with:
8 assertions out of 8 passed
{code}
I've run this 10 times and got 10 consecutive passes.
Likewise for *BUILD/bin/UnitTests*, which actually do call listen on
{{("localhost", 0)}} (see {{lib/cpp/test/TServerSocketTest.cpp}}): they pass
every time for me. Moreover, {{TServerSocket("257.258.259.260", 0)}} correctly
raises TTransportException, on line 47 of TServerSocketTest.cpp.
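For anyone unfamiliar with listening on {{("localhost", 0)}}: port 0 asks the kernel to pick a free ephemeral port. Below is a minimal sketch of that pattern using plain Python sockets. It is not the Thrift test code, and the helper name {{listen_on_ephemeral_port}} is made up for illustration.

```python
import socket


def listen_on_ephemeral_port(host="localhost"):
    """Bind a TCP listener to port 0 and return (socket, assigned_port).

    Hypothetical helper for illustration; it does not use Thrift.
    """
    last_error = None
    # No AI_ADDRCONFIG here: only AI_PASSIVE, as a listening socket would want.
    candidates = socket.getaddrinfo(
        host, 0, socket.AF_UNSPEC, socket.SOCK_STREAM, 0, socket.AI_PASSIVE
    )
    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as exc:
            last_error = exc
            continue
        try:
            sock.bind(sockaddr)
            sock.listen(1)
        except OSError as exc:
            sock.close()
            last_error = exc
            continue
        # getsockname() reports the port the kernel actually assigned.
        return sock, sock.getsockname()[1]
    raise last_error or OSError("no usable address for " + host)


if __name__ == "__main__":
    listener, port = listen_on_ephemeral_port()
    print("listening on port", port)
    listener.close()
```

The caller is responsible for closing the returned socket.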
Travis doesn't run these unit tests (?!?), at least it seems so from the build
logs: https://travis-ci.org/github/apache/thrift/builds/680570631
This difference in behavior makes me think that the failures are caused not by
the patch but by an environment difference. [~emmenlau] please tell me which
scripts/steps to follow to reproduce it, and I can look into that.
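To compare environments quickly, a small probe along these lines could be run both locally and inside the container. It is only a sketch: the function name {{probe}} is made up, and the result of the {{AI_ADDRCONFIG}} call depends on which interfaces are configured, so no particular output is implied.

```python
import socket


def probe(host, port, flags=0):
    """Resolve (host, port) for a TCP socket with the given getaddrinfo flags.

    Returns a sorted list of address family names on success,
    or a string starting with "error:" if resolution fails.
    """
    try:
        infos = socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM, 0, flags
        )
    except socket.gaierror as exc:
        return "error: {}".format(exc)
    return sorted({socket.AddressFamily(info[0]).name for info in infos})


if __name__ == "__main__":
    # Same lookup twice: once without hints, once with AI_ADDRCONFIG.
    print("no flags:     ", probe("127.0.0.1", 0))
    print("AI_ADDRCONFIG:", probe("127.0.0.1", 0, socket.AI_ADDRCONFIG))
```

If the two lines differ in one environment but not in the other, that would point at the network configuration rather than at the patch.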
> AI_ADDRCONFIG: Thrift libraries crash with localhost-only network.
> ------------------------------------------------------------------
>
> Key: THRIFT-5186
> URL: https://issues.apache.org/jira/browse/THRIFT-5186
> Project: Thrift
> Issue Type: Bug
> Components: C++ - Library, Delphi - Library, Python - Library
> Affects Versions: 0.13.0
> Environment: Red Hat Enterprise Linux 8.0
> Reporter: Max
> Assignee: Max
> Priority: Major
> Labels: getaddrinfo, localhost, sockets
> Fix For: 0.14.0
>
> Attachments:
> 0001-THRIFT-5186-Dont-pass-AI_ADDRCONFIG-to-getaddrinfo.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> THRIFT-2539 has been reported, and fixed — but for win32 only, for no
> apparent reason. The exact same problem reproduces on POSIX.
> Namely, when no network interfaces besides {{lo}} (the 127.0.0.1 loopback
> interface) are up, C++ and Python apps linked with Thrift-generated code,
> both clients and servers — *crash by throwing an exception*. Even when the
> intention is exactly to run them on localhost only.
> This happens because Thrift library code for TSocket, TServerSocket, and
> TNonblockingServerSocket calls
> [{{getaddrinfo()}}|http://man7.org/linux/man-pages/man3/getaddrinfo.3.html]
> to resolve the target hostname (to connect to or listen on) into a concrete
> IP address (v4 or v6, whichever the system is configured for). To that call,
> it *passes the {{AI_ADDRCONFIG}} hint*, which effectively turns a
> localhost-only situation into:
> {quote}{{Could not resolve host for client socket.}}
> {quote}
> and into this (server-side):
> {code:java}
> гру 23 13:52:13 localhost.localdomain systemd[1]: db_cache.service: Main process exited, code=dumped, status=6/ABRT
> гру 23 13:52:13 localhost.localdomain systemd[1]: db_cache.service: Failed with result 'core-dump'.
> гру 23 13:52:17 localhost.localdomain db_cache[12912]: Thrift: Mon Dec 23 13:52:15 2019 TSocket::open() getaddrinfo() <Host: 127.0.0.1 Port: 1302>Address family for hostname not supported
> гру 23 13:52:17 localhost.localdomain db_cache[12912]: Thrift: Mon Dec 23 13:52:15 2019 TSocket::open() getaddrinfo() <Host: 127.0.0.1 Port: 8345>Address family for hostname not supported
> гру 23 13:52:17 localhost.localdomain db_cache[12912]: Thrift: Mon Dec 23 13:52:15 2019 TNonblocking: using dedicated listener thread, io threads: 16
> гру 23 13:52:17 localhost.localdomain db_cache[12912]: Thrift: Mon Dec 23 13:52:15 2019 getaddrinfo -9: Address family for hostname not supported
> гру 23 13:52:17 localhost.localdomain db_cache[12912]: terminate called after throwing an instance of 'apache::thrift::transport::TTransportException'
> гру 23 13:52:17 localhost.localdomain db_cache[12912]: what(): Could not resolve host for server socket.
> {code}
> I fail to understand the original reason for passing that {{AI_ADDRCONFIG}}
> hint. As I see it, it shouldn't be there.
> Further, since Thrift 0.9.2, Windows builds of Thrift apps no longer pass that
> hint (see THRIFT-2539), and that seems to work fine.
> For reference, I'm attaching a sample patch that removes {{AI_ADDRCONFIG}}
> from {{lib/cpp}} and {{lib/py}}. The main change will land via GitHub,
> per Thrift's contribution process, so please follow it there too.