[ 
https://issues.apache.org/jira/browse/THRIFT-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106172#comment-17106172
 ] 

Mario Emmenlauer commented on THRIFT-5186:
------------------------------------------

Hey,

thanks for the feedback! I was also quite surprised to see this problem because 
we run tests on a number of platforms and I can only trigger the issue in a 
dockerized test. Windows, macOS and Linux without Docker work fine for me too.

The failing tests are mostly default Apache Thrift tests that I run unmodified. 
I would assume that the use of port 0 is intentional and should select a random 
free port - at least this is something we also use a lot in production with 
Thrift servers. Also, without your patch these same tests work fine, so there 
seems to be a causal link.
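
For reference, the port 0 behaviour can be illustrated outside of Thrift with a 
plain socket. This is only a minimal sketch, not the Thrift test code: the 
helper name bind_ephemeral is made up for illustration, and it assumes an IPv4 
loopback address is available.

```python
import socket


def bind_ephemeral(host="127.0.0.1"):
    """Bind a TCP listener to port 0 and return (socket, assigned_port).

    Hypothetical helper for illustration; not part of Thrift.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))  # port 0 asks the OS to pick a free port
    srv.listen(1)
    # getsockname() reports the port that was actually assigned
    return srv, srv.getsockname()[1]


if __name__ == "__main__":
    srv, port = bind_ephemeral()
    print("OS assigned port", port)
    srv.close()
```

If the bind itself succeeds like this inside the container, the failure is more 
likely in the name resolution step than in the port selection.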

I assume the issue must be related to the Docker network setup. That would make 
some sense to me, as your PR also specifically mentions cases where hosts may 
have an atypically small network setup, like no IPv4/no IPv6/only loopback. This 
may be true for my Docker containers.

The problem is I'm using a relatively default Docker setup for GitLab Runner, 
so I did not configure the network myself and can't say much about how it's 
configured. Is there something specific that may be helpful for you to know? I 
can run Linux commands in the test setup, albeit with some effort. Would it 
help to list the available devices and the routing table? Or do you have access 
to a Docker container?
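
One thing that could be run inside the container is a small resolver probe that 
calls getaddrinfo() with and without AI_ADDRCONFIG and prints what comes back. 
The following is a rough sketch under assumptions: the function name probe and 
the chosen hosts are illustrative only, and it uses Python's socket module 
rather than the Thrift C++ code path, so results may not match exactly.

```python
import socket


def probe(host, port, flags=0):
    """Resolve host/port and return the sorted list of resolved addresses,
    or an error string if getaddrinfo() fails.

    Hypothetical diagnostic helper; not part of Thrift.
    """
    try:
        infos = socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM, 0, flags
        )
    except socket.gaierror as exc:
        return "error: %s" % exc
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    for host in ("127.0.0.1", "localhost"):
        for label, flags in (("no flags", 0),
                             ("AI_ADDRCONFIG", socket.AI_ADDRCONFIG)):
            print(host, "port 0,", label, "->", probe(host, 0, flags))
```

Comparing the output from the Docker runner with the output from a host where 
the tests pass may show whether the flag changes the result in that network 
setup.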


> AI_ADDRCONFIG: Thrift libraries crash with localhost-only network.
> ------------------------------------------------------------------
>
>                 Key: THRIFT-5186
>                 URL: https://issues.apache.org/jira/browse/THRIFT-5186
>             Project: Thrift
>          Issue Type: Bug
>          Components: C++ - Library, Delphi - Library, Python - Library
>    Affects Versions: 0.13.0
>         Environment: Red Hat Enterprise Linux 8.0
>            Reporter: Max
>            Assignee: Max
>            Priority: Major
>              Labels: getaddrinfo, localhost, sockets
>             Fix For: 0.14.0
>
>         Attachments: 
> 0001-THRIFT-5186-Dont-pass-AI_ADDRCONFIG-to-getaddrinfo.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> THRIFT-2539 has been reported, and fixed — but for win32 only, for no 
> apparent reason. The exact same problem reproduces on POSIX.
> Namely, when no network interfaces besides {{lo}} (the 127.0.0.1 loopback 
> interface) are up, C++ and Python apps linked with Thrift-generated code, 
> both clients and servers — *crash by throwing an exception*. Even when the 
> intention is exactly to run them on localhost only.
> This happens because Thrift library code for TSocket, TServerSocket, 
> TNonblockingServerSocket calls 
> [{{getaddrinfo()}}|http://man7.org/linux/man-pages/man3/getaddrinfo.3.html] 
> to resolve target hostname to connect to/listen on, into concrete IP address 
> (v4 or v6, whichever the system is configured for). To that call, it *passes 
> the {{AI_ADDRCONFIG}} hint* which effectively turns a localhost-only 
> situation into:
> {quote}{{Could not resolve host for client socket.}}
> {quote}
> and into this (server-side):
> {code:java}
> гру 23 13:52:13 localhost.localdomain systemd[1]: db_cache.service: Main 
> process exited, code=dumped, status=6/ABRT
> гру 23 13:52:13 localhost.localdomain systemd[1]: db_cache.service: Failed 
> with result 'core-dump'.
> гру 23 13:52:17 localhost.localdomain db_cache[12912]: Thrift: Mon Dec 23 
> 13:52:15 2019 TSocket::open() getaddrinfo() <Host: 127.0.0.1 Port: 
> 1302>Address family for hostname not supported
> гру 23 13:52:17 localhost.localdomain db_cache[12912]: Thrift: Mon Dec 23 
> 13:52:15 2019 TSocket::open() getaddrinfo() <Host: 127.0.0.1 Port: 
> 8345>Address family for hostname not supported
> гру 23 13:52:17 localhost.localdomain db_cache[12912]: Thrift: Mon Dec 23 
> 13:52:15 2019 TNonblocking: using dedicated listener thread, io threads: 16
> гру 23 13:52:17 localhost.localdomain db_cache[12912]: Thrift: Mon Dec 23 
> 13:52:15 2019 getaddrinfo -9: Address family for hostname not supported
> гру 23 13:52:17 localhost.localdomain db_cache[12912]: terminate called after 
> throwing an instance of 'apache::thrift::transport::TTransportException'
> гру 23 13:52:17 localhost.localdomain db_cache[12912]:   what():  Could not 
> resolve host for server socket.
> {code}
> I fail to understand the original reason to pass that {{AI_ADDRCONFIG}} hint. 
> It shouldn't be there as I see it.
> Further, since Thrift 0.9.2, windows builds of thrift apps don't pass that 
> hint anymore (see THRIFT-2539), and it seems to be okay.
> For comprehension, I'm attaching a sample patch to remove {{AI_ADDRCONFIG}} 
> from {{lib/cpp}} and {{lib/py}}. The main change will be landing via GitHub, 
> per Thrift's contribution process, so please follow there too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
