[ https://issues.apache.org/jira/browse/ARROW-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979172#comment-16979172 ]
Chengxin Ma commented on ARROW-7200:
------------------------------------

gRPC verbose logging shows:
{code:java}
ubuntu@ip-172-31-3-53:~/arrow/cpp/for_flight/debug$ env GRPC_VERBOSITY=debug GRPC_TRACE=subchannel ./arrow-flight-benchmark -server_host 172.31.11.18
Using remote server: true
Testing method: DoGet
Server host: 172.31.11.18
Server port: 31337
...
I1121 09:28:58.051069998 14713 subchannel.cc:960] Connect failed: {"created":"@1574328538.050990277","description":"Failed to connect to remote host: Connection refused","errno":111,"file":"src/core/lib/iomgr/tcp_client_posix.cc","file_line":207,"os_error":"Connection refused","syscall":"connect","target_address":"ipv6:[::1]:31337"}
I1121 09:28:58.051209019 14713 subchannel.cc:960] Connect failed: {"created":"@1574328538.051185929","description":"Failed to connect to remote host: Connection refused","errno":111,"file":"src/core/lib/iomgr/tcp_client_posix.cc","file_line":207,"os_error":"Connection refused","syscall":"connect","target_address":"ipv4:127.0.0.1:31337"}
...
{code}
It seems that the client tried to connect to a loopback address ({{::1}} and {{127.0.0.1}}) rather than to the server at 172.31.11.18. That is bound to fail, since the server is not running on the client host.

The problem is perhaps in the {{RunPerformanceTest}} function of [flight_benchmark.cc|https://github.com/apache/arrow/blob/maint-0.15.x/cpp/src/arrow/flight/flight_benchmark.cc].
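To make the reading of the log above concrete, here is a minimal sketch that classifies a {{target_address}} value as loopback or not. {{IsLoopbackTarget}} is a hypothetical helper of mine, not part of gRPC or Arrow, and it assumes only the two address shapes that appear in the log ({{ipv4:HOST:PORT}} and {{ipv6:[HOST]:PORT}}).

```cpp
#include <string>

// Hypothetical helper (not part of gRPC or Arrow): reports whether a
// `target_address` string from the gRPC log names a loopback address.
// Assumes only the two shapes seen in the log:
//   "ipv4:HOST:PORT" and "ipv6:[HOST]:PORT".
bool IsLoopbackTarget(const std::string& target) {
  const std::string v4 = "ipv4:";
  const std::string v6 = "ipv6:[";

  if (target.compare(0, v4.size(), v4) == 0) {
    // The host sits between the prefix and the last ':' (the port separator).
    const auto port_sep = target.rfind(':');
    if (port_sep == std::string::npos || port_sep < v4.size()) return false;
    const std::string host = target.substr(v4.size(), port_sep - v4.size());
    // The whole 127.0.0.0/8 block is reserved for loopback.
    return host.compare(0, 4, "127.") == 0;
  }

  if (target.compare(0, v6.size(), v6) == 0) {
    // The host sits between '[' and the matching ']'.
    const auto close = target.find(']', v6.size());
    if (close == std::string::npos) return false;
    return target.substr(v6.size(), close - v6.size()) == "::1";
  }

  return false;  // Unknown shape: do not guess.
}
```

Both addresses in the log are classified as loopback by this check, whereas an address built from the server's real IP would not be.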
At [line 209|https://github.com/apache/arrow/blob/maint-0.15.x/cpp/src/arrow/flight/flight_benchmark.cc#L209] the client connects to the server again, but this time it uses {{localhost}} instead of the server's IP address. If I add
{code:java}
// Print each location listed in the endpoint.
for (const auto& location : endpoint.locations) {
  std::cout << location.ToString() << std::endl;
}
{code}
after [line 233|https://github.com/apache/arrow/blob/maint-0.15.x/cpp/src/arrow/flight/flight_benchmark.cc#L233] and run the client again, it shows:
{code}
ubuntu@ip-172-31-3-53:~/arrow/cpp/for_flight/debug$ ./arrow-flight-benchmark --server_host 172.31.11.18
Using remote server: true
Testing method: DoGet
Server host: 172.31.11.18
Server port: 31337
grpc+tcp://localhost:31337
grpc+tcp://localhost:31337
grpc+tcp://localhost:31337
grpc+tcp://localhost:31337
Failed with error: << IOError: gRPC returned unavailable error, with message: Connect Failed. Detail: Unavailable
{code}

> Running Arrow Flight benchmark on two hosts doesn't work
> --------------------------------------------------------
>
>                 Key: ARROW-7200
>                 URL: https://issues.apache.org/jira/browse/ARROW-7200
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Benchmarking, C++, FlightRPC
>    Affects Versions: 0.15.0, 0.15.1
>         Environment: AWS EC2
> Instance type: t3a.xlarge
> AMI: ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20191002
> Number of instances: 2
> They are capable of pinging each other.
>            Reporter: Chengxin Ma
>            Priority: Major
>         Attachments: Screen Shot 2019-11-18 at 16.00.38.png, Screen Shot
> 2019-11-19 at 14.41.40.png
>
>
> I was trying to evaluate the performance of Apache Arrow Flight on two hosts
> (one as the client and the other one as the server), using [the official
> benchmark|https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc].
> Flags I used to build the project were:
>
> {code:java}
> -DARROW_FLIGHT=ON
> -DCMAKE_BUILD_TYPE=Debug
> -DARROW_BUILD_BENCHMARKS=ON
> {code}
>
> The branch I used was maint-0.15.x since there was a build error on the
> master branch. _(The build error on master only occurred in the environment
> where I set up the two hosts: AWS. In my local environment (macOS) the build
> was successful on the master branch. I don't think this build error is
> relevant to the issue since there is no difference in the cpp source code.)_
> On the host acting as the server, I ran
> {code:java}
> ./arrow-flight-perf-server{code}
> On the host acting as the client, I ran
> {code:java}
> ./arrow-flight-benchmark --server_host ip-172-31-11-18{code}
> It gives the following error:
> {code:java}
> Failed with error: << IOError: gRPC returned unavailable error, with message:
> Connect Failed. Detail: Unavailable{code}
>
> If I run
> {code:java}
> ./arrow-flight-benchmark --server_host ip-172-31-11-17{code}
> the error is different:
> {code:java}
> IOError: Server was not available after 10 attempts{code}
> This is understandable since this host doesn't exist at all.
> This indicates that Flight is able to find the existing host
> (ip-172-31-11-18), but the communication somehow doesn't succeed.
> The benchmark works fine if I run it against localhost, either by not
> specifying the server_host flag or by running the server in another process
> on the same host.
> I am not sure if the problem is in the environment or in the code itself.
> Could someone please give me a hint on how to resolve the problem?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)