Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23466 )

Change subject: IMPALA-14440: Set default http_socket_timeout_s
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/23466/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23466/1//COMMIT_MSG@9
PS1, Line 9: Sets a default http_socket_timeout_s of 600s
> The more I dig into this, the less I like this option.
>1. Implement HTTP Keep-Alive so we re-use the same TCP connection

I have implemented it in a WIP change, but I think that it is slightly broken
https://gerrit.cloudera.org/#/c/23226/
The problem is that it has no keepalive for TCP, so it can timeout, and I tried 
handling this with retries, but this can be problematic if the issue comes at 
the middle of a non-idempotent rpc request.

So I agree that we need both http and tcp keepalive to avoid needing to retry 
in case of dead connections.

I don't see a documented way to do this with the http_client module. switching 
to requests should make this simpler as it pools connections for sessions: 
https://requests.readthedocs.io/en/latest/user/advanced/
Found an example in the comment at the bottom of this blog:
https://blog.panagiks.com/2019/05/python-tcp-keepalive-on-http-request.html

This is a bit bigger change, but I think that the right direction is to drop 
python 2 support -> switch to requests/urllib3 -> implement http/tcp keepalive

Keeping tcp connections alive forever can be also problematic, as it can lock 
resources for already finished sessions that were just not closed properly. My 
understanding is that connection pooling from urllib3 does this properly and 
keeps connections only for a limited time and then reestablish tcp if they 
timed out.



--
To view, visit http://gerrit.cloudera.org:8080/23466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09f62188552caf65786f57a190722bfe020fc82d
Gerrit-Change-Number: 23466
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Comment-Date: Mon, 06 Oct 2025 12:50:26 +0000
Gerrit-HasComments: Yes

Reply via email to