Hi,

attached is a patch that I believe is correct. It did fix a bug I was seeing, but I'm not sure it's the right thing to do, since this code has been stable since 2009.
What I was seeing was a scenario where an LDAP server was listening on port 389, so we were able to connect to it with a socket, but then all searches timed out during rootDSE discovery.

What happened in the sdap_id_ops.c code was that we hit this part:

         switch (retval) {
         case EIO:
    ---> case ETIMEDOUT:
             /* this currently the only possible communication error after connection is established */
             communication_error = true;
             break;

         default:
             communication_error = false;
             break;
         }

And then we went here, because the connection was already established:

         if (communication_error && current_conn != 0
                 && current_conn == op->conn_cache->cached_connection) {
             /* do not reuse failed connection */
             op->conn_cache->cached_connection = NULL;

             DEBUG(SSSDBG_FUNC_DATA,
                   "communication error on cached connection, moving to next server\n");
    --->     be_fo_try_next_server(op->conn_cache->id_conn->id_ctx->be,
                                   op->conn_cache->id_conn->service->name);
         }

But I admit I don't understand why be_fo_try_next_server() sets the port status to NEUTRAL. That caused the connection code to run again, hit the same timeout and then cycle again and again.

Can anyone tell from the code why we set the port to NEUTRAL instead of NOT_WORKING in be_fo_try_next_server()?
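To make the suspected loop concrete, here is a minimal stand-alone sketch. It is not the SSSD code: the struct, pick_server() and count_attempts_on_bad_server() are hypothetical names, and it assumes that server selection skips only servers marked PORT_NOT_WORKING. The real fail over code has more state (retry timeouts, server ordering), so treat this purely as an illustration of the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the SSSD data structures. */
enum port_status { PORT_NEUTRAL, PORT_WORKING, PORT_NOT_WORKING };

struct server {
    enum port_status port_status;
    int searches_time_out; /* 1: TCP connect works, but every search times out */
};

/* Assumption: selection returns the first server not marked NOT_WORKING. */
static int pick_server(const struct server *servers, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (servers[i].port_status != PORT_NOT_WORKING) {
            return (int)i;
        }
    }
    return -1;
}

/* Run up to max_attempts operations and count how many of them landed on
 * the broken server. status_on_failure is the status that the modelled
 * "try next server" step assigns to a port that was PORT_WORKING. */
static int count_attempts_on_bad_server(enum port_status status_on_failure,
                                        int max_attempts)
{
    struct server servers[2] = {
        { PORT_NEUTRAL, 1 }, /* accepts the connection, searches time out */
        { PORT_NEUTRAL, 0 }, /* healthy server */
    };
    int bad_attempts = 0;

    assert(max_attempts > 0);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        int idx = pick_server(servers, 2);
        if (idx < 0) {
            break; /* nothing left to try */
        }

        /* The socket connect succeeds, so the port is considered working. */
        servers[idx].port_status = PORT_WORKING;

        if (!servers[idx].searches_time_out) {
            break; /* the operation succeeded on a healthy server */
        }

        /* Search timed out: communication error on an established
         * connection, so the "try next server" step runs. */
        bad_attempts++;
        if (servers[idx].port_status == PORT_WORKING) {
            servers[idx].port_status = status_on_failure;
        }
    }
    return bad_attempts;
}
```

In this model, calling count_attempts_on_bad_server(PORT_NEUTRAL, 5) keeps selecting the broken server until the attempt limit is reached, while count_attempts_on_bad_server(PORT_NOT_WORKING, 5) moves on to the second server after a single failure.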
From 37806e08b5bc7a972466abd26f843006a0e2513e Mon Sep 17 00:00:00 2001
From: Jakub Hrozek <jhro...@redhat.com>
Date: Tue, 10 May 2016 14:11:36 +0200
Subject: [PATCH] FO: Set port to NOT_WORKING when trying a next server

---
 src/providers/fail_over.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/providers/fail_over.c b/src/providers/fail_over.c
index e945c9924597c7addeeb11090e1c1aee5596cb71..1d88d2aa54bfdebd4b648e2b13fa8d03e2be3973 100644
--- a/src/providers/fail_over.c
+++ b/src/providers/fail_over.c
@@ -1546,7 +1546,7 @@ void fo_try_next_server(struct fo_service *service)
     service->active_server = 0;

     if (server->port_status == PORT_WORKING) {
-        server->port_status = PORT_NEUTRAL;
+        server->port_status = PORT_NOT_WORKING;
     }
 }

-- 
2.4.11