Hi,

attached is a patch that I think is correct, and it did fix a bug I was
seeing, but I'm not sure it's the right thing to do, since this code
has been stable since 2009.

What I was seeing was a scenario where an LDAP server was listening on port
389, so we were able to connect to it with a socket, but then all searches
timed out during rootDSE discovery. In the sdap_id_ops.c code, we then hit
this part:

    switch (retval) {
        case EIO:
--->    case ETIMEDOUT:
            /* this currently the only possible communication error after connection is established */
            communication_error = true;
            break;

        default:
            communication_error = false;
            break;
    }

And then we went here, because the connection was already established:

    if (communication_error && current_conn != 0
            && current_conn == op->conn_cache->cached_connection) {
        /* do not reuse failed connection */
        op->conn_cache->cached_connection = NULL;

        DEBUG(SSSDBG_FUNC_DATA,
              "communication error on cached connection, moving to next server\n");
---->   be_fo_try_next_server(op->conn_cache->id_conn->id_ctx->be,
                              op->conn_cache->id_conn->service->name);
    }

But I admit I don't understand why be_fo_try_next_server() sets the
port status to NEUTRAL. That caused the connection code to run again,
hit the same timeout, and then cycle again and again.
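
To illustrate the loop, here is a minimal stand-alone sketch. It is not the
SSSD code: the toy_* names and struct layout are made up for illustration,
and it assumes the server selector simply returns the first server whose
port is not marked NOT_WORKING, which is a simplification of what
fail_over.c really does. Only the PORT_* names come from the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types are hypothetical, not the ones in fail_over.c. */
enum port_status { PORT_NEUTRAL, PORT_WORKING, PORT_NOT_WORKING };

struct toy_server {
    enum port_status port_status;
    int searches_time_out; /* 1: TCP connect works, but every search times out */
};

/* Assumption: the selector returns the first server not marked NOT_WORKING. */
static int toy_pick_server(const struct toy_server *servers, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (servers[i].port_status != PORT_NOT_WORKING) {
            return (int)i;
        }
    }
    return -1;
}

/*
 * Run at most max_attempts connection attempts against two servers, where
 * the first one accepts connections but times out on every search.
 * status_on_failure is what the "try next server" step writes into a port
 * that was PORT_WORKING. Returns the number of attempts made and stores the
 * index of the server that finally answered in *winner (-1 if none did).
 */
static int toy_simulate(enum port_status status_on_failure,
                        int max_attempts, int *winner)
{
    struct toy_server servers[] = {
        { PORT_NEUTRAL, 1 }, /* listens on 389, searches time out */
        { PORT_NEUTRAL, 0 }, /* healthy server */
    };
    size_t count = sizeof(servers) / sizeof(servers[0]);
    int attempts = 0;

    assert(winner != NULL && max_attempts > 0);
    *winner = -1;

    while (attempts < max_attempts) {
        int idx = toy_pick_server(servers, count);
        if (idx < 0) {
            break; /* every server is marked NOT_WORKING */
        }
        attempts++;

        /* The socket connect succeeds, so the port is marked working. */
        servers[idx].port_status = PORT_WORKING;

        if (!servers[idx].searches_time_out) {
            *winner = idx;
            break;
        }

        /* Search timed out: mimic the check in fo_try_next_server(). */
        if (servers[idx].port_status == PORT_WORKING) {
            servers[idx].port_status = status_on_failure;
        }
    }
    return attempts;
}
```

Under these assumptions, toy_simulate(PORT_NEUTRAL, 5, &winner) spends all
5 attempts on the first server and leaves winner at -1, while
toy_simulate(PORT_NOT_WORKING, 5, &winner) reaches the second server on
attempt 2.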

Can anyone tell from the code why we set the port to NEUTRAL instead
of NOT_WORKING in be_fo_try_next_server()?

From 37806e08b5bc7a972466abd26f843006a0e2513e Mon Sep 17 00:00:00 2001
From: Jakub Hrozek <jhro...@redhat.com>
Date: Tue, 10 May 2016 14:11:36 +0200
Subject: [PATCH] FO: Set port to NOT_WORKING when trying a next server

---
 src/providers/fail_over.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/providers/fail_over.c b/src/providers/fail_over.c
index e945c9924597c7addeeb11090e1c1aee5596cb71..1d88d2aa54bfdebd4b648e2b13fa8d03e2be3973 100644
--- a/src/providers/fail_over.c
+++ b/src/providers/fail_over.c
@@ -1546,7 +1546,7 @@ void fo_try_next_server(struct fo_service *service)
     service->active_server = 0;
 
     if (server->port_status == PORT_WORKING) {
-        server->port_status = PORT_NEUTRAL;
+        server->port_status = PORT_NOT_WORKING;
     }
 }
 
-- 
2.4.11
