On 11/19/2016 08:46 PM, The Lee-Man wrote:
> In this wonderful new world of systemd, I have an issue with stopping
> the iscsid service when the daemon has died or been killed.
> 
> My setup:
> * I have an iscsid.socket unit file, which is enabled and started
> * I have an iscsid.service unit file, which controls the iscsid daemon.
> This is disabled and not started
> 
> Normally, if I run a command like "iscsiadm -m discovery -t st -p
> SOME-TARGET", systemd notices that iscsiadm is trying to talk to iscsid
> through the socket, and it starts up iscsid. This is the cool part
> (IMHO) of systemd socket activation.
> 
> When I want to stop iscsid, I can just tell systemctl to do it via
> "systemctl stop iscsid", and it runs the "ExecStop=" command in the
> service unit file, which is "iscsiadm -k 0 2" before terminating the
> daemon process(es).
> 
> [NOTE: the "2" here in this command actually does nothing and is
> ignored, but I copied this from someplace else long ago, and the "2" was
> present there.]
> 
> It is of importance, in this case, that the ExecStop command actually
> sends an IPC message to the daemon (iscsid) requesting it to cleanly
> shut itself down. Herein lies the rub.
> 
> All of this works great until the daemon happens not to be running. You
> can simulate this with "kill -TERM $(pidof iscsid)" when the daemon is
> running. Now you are in a situation where systemd started the service
> and knows it is now not running, so it seems to send the ExecStop
> command to cleanly shut it down. This command hangs! It seems to be
> stuck in an infinite loop trying to send the shutdown command to the
> daemon, waiting for it to timeout, then trying again. The daemon starts
> up, sees an IPC error, and exits.
> 
> While this clearly seems like a systemd issue, I have found a workaround
> that looks clean. Well, as clean as the shutdown command is, anyway.

No, that's not the case.
This is a plain issue with iscsiadm; it's waiting in 'recv' to receive
an response packet from iscsid which of course never will come.
The reason why this becomes an issue now is that systemd holds the
socket for us, so the socket is available; hence 'recv()' will not
return with ENOTCONN or somesuch, indicating that the socket isn't present.
The socket _is_ present, it's just not sending any responses back.

Attached is a patch for open-iscsi to use 'poll()' before recv(), and
terminating it if we didn't get a response in time.

This doesn't do anything from the mentioned socket activation issue for
iscsid, but at least iscsiadm doesn't hang anymore.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Teamlead Storage & Networking
h...@suse.de                                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

-- 
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to open-iscsi+unsubscr...@googlegroups.com.
To post to this group, send email to open-iscsi@googlegroups.com.
Visit this group at https://groups.google.com/group/open-iscsi.
For more options, visit https://groups.google.com/d/optout.
>From 74a63d90b9fc4ce0c4c329b0562bb1358804385c Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <h...@suse.de>
Date: Mon, 21 Nov 2016 10:07:39 +0100
Subject: [PATCH] Use timeout when waiting for responses from iscsid

The server might already been terminated when iscsiadm tries to
send a request to it, hence we might be waiting forever for a reply.
With this patchset we're waiting at most one minute before giving up,
avoiding a hang in iscsiadm.

Signed-off-by: Hannes Reinecke <h...@suse.com>
---
 usr/config.h       |  1 +
 usr/discovery.c    | 16 ++++++++--------
 usr/host.c         |  2 +-
 usr/iscsi_sysfs.c  |  1 +
 usr/iscsiadm.c     | 11 ++++++-----
 usr/iscsid.c       |  2 +-
 usr/iscsid_req.c   | 51 +++++++++++++++++++++++++++++++++++++++++----------
 usr/iscsid_req.h   | 15 +++++++++------
 usr/iscsistart.c   |  4 ++--
 usr/session_info.c | 14 ++++++++------
 usr/session_info.h |  5 +++--
 11 files changed, 81 insertions(+), 41 deletions(-)

diff --git a/usr/config.h b/usr/config.h
index 2e36a0a..a464cfd 100644
--- a/usr/config.h
+++ b/usr/config.h
@@ -316,6 +316,7 @@ typedef struct discovery_rec {
 	discovery_type_e	type;
 	char			address[NI_MAXHOST];
 	int			port;
+	int			iscsid_req_tmo;
 	union {
 		struct iscsi_sendtargets_config	sendtargets;
 		struct iscsi_slp_config		slp;
diff --git a/usr/discovery.c b/usr/discovery.c
index ee26a90..9e17f07 100644
--- a/usr/discovery.c
+++ b/usr/discovery.c
@@ -66,7 +66,7 @@ static char initiator_name[TARGET_NAME_MAXLEN + 1];
 static char initiator_alias[TARGET_NAME_MAXLEN + 1];
 static struct iscsi_ev_context ipc_ev_context;
 
-static int request_initiator_name(void)
+static int request_initiator_name(int tmo)
 {
 	int rc;
 	iscsiadm_req_t req;
@@ -80,7 +80,7 @@ static int request_initiator_name(void)
 	memset(&req, 0, sizeof(req));
 	req.command = MGMT_IPC_CONFIG_INAME;
 
-	rc = iscsid_exec_req(&req, &rsp, 1);
+	rc = iscsid_exec_req(&req, &rsp, 1, tmo);
 	if (rc)
 		return rc;
 
@@ -90,7 +90,7 @@ static int request_initiator_name(void)
 	memset(&req, 0, sizeof(req));
 	req.command = MGMT_IPC_CONFIG_IALIAS;
 
-	rc = iscsid_exec_req(&req, &rsp, 0);
+	rc = iscsid_exec_req(&req, &rsp, 0, tmo);
 	if (rc)
 		/* alias is optional so return ok */
 		return 0;
@@ -347,7 +347,7 @@ int discovery_isns(void *data, struct iface_rec *iface,
 	if (iface && strlen(iface->iname))
 		iname = iface->iname;
 	else {
-		rc = request_initiator_name();
+		rc = request_initiator_name(drec->iscsid_req_tmo);
 		if (rc) {
 			log_error("Cannot perform discovery. Initiatorname "
 				  "required.");
@@ -458,7 +458,7 @@ int discovery_offload_sendtargets(int host_no, int do_login,
 	 * and get back the results. We should do this since it would
 	 * allows us to then process the results like software iscsi.
 	 */
-	rc = iscsid_exec_req(&req, &rsp, 1);
+	rc = iscsid_exec_req(&req, &rsp, 1, drec->iscsid_req_tmo);
 	if (rc) {
 		log_error("Could not offload sendtargets to %s.",
 			  drec->address);
@@ -806,7 +806,7 @@ static void iscsi_free_session(struct iscsi_session *session)
 
 static iscsi_session_t *
 iscsi_alloc_session(struct iscsi_sendtargets_config *config,
-		    struct iface_rec *iface, int *rc)
+		    struct iface_rec *iface, int *rc, int tmo)
 {
 	iscsi_session_t *session;
 
@@ -852,7 +852,7 @@ iscsi_alloc_session(struct iscsi_sendtargets_config *config,
 		strcpy(initiator_name, iface->iname);
 		/* MNC TODO add iface alias */
 	} else {
-		*rc = request_initiator_name();
+		*rc = request_initiator_name(tmo);
 		if (*rc) {
 			log_error("Cannot perform discovery. Initiatorname "
 				  "required.");
@@ -1575,7 +1575,7 @@ int discovery_sendtargets(void *fndata, struct iface_rec *iface,
 	iscsi_timer_clear(&connection_timer);
 
 	/* allocate a new session, and initialize default values */
-	session = iscsi_alloc_session(config, iface, &rc);
+	session = iscsi_alloc_session(config, iface, &rc, drec->iscsid_req_tmo);
 	if (rc)
 		return rc;
 
diff --git a/usr/host.c b/usr/host.c
index f2052d3..9b65fe0 100644
--- a/usr/host.c
+++ b/usr/host.c
@@ -251,7 +251,7 @@ static int host_info_print_tree(void *data, struct host_info *hinfo)
 	printf("\tSessions:\n");
 	printf("\t*********\n");
 
-	session_info_print_tree(&sessions, "\t", session_info_flags, 0);
+	session_info_print_tree(&sessions, "\t", session_info_flags, 0, -1);
 	session_info_free_list(&sessions);
 	return 0;
 }
diff --git a/usr/iscsi_sysfs.c b/usr/iscsi_sysfs.c
index 3a37a48..d87250c 100644
--- a/usr/iscsi_sysfs.c
+++ b/usr/iscsi_sysfs.c
@@ -1420,6 +1420,7 @@ int iscsi_sysfs_for_each_session(void *data, int *nr_found,
 	if (!info)
 		return ISCSI_ERR_NOMEM;
 
+	info->iscsid_req_tmo = -1;
 	n = scandir(ISCSI_SESSION_DIR, &namelist, trans_filter,
 		    alphasort);
 	if (n <= 0)
diff --git a/usr/iscsiadm.c b/usr/iscsiadm.c
index d7abef8..7f95096 100644
--- a/usr/iscsiadm.c
+++ b/usr/iscsiadm.c
@@ -256,7 +256,7 @@ str_to_portal_type(char *str)
 	return ptype;
 }
 
-static void kill_iscsid(int priority)
+static void kill_iscsid(int priority, int tmo)
 {
 	iscsiadm_req_t req;
 	iscsiadm_rsp_t rsp;
@@ -278,7 +278,7 @@ static void kill_iscsid(int priority)
 
 	memset(&req, 0, sizeof(req));
 	req.command = MGMT_IPC_IMMEDIATE_STOP;
-	rc = iscsid_exec_req(&req, &rsp, 0);
+	rc = iscsid_exec_req(&req, &rsp, 0, tmo);
 	if (rc) {
 		iscsi_err_print_msg(rc);
 		log_error("Could not stop iscsid. Trying sending iscsid "
@@ -745,7 +745,7 @@ static char *get_config_file(void)
 	memset(&req, 0, sizeof(req));
 	req.command = MGMT_IPC_CONFIG_FILE;
 
-	rc = iscsid_exec_req(&req, &rsp, 1);
+	rc = iscsid_exec_req(&req, &rsp, 1, ISCSID_REQ_TIMEOUT);
 	if (rc)
 		return NULL;
 
@@ -795,7 +795,7 @@ session_stats(void *data, struct session_info *info)
 	req.command = MGMT_IPC_SESSION_STATS;
 	req.u.session.sid = info->sid;
 
-	rc = iscsid_exec_req(&req, &rsp, 1);
+	rc = iscsid_exec_req(&req, &rsp, 1, info->iscsid_req_tmo);
 	if (rc)
 		return rc;
 
@@ -3256,6 +3256,7 @@ main(int argc, char **argv)
 	int packet_size=32, ping_count=1, ping_interval=0;
 	int do_discover = 0, sub_mode = -1;
 	int portal_type = -1;
+	int timeout = ISCSID_REQ_TIMEOUT;
 	struct sigaction sa_old;
 	struct sigaction sa_new;
 	struct list_head ifaces;
@@ -3441,7 +3442,7 @@ main(int argc, char **argv)
 	}
 
 	if (killiscsid >= 0) {
-		kill_iscsid(killiscsid);
+		kill_iscsid(killiscsid, timeout);
 		goto free_ifaces;
 	}
 
diff --git a/usr/iscsid.c b/usr/iscsid.c
index f8ffd23..c866432 100644
--- a/usr/iscsid.c
+++ b/usr/iscsid.c
@@ -274,7 +274,7 @@ static int sync_session(void *data, struct session_info *info)
 	memcpy(&req.u.session.rec, &rec, sizeof(node_rec_t));
 
 retry:
-	rc = iscsid_exec_req(&req, &rsp, 0);
+	rc = iscsid_exec_req(&req, &rsp, 0, info->iscsid_req_tmo);
 	if (rc == ISCSI_ERR_ISCSID_NOTCONN && retries < 30) {
 		retries++;
 		sleep(1);
diff --git a/usr/iscsid_req.c b/usr/iscsid_req.c
index 2950d74..7f06fe8 100644
--- a/usr/iscsid_req.c
+++ b/usr/iscsid_req.c
@@ -24,6 +24,7 @@
 #include <errno.h>
 #include <fcntl.h>
 #include <sys/un.h>
+#include <poll.h>
 #include <sys/types.h>
 #include <sys/socket.h>
 
@@ -33,6 +34,7 @@
 #include "iscsi_util.h"
 #include "config.h"
 #include "iscsi_err.h"
+#include "iscsid_req.h"
 #include "uip_mgmt_ipc.h"
 
 static void iscsid_startup(void)
@@ -118,16 +120,45 @@ int iscsid_request(int *fd, iscsiadm_req_t *req, int start_iscsid)
 	return ISCSI_SUCCESS;
 }
 
-int iscsid_response(int fd, iscsiadm_cmd_e cmd, iscsiadm_rsp_t *rsp)
+int iscsid_response(int fd, iscsiadm_cmd_e cmd, iscsiadm_rsp_t *rsp,
+		    int timeout)
 {
-	int iscsi_err;
+	size_t len = sizeof(*rsp);
+	int iscsi_err = ISCSI_ERR_ISCSID_COMM_ERR;
 	int err;
+	int poll_wait = 0;
 
-	if ((err = recv(fd, rsp, sizeof(*rsp), MSG_WAITALL)) != sizeof(*rsp)) {
-		log_error("got read error (%d/%d), daemon died?", err, errno);
-		iscsi_err = ISCSI_ERR_ISCSID_COMM_ERR;
-	} else
-		iscsi_err = rsp->err;
+	if (timeout == -1) {
+		timeout = ISCSID_REQ_TIMEOUT;
+		poll_wait = 1;
+	}
+	while (len) {
+		struct pollfd pfd;
+
+		pfd.fd = fd;
+		pfd.events = POLLIN;
+		err = poll(&pfd, 1, timeout);
+		if (!err) {
+			if (poll_wait)
+				continue;
+			return ISCSI_ERR_ISCSID_NOTCONN;
+		} else if (err < 0) {
+			if (errno == EINTR)
+				continue;
+			log_error("got poll error (%d/%d), daemon died?",
+				  err, errno);
+			return ISCSI_ERR_ISCSID_COMM_ERR;
+		} else if (!pfd.revents & POLLIN)
+			continue;
+		err = recv(fd, rsp, sizeof(*rsp), MSG_WAITALL);
+		if (err < 0) {
+			log_error("got read error (%d/%d), daemon died?",
+				  err, errno);
+		} else {
+			len -= err;
+			iscsi_err = rsp->err;
+		}
+	}
 	close(fd);
 
 	if (!iscsi_err && cmd != rsp->command)
@@ -136,7 +167,7 @@ int iscsid_response(int fd, iscsiadm_cmd_e cmd, iscsiadm_rsp_t *rsp)
 }
 
 int iscsid_exec_req(iscsiadm_req_t *req, iscsiadm_rsp_t *rsp,
-				int start_iscsid)
+		    int start_iscsid, int tmo)
 {
 	int fd;
 	int err;
@@ -145,7 +176,7 @@ int iscsid_exec_req(iscsiadm_req_t *req, iscsiadm_rsp_t *rsp,
 	if (err)
 		return err;
 
-	return iscsid_response(fd, req->command, rsp);
+	return iscsid_response(fd, req->command, rsp, tmo);
 }
 
 int iscsid_req_wait(iscsiadm_cmd_e cmd, int fd)
@@ -153,7 +184,7 @@ int iscsid_req_wait(iscsiadm_cmd_e cmd, int fd)
 	iscsiadm_rsp_t rsp;
 
 	memset(&rsp, 0, sizeof(iscsiadm_rsp_t));
-	return iscsid_response(fd, cmd, &rsp);
+	return iscsid_response(fd, cmd, &rsp, -1);
 }
 
 int iscsid_req_by_rec_async(iscsiadm_cmd_e cmd, node_rec_t *rec, int *fd)
diff --git a/usr/iscsid_req.h b/usr/iscsid_req.h
index 8cb4a92..67e509e 100644
--- a/usr/iscsid_req.h
+++ b/usr/iscsid_req.h
@@ -21,17 +21,20 @@
 #ifndef ISCSID_REQ_H_
 #define ISCSID_REQ_H
 
+#define ISCSID_REQ_TIMEOUT 1000
+
 struct iscsiadm_req;
 struct iscsiadm_rsp;
 struct node_rec;
 
 extern int iscsid_exec_req(struct iscsiadm_req *req, struct iscsiadm_rsp *rsp,
-			   int iscsid_start);
-extern int iscsid_req_wait(int cmd, int fd);
-extern int iscsid_req_by_rec_async(int cmd, struct node_rec *rec, int *fd);
-extern int iscsid_req_by_rec(int cmd, struct node_rec *rec);
-extern int iscsid_req_by_sid_async(int cmd, int sid, int *fd);
-extern int iscsid_req_by_sid(int cmd, int sid);
+			   int iscsid_start, int tmo);
+extern int iscsid_req_wait(iscsiadm_cmd_e cmd, int fd);
+extern int iscsid_req_by_rec_async(iscsiadm_cmd_e cmd, struct node_rec *rec,
+				   int *fd);
+extern int iscsid_req_by_rec(iscsiadm_cmd_e cmd, struct node_rec *rec);
+extern int iscsid_req_by_sid_async(iscsiadm_cmd_e cmd, int sid, int *fd);
+extern int iscsid_req_by_sid(iscsiadm_cmd_e cmd, int sid);
 
 extern int uip_broadcast(void *buf, size_t buf_len, int fd_flags,
 			 uint32_t *status);
diff --git a/usr/iscsistart.c b/usr/iscsistart.c
index 7ff2236..14b1eb5 100644
--- a/usr/iscsistart.c
+++ b/usr/iscsistart.c
@@ -123,7 +123,7 @@ static int stop_event_loop(void)
 
 	memset(&req, 0, sizeof(req));
 	req.command = MGMT_IPC_IMMEDIATE_STOP;
-	rc = iscsid_exec_req(&req, &rsp, 0);
+	rc = iscsid_exec_req(&req, &rsp, 0, -1);
 	if (rc) {
 		iscsi_err_print_msg(rc);
 		log_error("Could not stop event_loop");
@@ -235,7 +235,7 @@ static int login_session(struct node_rec *rec)
 	memcpy(&req.u.session.rec, rec, sizeof(*rec));
 
 retry:
-	rc = iscsid_exec_req(&req, &rsp, 0);
+	rc = iscsid_exec_req(&req, &rsp, 0, ISCSID_REQ_TIMEOUT);
 	/*
 	 * handle race where iscsid proc is starting up while we are
 	 * trying to connect.
diff --git a/usr/session_info.c b/usr/session_info.c
index 2f48e65..9a89bfc 100644
--- a/usr/session_info.c
+++ b/usr/session_info.c
@@ -93,7 +93,7 @@ static int session_info_print_flat(void *data, struct session_info *info)
 	return 0;
 }
 
-static int print_iscsi_state(int sid, char *prefix)
+static int print_iscsi_state(int sid, char *prefix, int tmo)
 {
 	iscsiadm_req_t req;
 	iscsiadm_rsp_t rsp;
@@ -120,7 +120,7 @@ static int print_iscsi_state(int sid, char *prefix)
 	req.command = MGMT_IPC_SESSION_INFO;
 	req.u.session.sid = sid;
 
-	err = iscsid_exec_req(&req, &rsp, 1);
+	err = iscsid_exec_req(&req, &rsp, 1, tmo);
 	/*
 	 * for drivers like qla4xxx, iscsid does not display
 	 * anything here since it does not know about it.
@@ -236,7 +236,7 @@ static int print_scsi_state(int sid, char *prefix, unsigned int flags)
 }
 
 void session_info_print_tree(struct list_head *list, char *prefix,
-			     unsigned int flags, int do_show)
+			     unsigned int flags, int do_show, int tmo)
 {
 	struct session_info *curr, *prev = NULL;
 
@@ -289,7 +289,7 @@ void session_info_print_tree(struct list_head *list, char *prefix,
 
 		if (flags & SESSION_INFO_ISCSI_STATE) {
 			printf("%s\t\tSID: %d\n", prefix, curr->sid);
-			print_iscsi_state(curr->sid, prefix);
+			print_iscsi_state(curr->sid, prefix, tmo);
 		}
 
 		if (flags & SESSION_INFO_ISCSI_TIM) {
@@ -406,7 +406,8 @@ int session_info_print(int info_level, struct session_info *info, int do_show)
 		if (info) {
 			INIT_LIST_HEAD(&info->list);
 			list_add_tail(&list, &info->list);
-			session_info_print_tree(&list, "", flags, do_show);
+			session_info_print_tree(&list, "", flags, do_show,
+						info->iscsid_req_tmo);
 			num_found = 1;
 			break;
 		}
@@ -421,7 +422,8 @@ int session_info_print(int info_level, struct session_info *info, int do_show)
 		if (err || !num_found)
 			break;
 
-		session_info_print_tree(&list, "", flags, do_show);
+		session_info_print_tree(&list, "", flags, do_show,
+					info->iscsid_req_tmo);
 		session_info_free_list(&list);
 		break;
 	default:
diff --git a/usr/session_info.h b/usr/session_info.h
index 726aefd..179b088 100644
--- a/usr/session_info.h
+++ b/usr/session_info.h
@@ -28,6 +28,7 @@ struct session_info {
 	/* local info */
 	struct iface_rec iface;
 	int sid;
+	int iscsid_req_tmo;
 
 	struct session_timeout tmo;
 	struct session_CHAP chap;
@@ -60,8 +61,8 @@ struct session_link_info {
 extern int session_info_create_list(void *data, struct session_info *info);
 extern void session_info_free_list(struct list_head *list);
 extern int session_info_print(int info_level, struct session_info *match_info,
-				int do_show);
+			      int do_show);
 extern void session_info_print_tree(struct list_head *list, char *prefix,
-				    unsigned int flags, int do_show);
+				    unsigned int flags, int do_show, int tmo);
 
 #endif
-- 
1.8.5.6

Reply via email to