Friendly ping, really hoping someone can take a look. Diffs re-attached.
Thanks,
Ash
Ashton Fagg <[email protected]> writes:
> Updated diffs attached.
>
> - I read style(9) a little more closely and worked out I had some
> issues, so I corrected those.
>
> - Revisited the wording in my proposed documentation to make things a
> clearer.
>
> - My mandoc changes were not lint clean - also fixed.
>
> No functional changes to the original diffs.
>
> Thanks,
> ajf
>
> Ashton Fagg <[email protected]> writes:
>
>> I have been working on fixing an issue (which was partially fixed by a
>> diff I sent in earlier this year) with iscsid. With iscsi disks in
>> /etc/fstab, sometimes the devices aren't fully up and ready before fsck
>> tries to run - causing the machine to go into single user mode on boot.
>>
>> The diff that was merged earlier in the year added a poll routine which
>> monitors for connection success before letting iscsictl return. This
>> fixed the issue in some cases, however it's still only half the fix. The
>> principled fix is to, additionally, wait until the scsi_probe calls are
>> complete - at which point we can reasonably assume the disk device are
>> ready for use. This requires some work to the vscsi driver to make this
>> happen, as well as changes to both iscsid and iscsictl. The diffs
>> attached here are a full implementation of this.
>>
>> I was still encountering this issue on one of my machines (much slower
>> than my normal machine), where the connections would be up but the
>> scsi_probe calls would not have completed - even with my earlier diff
>> this would still cause the machine to go to single user mode. However,
>> this indeed fixes that problem completely and I've been running it for a
>> couple of weeks with no problems.
>>
>> The diffs are designed to be applied in the order they appear. In
>> summary, the proposed changes are:
>>
>> (1) taskq.diff adds a taskq_empty function, which lets us check to see
>> if a taskq is, well, empty. This is used in (2). Updates the man pages
>> for taskq/task_add to reflect this.
>>
>> (2) vscsi.diff adds an ioctl to the vscsi driver which lets us monitor
>> for device event queue completion. To aid in this it also separates
>> calls to scsi_probe and scsi_detach to a dedicated taskq, rather than
>> using systq. Updates the man pages for vscsi to reflect this.
>>
>> (3) iscsid.diff does the plumbing to actually call the new ioctl and
>> provide that information back to iscsictl during status poll.
>>
>> (4) iscsictl.diff integrates the device queue information into the
>> polling routine. Updates the man page for iscsictl to describe the new
>> behavior.
>>
>> Based on commmit messages around vscsi it seems there was some plan to
>> do this quite some years ago but it's hard for me to know what the
>> proposed method was (though I assume what was envisaged might be similar
>> to something like this).
>>
>> Feedback sought and greatly welcomed. I am basically certain there are
>> ways this can be improved.
Index: sys/kern/kern_task.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_task.c,v
retrieving revision 1.31
diff -u -p -u -p -r1.31 kern_task.c
--- sys/kern/kern_task.c 1 Aug 2020 08:40:20 -0000 1.31
+++ sys/kern/kern_task.c 20 Jun 2021 22:05:13 -0000
@@ -394,6 +394,18 @@ task_del(struct taskq *tq, struct task *
}
int
+taskq_empty(struct taskq *tq)
+{
+ int rv;
+
+ mtx_enter(&tq->tq_mtx);
+ rv = TAILQ_EMPTY(&tq->tq_worklist);
+ mtx_leave(&tq->tq_mtx);
+
+ return (rv);
+}
+
+int
taskq_next_work(struct taskq *tq, struct task *work)
{
struct task *next;
Index: share/man/man9/task_add.9
===================================================================
RCS file: /cvs/src/share/man/man9/task_add.9,v
retrieving revision 1.22
diff -u -p -u -p -r1.22 task_add.9
--- share/man/man9/task_add.9 8 Jun 2020 00:29:51 -0000 1.22
+++ share/man/man9/task_add.9 20 Jun 2021 22:05:30 -0000
@@ -20,6 +20,7 @@
.Sh NAME
.Nm taskq_create ,
.Nm taskq_destroy ,
+.Nm taskq_empty ,
.Nm taskq_barrier ,
.Nm taskq_del_barrier ,
.Nm task_set ,
@@ -43,6 +44,8 @@
.Fn taskq_barrier "struct taskq *tq"
.Ft void
.Fn taskq_del_barrier "struct taskq *tq" "struct task *t"
+.Ft int
+.Fn taskq_empty "struct taskq *tq"
.Ft void
.Fn task_set "struct task *t" "void (*fn)(void *)" "void *arg"
.Ft int
@@ -86,6 +89,9 @@ argument:
The threads servicing the taskq will be run without the kernel big lock.
.El
.Pp
+.Fn taskq_empty
+indicates whether there are any queued tasks.
+.Pp
.Fn taskq_destroy
causes the resources associated with a previously created taskq to be freed.
It will wait till all the tasks in the work queue are completed before
@@ -220,6 +226,9 @@ or 0 if the task was not already on the
.Fn task_pending
will return non-zero if the task is queued to run, or 0 if the task
is not queued.
+.Pp
+.Fn taskq_empty
+will return 1 if there are no tasks queued, or 0 otherwise.
.Sh SEE ALSO
.Xr autoconf 9 ,
.Xr spl 9
Index: sys/dev/vscsi.c
===================================================================
RCS file: /cvs/src/sys/dev/vscsi.c,v
retrieving revision 1.58
diff -u -p -u -p -r1.58 vscsi.c
--- sys/dev/vscsi.c 25 Dec 2020 12:59:52 -0000 1.58
+++ sys/dev/vscsi.c 20 Jun 2021 22:04:02 -0000
@@ -73,6 +73,8 @@ struct vscsi_softc {
struct selinfo sc_sel;
struct mutex sc_sel_mtx;
+
+ struct taskq *sc_tq;
};
#define DEVNAME(_s) ((_s)->sc_dev.dv_xname)
@@ -104,6 +106,7 @@ int vscsi_t2i(struct vscsi_softc *, str
int vscsi_devevent(struct vscsi_softc *, u_long,
struct vscsi_ioc_devevent *);
void vscsi_devevent_task(void *);
+int vscsi_devevent_poll(struct vscsi_softc *, int *);
void vscsi_done(struct vscsi_softc *, struct vscsi_ccb *);
void * vscsi_ccb_get(void *);
@@ -156,6 +159,10 @@ vscsi_attach(struct device *parent, stru
sc->sc_scsibus = (struct scsibus_softc *)config_found(&sc->sc_dev,
&saa, scsiprint);
+
+ sc->sc_tq = taskq_create("vscsi", 1, IPL_BIO, 0);
+ if (sc->sc_tq == NULL)
+ panic("vscsi: Couldn't allocate task queue.");
}
void
@@ -317,7 +324,9 @@ vscsiioctl(dev_t dev, u_long cmd, caddr_
err = vscsi_devevent(sc, cmd,
(struct vscsi_ioc_devevent *)addr);
break;
-
+ case VSCSI_DEVEVENT_POLL:
+ err = vscsi_devevent_poll(sc, (int *)addr);
+ break;
default:
err = ENOTTY;
break;
@@ -487,7 +496,7 @@ vscsi_devevent(struct vscsi_softc *sc, u
dt->cmd = cmd;
device_ref(&sc->sc_dev);
- task_add(systq, &dt->t);
+ task_add(sc->sc_tq, &dt->t);
return (0);
}
@@ -525,6 +534,17 @@ gone:
device_unref(&sc->sc_dev);
free(dt, M_TEMP, sizeof(*dt));
+}
+
+int
+vscsi_devevent_poll(struct vscsi_softc *sc, int *tq_status)
+{
+ if (tq_status == NULL)
+ return (EINVAL);
+
+ *tq_status = taskq_empty(sc->sc_tq);
+
+ return (0);
}
int
Index: sys/dev/vscsivar.h
===================================================================
RCS file: /cvs/src/sys/dev/vscsivar.h,v
retrieving revision 1.5
diff -u -p -u -p -r1.5 vscsivar.h
--- sys/dev/vscsivar.h 5 Apr 2011 15:28:49 -0000 1.5
+++ sys/dev/vscsivar.h 20 Jun 2021 22:04:02 -0000
@@ -67,5 +67,6 @@ struct vscsi_ioc_devevent {
#define VSCSI_REQPROBE _IOW('I', 4, struct vscsi_ioc_devevent)
#define VSCSI_REQDETACH _IOW('I', 5, struct vscsi_ioc_devevent)
+#define VSCSI_DEVEVENT_POLL _IOR('I', 6, int)
#endif /* _SYS_DEV_VSCSIVAR_H */
Index: share/man/man4/vscsi.4
===================================================================
RCS file: /cvs/src/share/man/man4/vscsi.4,v
retrieving revision 1.14
diff -u -p -u -p -r1.14 vscsi.4
--- share/man/man4/vscsi.4 12 Sep 2017 15:06:20 -0000 1.14
+++ share/man/man4/vscsi.4 20 Jun 2021 22:04:38 -0000
@@ -108,6 +108,14 @@ struct vscsi_ioc_devevent {
u_int lun;
};
.Ed
+.Pp
+.It Dv VSCSI_DEVEVENT_POLL Fa "int *"
+Report status of device event queue (used by VSCSI_REQPROBE and
+VSCSI_REQDETACH).
+If the queue has been fully processed, the flag will
+be non-zero.
+If there are outstanding events, which user-space
+applications may wish to wait for, the flag will be zero.
.El
.Sh FILES
.Bl -tag -width /dev/vscsi0
Index: usr.sbin/iscsid/iscsid.h
===================================================================
RCS file: /cvs/src/usr.sbin/iscsid/iscsid.h,v
retrieving revision 1.17
diff -u -p -u -p -r1.17 iscsid.h
--- usr.sbin/iscsid/iscsid.h 16 Apr 2021 14:37:06 -0000 1.17
+++ usr.sbin/iscsid/iscsid.h 20 Jun 2021 22:06:08 -0000
@@ -259,6 +259,7 @@ struct session_poll {
int conn_failed_count;
int conn_waiting_count;
int sess_conn_status;
+ int devev_queue_status;
};
struct connection {
@@ -399,6 +400,7 @@ void vscsi_dispatch(int, short, void *);
void vscsi_data(unsigned long, int, void *, size_t);
void vscsi_status(int, int, void *, size_t);
void vscsi_event(unsigned long, u_int, u_int);
+void vscsi_poll_devevent_queue(int *);
struct vscsi_stats *vscsi_stats(void);
/* Session polling */
Index: usr.sbin/iscsid/poll.c
===================================================================
RCS file: /cvs/src/usr.sbin/iscsid/poll.c,v
retrieving revision 1.1
diff -u -p -u -p -r1.1 poll.c
--- usr.sbin/iscsid/poll.c 16 Apr 2021 14:37:06 -0000 1.1
+++ usr.sbin/iscsid/poll.c 20 Jun 2021 22:06:08 -0000
@@ -82,13 +82,24 @@ poll_session(struct session_poll *p, str
}
/*
- * Set session_poll sess_conn_status based on the number of sessions
- * versus number of logged in or failed sessions. If the numbers are
- * equal iscsictl can stop polling since all sessions reached a stable state.
+ * Here, we finalize the polling by checking to see if
+ * the requested probe/detach ops have been completed
+ * by the vscsi driver. We only bother to do this once
+ * we know all the sessions have settled.
*/
void
poll_finalize(struct session_poll *p)
{
+
+ /*
+ * If we aren't awaiting anything else to connect, the combined total of
+ * logged in and failed connections should equal the number of
+ * configured sessions.
+ */
p->sess_conn_status = (p->session_count == p->conn_logged_in_count +
p->conn_failed_count);
+
+ /* All connections up, so check the dev event queue. */
+ if (p->sess_conn_status)
+ vscsi_poll_devevent_queue(&p->devev_queue_status);
}
Index: usr.sbin/iscsid/vscsi.c
===================================================================
RCS file: /cvs/src/usr.sbin/iscsid/vscsi.c,v
retrieving revision 1.17
diff -u -p -u -p -r1.17 vscsi.c
--- usr.sbin/iscsid/vscsi.c 16 Aug 2016 18:41:57 -0000 1.17
+++ usr.sbin/iscsid/vscsi.c 20 Jun 2021 22:06:08 -0000
@@ -338,3 +338,10 @@ vscsi_stats(void)
{
return &v.stats;
}
+
+void
+vscsi_poll_devevent_queue(int *arg)
+{
+ if (ioctl(v.fd, VSCSI_DEVEVENT_POLL, arg) != 0)
+ fatal("vscsi_devevent_poll");
+}
Index: usr.sbin/iscsictl/iscsictl.8
===================================================================
RCS file: /cvs/src/usr.sbin/iscsictl/iscsictl.8,v
retrieving revision 1.7
diff -u -p -u -p -r1.7 iscsictl.8
--- usr.sbin/iscsictl/iscsictl.8 5 May 2021 12:34:12 -0000 1.7
+++ usr.sbin/iscsictl/iscsictl.8 20 Jun 2021 22:07:14 -0000
@@ -47,8 +47,9 @@ to communicate with
The following commands are available:
.Bl -tag -width Ds
.It Cm reload
-Reload the configuration file and wait to return until iscsid reports all
-connections have completed (successfully or otherwise), or for up to 10
+Reload the configuration file and wait to return until iscsid reports
+all connections have completed (successfully or otherwise) and all
+scsi_probe and scsi_detach requests have completed, or for up to 10
seconds.
.It Cm show Op Cm summary
Show a list of all configured sessions.
Index: usr.sbin/iscsictl/iscsictl.c
===================================================================
RCS file: /cvs/src/usr.sbin/iscsictl/iscsictl.c,v
retrieving revision 1.12
diff -u -p -u -p -r1.12 iscsictl.c
--- usr.sbin/iscsictl/iscsictl.c 16 Apr 2021 14:39:33 -0000 1.12
+++ usr.sbin/iscsictl/iscsictl.c 20 Jun 2021 22:07:14 -0000
@@ -431,7 +431,7 @@ poll_and_wait(void)
poll_session_status();
/* Poll says we are good to go. */
- if (poll_result.sess_conn_status != 0) {
+ if (poll_result.sess_conn_status != 0 && poll_result.devev_queue_status != 0) {
printf("ok\n");
/* wait a bit longer so all is settled. */
sleep(POLL_DELAY_SEC);