Friendly ping, really hoping someone can take a look. Diffs re-attached. Thanks,
Ash Ashton Fagg <ash...@fagg.id.au> writes: > Updated diffs attached. > > - I read style(9) a little more closely and worked out I had some > issues, so I corrected those. > > - Revisited the wording in my proposed documentation to make things a > clearer. > > - My mandoc changes were not lint clean - also fixed. > > No functional changes to the original diffs. > > Thanks, > ajf > > Ashton Fagg <ash...@fagg.id.au> writes: > >> I have been working on fixing an issue (which was partially fixed by a >> diff I sent in earlier this year) with iscsid. With iscsi disks in >> /etc/fstab, sometimes the devices aren't fully up and ready before fsck >> tries to run - causing the machine to go into single user mode on boot. >> >> The diff that was merged earlier in the year added a poll routine which >> monitors for connection success before letting iscsictl return. This >> fixed the issue in some cases, however it's still only half the fix. The >> principled fix is to, additionally, wait until the scsi_probe calls are >> complete - at which point we can reasonably assume the disk device are >> ready for use. This requires some work to the vscsi driver to make this >> happen, as well as changes to both iscsid and iscsictl. The diffs >> attached here are a full implementation of this. >> >> I was still encountering this issue on one of my machines (much slower >> than my normal machine), where the connections would be up but the >> scsi_probe calls would not have completed - even with my earlier diff >> this would still cause the machine to go to single user mode. However, >> this indeed fixes that problem completely and I've been running it for a >> couple of weeks with no problems. >> >> The diffs are designed to be applied in the order they appear. In >> summary, the proposed changes are: >> >> (1) taskq.diff adds a taskq_empty function, which lets us check to see >> if a taskq is, well, empty. This is used in (2). Updates the man pages >> for taskq/task_add to reflect this. >> >> (2) vscsi.diff adds an ioctl to the vscsi driver which lets us monitor >> for device event queue completion. To aid in this it also separates >> calls to scsi_probe and scsi_detach to a dedicated taskq, rather than >> using systq. Updates the man pages for vscsi to reflect this. >> >> (3) iscsid.diff does the plumbing to actually call the new ioctl and >> provide that information back to iscsictl during status poll. >> >> (4) iscsictl.diff integrates the device queue information into the >> polling routine. Updates the man page for iscsictl to describe the new >> behavior. >> >> Based on commmit messages around vscsi it seems there was some plan to >> do this quite some years ago but it's hard for me to know what the >> proposed method was (though I assume what was envisaged might be similar >> to something like this). >> >> Feedback sought and greatly welcomed. I am basically certain there are >> ways this can be improved.
Index: sys/kern/kern_task.c =================================================================== RCS file: /cvs/src/sys/kern/kern_task.c,v retrieving revision 1.31 diff -u -p -u -p -r1.31 kern_task.c --- sys/kern/kern_task.c 1 Aug 2020 08:40:20 -0000 1.31 +++ sys/kern/kern_task.c 20 Jun 2021 22:05:13 -0000 @@ -394,6 +394,18 @@ task_del(struct taskq *tq, struct task * } int +taskq_empty(struct taskq *tq) +{ + int rv; + + mtx_enter(&tq->tq_mtx); + rv = TAILQ_EMPTY(&tq->tq_worklist); + mtx_leave(&tq->tq_mtx); + + return (rv); +} + +int taskq_next_work(struct taskq *tq, struct task *work) { struct task *next; Index: share/man/man9/task_add.9 =================================================================== RCS file: /cvs/src/share/man/man9/task_add.9,v retrieving revision 1.22 diff -u -p -u -p -r1.22 task_add.9 --- share/man/man9/task_add.9 8 Jun 2020 00:29:51 -0000 1.22 +++ share/man/man9/task_add.9 20 Jun 2021 22:05:30 -0000 @@ -20,6 +20,7 @@ .Sh NAME .Nm taskq_create , .Nm taskq_destroy , +.Nm taskq_empty , .Nm taskq_barrier , .Nm taskq_del_barrier , .Nm task_set , @@ -43,6 +44,8 @@ .Fn taskq_barrier "struct taskq *tq" .Ft void .Fn taskq_del_barrier "struct taskq *tq" "struct task *t" +.Ft int +.Fn taskq_empty "struct taskq *tq" .Ft void .Fn task_set "struct task *t" "void (*fn)(void *)" "void *arg" .Ft int @@ -86,6 +89,9 @@ argument: The threads servicing the taskq will be run without the kernel big lock. .El .Pp +.Fn taskq_empty +indicates whether there are any queued tasks. +.Pp .Fn taskq_destroy causes the resources associated with a previously created taskq to be freed. It will wait till all the tasks in the work queue are completed before @@ -220,6 +226,9 @@ or 0 if the task was not already on the .Fn task_pending will return non-zero if the task is queued to run, or 0 if the task is not queued. +.Pp +.Fn taskq_empty +will return 1 if there are no tasks queued, or 0 otherwise. .Sh SEE ALSO .Xr autoconf 9 , .Xr spl 9
Index: sys/dev/vscsi.c =================================================================== RCS file: /cvs/src/sys/dev/vscsi.c,v retrieving revision 1.58 diff -u -p -u -p -r1.58 vscsi.c --- sys/dev/vscsi.c 25 Dec 2020 12:59:52 -0000 1.58 +++ sys/dev/vscsi.c 20 Jun 2021 22:04:02 -0000 @@ -73,6 +73,8 @@ struct vscsi_softc { struct selinfo sc_sel; struct mutex sc_sel_mtx; + + struct taskq *sc_tq; }; #define DEVNAME(_s) ((_s)->sc_dev.dv_xname) @@ -104,6 +106,7 @@ int vscsi_t2i(struct vscsi_softc *, str int vscsi_devevent(struct vscsi_softc *, u_long, struct vscsi_ioc_devevent *); void vscsi_devevent_task(void *); +int vscsi_devevent_poll(struct vscsi_softc *, int *); void vscsi_done(struct vscsi_softc *, struct vscsi_ccb *); void * vscsi_ccb_get(void *); @@ -156,6 +159,10 @@ vscsi_attach(struct device *parent, stru sc->sc_scsibus = (struct scsibus_softc *)config_found(&sc->sc_dev, &saa, scsiprint); + + sc->sc_tq = taskq_create("vscsi", 1, IPL_BIO, 0); + if (sc->sc_tq == NULL) + panic("vscsi: Couldn't allocate task queue."); } void @@ -317,7 +324,9 @@ vscsiioctl(dev_t dev, u_long cmd, caddr_ err = vscsi_devevent(sc, cmd, (struct vscsi_ioc_devevent *)addr); break; - + case VSCSI_DEVEVENT_POLL: + err = vscsi_devevent_poll(sc, (int *)addr); + break; default: err = ENOTTY; break; @@ -487,7 +496,7 @@ vscsi_devevent(struct vscsi_softc *sc, u dt->cmd = cmd; device_ref(&sc->sc_dev); - task_add(systq, &dt->t); + task_add(sc->sc_tq, &dt->t); return (0); } @@ -525,6 +534,17 @@ gone: device_unref(&sc->sc_dev); free(dt, M_TEMP, sizeof(*dt)); +} + +int +vscsi_devevent_poll(struct vscsi_softc *sc, int *tq_status) +{ + if (tq_status == NULL) + return (EINVAL); + + *tq_status = taskq_empty(sc->sc_tq); + + return (0); } int Index: sys/dev/vscsivar.h =================================================================== RCS file: /cvs/src/sys/dev/vscsivar.h,v retrieving revision 1.5 diff -u -p -u -p -r1.5 vscsivar.h --- sys/dev/vscsivar.h 5 Apr 2011 15:28:49 -0000 1.5 +++ sys/dev/vscsivar.h 20 Jun 2021 22:04:02 -0000 @@ -67,5 +67,6 @@ struct vscsi_ioc_devevent { #define VSCSI_REQPROBE _IOW('I', 4, struct vscsi_ioc_devevent) #define VSCSI_REQDETACH _IOW('I', 5, struct vscsi_ioc_devevent) +#define VSCSI_DEVEVENT_POLL _IOR('I', 6, int) #endif /* _SYS_DEV_VSCSIVAR_H */ Index: share/man/man4/vscsi.4 =================================================================== RCS file: /cvs/src/share/man/man4/vscsi.4,v retrieving revision 1.14 diff -u -p -u -p -r1.14 vscsi.4 --- share/man/man4/vscsi.4 12 Sep 2017 15:06:20 -0000 1.14 +++ share/man/man4/vscsi.4 20 Jun 2021 22:04:38 -0000 @@ -108,6 +108,14 @@ struct vscsi_ioc_devevent { u_int lun; }; .Ed +.Pp +.It Dv VSCSI_DEVEVENT_POLL Fa "int *" +Report status of device event queue (used by VSCSI_REQPROBE and +VSCSI_REQDETACH). +If the queue has been fully processed, the flag will +be non-zero. +If there are outstanding events, which user-space +applications may wish to wait for, the flag will be zero. .El .Sh FILES .Bl -tag -width /dev/vscsi0
Index: usr.sbin/iscsid/iscsid.h =================================================================== RCS file: /cvs/src/usr.sbin/iscsid/iscsid.h,v retrieving revision 1.17 diff -u -p -u -p -r1.17 iscsid.h --- usr.sbin/iscsid/iscsid.h 16 Apr 2021 14:37:06 -0000 1.17 +++ usr.sbin/iscsid/iscsid.h 20 Jun 2021 22:06:08 -0000 @@ -259,6 +259,7 @@ struct session_poll { int conn_failed_count; int conn_waiting_count; int sess_conn_status; + int devev_queue_status; }; struct connection { @@ -399,6 +400,7 @@ void vscsi_dispatch(int, short, void *); void vscsi_data(unsigned long, int, void *, size_t); void vscsi_status(int, int, void *, size_t); void vscsi_event(unsigned long, u_int, u_int); +void vscsi_poll_devevent_queue(int *); struct vscsi_stats *vscsi_stats(void); /* Session polling */ Index: usr.sbin/iscsid/poll.c =================================================================== RCS file: /cvs/src/usr.sbin/iscsid/poll.c,v retrieving revision 1.1 diff -u -p -u -p -r1.1 poll.c --- usr.sbin/iscsid/poll.c 16 Apr 2021 14:37:06 -0000 1.1 +++ usr.sbin/iscsid/poll.c 20 Jun 2021 22:06:08 -0000 @@ -82,13 +82,24 @@ poll_session(struct session_poll *p, str } /* - * Set session_poll sess_conn_status based on the number of sessions - * versus number of logged in or failed sessions. If the numbers are - * equal iscsictl can stop polling since all sessions reached a stable state. + * Here, we finalize the polling by checking to see if + * the requested probe/detach ops have been completed + * by the vscsi driver. We only bother to do this once + * we know all the sessions have settled. */ void poll_finalize(struct session_poll *p) { + + /* + * If we aren't awaiting anything else to connect, the combined total of + * logged in and failed connections should equal the number of + * configured sessions. + */ p->sess_conn_status = (p->session_count == p->conn_logged_in_count + p->conn_failed_count); + + /* All connections up, so check the dev event queue. */ + if (p->sess_conn_status) + vscsi_poll_devevent_queue(&p->devev_queue_status); } Index: usr.sbin/iscsid/vscsi.c =================================================================== RCS file: /cvs/src/usr.sbin/iscsid/vscsi.c,v retrieving revision 1.17 diff -u -p -u -p -r1.17 vscsi.c --- usr.sbin/iscsid/vscsi.c 16 Aug 2016 18:41:57 -0000 1.17 +++ usr.sbin/iscsid/vscsi.c 20 Jun 2021 22:06:08 -0000 @@ -338,3 +338,10 @@ vscsi_stats(void) { return &v.stats; } + +void +vscsi_poll_devevent_queue(int *arg) +{ + if (ioctl(v.fd, VSCSI_DEVEVENT_POLL, arg) != 0) + fatal("vscsi_devevent_poll"); +}
Index: usr.sbin/iscsictl/iscsictl.8 =================================================================== RCS file: /cvs/src/usr.sbin/iscsictl/iscsictl.8,v retrieving revision 1.7 diff -u -p -u -p -r1.7 iscsictl.8 --- usr.sbin/iscsictl/iscsictl.8 5 May 2021 12:34:12 -0000 1.7 +++ usr.sbin/iscsictl/iscsictl.8 20 Jun 2021 22:07:14 -0000 @@ -47,8 +47,9 @@ to communicate with The following commands are available: .Bl -tag -width Ds .It Cm reload -Reload the configuration file and wait to return until iscsid reports all -connections have completed (successfully or otherwise), or for up to 10 +Reload the configuration file and wait to return until iscsid reports +all connections have completed (successfully or otherwise) and all +scsi_probe and scsi_detach requests have completed, or for up to 10 seconds. .It Cm show Op Cm summary Show a list of all configured sessions. Index: usr.sbin/iscsictl/iscsictl.c =================================================================== RCS file: /cvs/src/usr.sbin/iscsictl/iscsictl.c,v retrieving revision 1.12 diff -u -p -u -p -r1.12 iscsictl.c --- usr.sbin/iscsictl/iscsictl.c 16 Apr 2021 14:39:33 -0000 1.12 +++ usr.sbin/iscsictl/iscsictl.c 20 Jun 2021 22:07:14 -0000 @@ -431,7 +431,7 @@ poll_and_wait(void) poll_session_status(); /* Poll says we are good to go. */ - if (poll_result.sess_conn_status != 0) { + if (poll_result.sess_conn_status != 0 && poll_result.devev_queue_status != 0) { printf("ok\n"); /* wait a bit longer so all is settled. */ sleep(POLL_DELAY_SEC);