Friendly weekly ping. Patches reattached. - ajf
Ashton Fagg <ash...@fagg.id.au> writes: > I have been working on fixing an issue (which was partially fixed by a > diff I sent in earlier this year) with iscsid. With iscsi disks in > /etc/fstab, sometimes the devices aren't fully up and ready before fsck > tries to run - causing the machine to go into single user mode on boot. > > The diff that was merged earlier in the year added a poll routine which > monitors for connection success before letting iscsictl return. This > fixed the issue in some cases, however it's still only half the fix. The > principled fix is to, additionally, wait until the scsi_probe calls are > complete - at which point we can reasonably assume the disk device are > ready for use. This requires some work to the vscsi driver to make this > happen, as well as changes to both iscsid and iscsictl. The diffs > attached here are a full implementation of this. > > I was still encountering this issue on one of my machines (much slower > than my normal machine), where the connections would be up but the > scsi_probe calls would not have completed - even with my earlier diff > this would still cause the machine to go to single user mode. However, > this indeed fixes that problem completely and I've been running it for a > couple of weeks with no problems. > > The diffs are designed to be applied in the order they appear. In > summary, the proposed changes are: > > (1) taskq.diff adds a taskq_empty function, which lets us check to see > if a taskq is, well, empty. This is used in (2). Updates the man pages > for taskq/task_add to reflect this. > > (2) vscsi.diff adds an ioctl to the vscsi driver which lets us monitor > for device event queue completion. To aid in this it also separates > calls to scsi_probe and scsi_detach to a dedicated taskq, rather than > using systq. Updates the man pages for vscsi to reflect this. > > (3) iscsid.diff does the plumbing to actually call the new ioctl and > provide that information back to iscsictl during status poll. > > (4) iscsictl.diff integrates the device queue information into the > polling routine. Updates the man page for iscsictl to describe the new > behavior. > > Based on commmit messages around vscsi it seems there was some plan to > do this quite some years ago but it's hard for me to know what the > proposed method was (though I assume what was envisaged might be similar > to something like this). > > Feedback sought and greatly welcomed. I am basically certain there are > ways this can be improved. > > Thanks, > > Ash
Index: share/man/man9/task_add.9 =================================================================== RCS file: /cvs/src/share/man/man9/task_add.9,v retrieving revision 1.22 diff -u -p -u -p -r1.22 task_add.9 --- share/man/man9/task_add.9 8 Jun 2020 00:29:51 -0000 1.22 +++ share/man/man9/task_add.9 13 Jul 2021 23:45:59 -0000 @@ -20,6 +20,7 @@ .Sh NAME .Nm taskq_create , .Nm taskq_destroy , +.Nm taskq_empty , .Nm taskq_barrier , .Nm taskq_del_barrier , .Nm task_set , @@ -43,6 +44,8 @@ .Fn taskq_barrier "struct taskq *tq" .Ft void .Fn taskq_del_barrier "struct taskq *tq" "struct task *t" +.Ft int +.Fn taskq_empty "struct taskq *tq" .Ft void .Fn task_set "struct task *t" "void (*fn)(void *)" "void *arg" .Ft int @@ -86,6 +89,9 @@ argument: The threads servicing the taskq will be run without the kernel big lock. .El .Pp +.Fn taskq_empty +indicates whether there are any queued tasks. +.Pp .Fn taskq_destroy causes the resources associated with a previously created taskq to be freed. It will wait till all the tasks in the work queue are completed before @@ -220,6 +226,9 @@ or 0 if the task was not already on the .Fn task_pending will return non-zero if the task is queued to run, or 0 if the task is not queued. +.Pp +.Fn taskq_empty +will return 1 if there are no tasks queued, or 0 otherwise. .Sh SEE ALSO .Xr autoconf 9 , .Xr spl 9 Index: sys/kern/kern_task.c =================================================================== RCS file: /cvs/src/sys/kern/kern_task.c,v retrieving revision 1.31 diff -u -p -u -p -r1.31 kern_task.c --- sys/kern/kern_task.c 1 Aug 2020 08:40:20 -0000 1.31 +++ sys/kern/kern_task.c 13 Jul 2021 23:46:00 -0000 @@ -394,6 +394,18 @@ task_del(struct taskq *tq, struct task * } int +taskq_empty(struct taskq *tq) +{ + int rv; + + mtx_enter(&tq->tq_mtx); + rv = TAILQ_EMPTY(&tq->tq_worklist); + mtx_leave(&tq->tq_mtx); + + return (rv); +} + +int taskq_next_work(struct taskq *tq, struct task *work) { struct task *next; Index: sys/sys/task.h =================================================================== RCS file: /cvs/src/sys/sys/task.h,v retrieving revision 1.18 diff -u -p -u -p -r1.18 task.h --- sys/sys/task.h 1 Aug 2020 08:40:20 -0000 1.18 +++ sys/sys/task.h 13 Jul 2021 23:46:00 -0000 @@ -51,6 +51,8 @@ void taskq_barrier(struct taskq *); void taskq_del_barrier(struct taskq *, struct task *); +int taskq_empty(struct taskq *); + void task_set(struct task *, void (*)(void *), void *); int task_add(struct taskq *, struct task *); int task_del(struct taskq *, struct task *);
Index: share/man/man4/vscsi.4 =================================================================== RCS file: /cvs/src/share/man/man4/vscsi.4,v retrieving revision 1.14 diff -u -p -u -p -r1.14 vscsi.4 --- share/man/man4/vscsi.4 12 Sep 2017 15:06:20 -0000 1.14 +++ share/man/man4/vscsi.4 13 Jul 2021 23:45:59 -0000 @@ -108,6 +108,14 @@ struct vscsi_ioc_devevent { u_int lun; }; .Ed +.Pp +.It Dv VSCSI_DEVEVENT_POLL Fa "int *" +Report status of device event queue (used by VSCSI_REQPROBE and +VSCSI_REQDETACH). +If the queue has been fully processed, the flag will +be non-zero. +If there are outstanding events, which user-space +applications may wish to wait for, the flag will be zero. .El .Sh FILES .Bl -tag -width /dev/vscsi0 Index: sys/dev/vscsi.c =================================================================== RCS file: /cvs/src/sys/dev/vscsi.c,v retrieving revision 1.58 diff -u -p -u -p -r1.58 vscsi.c --- sys/dev/vscsi.c 25 Dec 2020 12:59:52 -0000 1.58 +++ sys/dev/vscsi.c 13 Jul 2021 23:45:59 -0000 @@ -73,6 +73,8 @@ struct vscsi_softc { struct selinfo sc_sel; struct mutex sc_sel_mtx; + + struct taskq *sc_tq; }; #define DEVNAME(_s) ((_s)->sc_dev.dv_xname) @@ -104,6 +106,7 @@ int vscsi_t2i(struct vscsi_softc *, str int vscsi_devevent(struct vscsi_softc *, u_long, struct vscsi_ioc_devevent *); void vscsi_devevent_task(void *); +int vscsi_devevent_poll(struct vscsi_softc *, int *); void vscsi_done(struct vscsi_softc *, struct vscsi_ccb *); void * vscsi_ccb_get(void *); @@ -156,6 +159,10 @@ vscsi_attach(struct device *parent, stru sc->sc_scsibus = (struct scsibus_softc *)config_found(&sc->sc_dev, &saa, scsiprint); + + sc->sc_tq = taskq_create("vscsi", 1, IPL_BIO, 0); + if (sc->sc_tq == NULL) + panic("vscsi: Couldn't allocate task queue."); } void @@ -317,7 +324,9 @@ vscsiioctl(dev_t dev, u_long cmd, caddr_ err = vscsi_devevent(sc, cmd, (struct vscsi_ioc_devevent *)addr); break; - + case VSCSI_DEVEVENT_POLL: + err = vscsi_devevent_poll(sc, (int *)addr); + break; default: err = ENOTTY; break; @@ -487,7 +496,7 @@ vscsi_devevent(struct vscsi_softc *sc, u dt->cmd = cmd; device_ref(&sc->sc_dev); - task_add(systq, &dt->t); + task_add(sc->sc_tq, &dt->t); return (0); } @@ -525,6 +534,17 @@ gone: device_unref(&sc->sc_dev); free(dt, M_TEMP, sizeof(*dt)); +} + +int +vscsi_devevent_poll(struct vscsi_softc *sc, int *tq_status) +{ + if (tq_status == NULL) + return (EINVAL); + + *tq_status = taskq_empty(sc->sc_tq); + + return (0); } int Index: sys/dev/vscsivar.h =================================================================== RCS file: /cvs/src/sys/dev/vscsivar.h,v retrieving revision 1.5 diff -u -p -u -p -r1.5 vscsivar.h --- sys/dev/vscsivar.h 5 Apr 2011 15:28:49 -0000 1.5 +++ sys/dev/vscsivar.h 13 Jul 2021 23:45:59 -0000 @@ -67,5 +67,6 @@ struct vscsi_ioc_devevent { #define VSCSI_REQPROBE _IOW('I', 4, struct vscsi_ioc_devevent) #define VSCSI_REQDETACH _IOW('I', 5, struct vscsi_ioc_devevent) +#define VSCSI_DEVEVENT_POLL _IOR('I', 6, int) #endif /* _SYS_DEV_VSCSIVAR_H */
Index: usr.sbin/iscsid/iscsid.h =================================================================== RCS file: /cvs/src/usr.sbin/iscsid/iscsid.h,v retrieving revision 1.17 diff -u -p -u -p -r1.17 iscsid.h --- usr.sbin/iscsid/iscsid.h 16 Apr 2021 14:37:06 -0000 1.17 +++ usr.sbin/iscsid/iscsid.h 13 Jul 2021 23:46:01 -0000 @@ -259,6 +259,7 @@ struct session_poll { int conn_failed_count; int conn_waiting_count; int sess_conn_status; + int devev_queue_status; }; struct connection { @@ -399,6 +400,7 @@ void vscsi_dispatch(int, short, void *); void vscsi_data(unsigned long, int, void *, size_t); void vscsi_status(int, int, void *, size_t); void vscsi_event(unsigned long, u_int, u_int); +void vscsi_poll_devevent_queue(int *); struct vscsi_stats *vscsi_stats(void); /* Session polling */ Index: usr.sbin/iscsid/poll.c =================================================================== RCS file: /cvs/src/usr.sbin/iscsid/poll.c,v retrieving revision 1.1 diff -u -p -u -p -r1.1 poll.c --- usr.sbin/iscsid/poll.c 16 Apr 2021 14:37:06 -0000 1.1 +++ usr.sbin/iscsid/poll.c 13 Jul 2021 23:46:01 -0000 @@ -82,13 +82,24 @@ poll_session(struct session_poll *p, str } /* - * Set session_poll sess_conn_status based on the number of sessions - * versus number of logged in or failed sessions. If the numbers are - * equal iscsictl can stop polling since all sessions reached a stable state. + * Here, we finalize the polling by checking to see if + * the requested probe/detach ops have been completed + * by the vscsi driver. We only bother to do this once + * we know all the sessions have settled. */ void poll_finalize(struct session_poll *p) { + + /* + * If we aren't awaiting anything else to connect, the combined total of + * logged in and failed connections should equal the number of + * configured sessions. + */ p->sess_conn_status = (p->session_count == p->conn_logged_in_count + p->conn_failed_count); + + /* All connections up, so check the dev event queue. */ + if (p->sess_conn_status) + vscsi_poll_devevent_queue(&p->devev_queue_status); } Index: usr.sbin/iscsid/vscsi.c =================================================================== RCS file: /cvs/src/usr.sbin/iscsid/vscsi.c,v retrieving revision 1.17 diff -u -p -u -p -r1.17 vscsi.c --- usr.sbin/iscsid/vscsi.c 16 Aug 2016 18:41:57 -0000 1.17 +++ usr.sbin/iscsid/vscsi.c 13 Jul 2021 23:46:01 -0000 @@ -338,3 +338,10 @@ vscsi_stats(void) { return &v.stats; } + +void +vscsi_poll_devevent_queue(int *arg) +{ + if (ioctl(v.fd, VSCSI_DEVEVENT_POLL, arg) != 0) + fatal("vscsi_devevent_poll"); +}
Index: usr.sbin/iscsictl/iscsictl.8 =================================================================== RCS file: /cvs/src/usr.sbin/iscsictl/iscsictl.8,v retrieving revision 1.7 diff -u -p -u -p -r1.7 iscsictl.8 --- usr.sbin/iscsictl/iscsictl.8 5 May 2021 12:34:12 -0000 1.7 +++ usr.sbin/iscsictl/iscsictl.8 13 Jul 2021 23:46:01 -0000 @@ -47,8 +47,9 @@ to communicate with The following commands are available: .Bl -tag -width Ds .It Cm reload -Reload the configuration file and wait to return until iscsid reports all -connections have completed (successfully or otherwise), or for up to 10 +Reload the configuration file and wait to return until iscsid reports +all connections have completed (successfully or otherwise) and all +scsi_probe and scsi_detach requests have completed, or for up to 10 seconds. .It Cm show Op Cm summary Show a list of all configured sessions. Index: usr.sbin/iscsictl/iscsictl.c =================================================================== RCS file: /cvs/src/usr.sbin/iscsictl/iscsictl.c,v retrieving revision 1.12 diff -u -p -u -p -r1.12 iscsictl.c --- usr.sbin/iscsictl/iscsictl.c 16 Apr 2021 14:39:33 -0000 1.12 +++ usr.sbin/iscsictl/iscsictl.c 13 Jul 2021 23:46:01 -0000 @@ -431,7 +431,7 @@ poll_and_wait(void) poll_session_status(); /* Poll says we are good to go. */ - if (poll_result.sess_conn_status != 0) { + if (poll_result.sess_conn_status != 0 && poll_result.devev_queue_status != 0) { printf("ok\n"); /* wait a bit longer so all is settled. */ sleep(POLL_DELAY_SEC);