On Wed, Jul 13, 2022 at 02:18:53PM +0200, Otto Moerbeek wrote:

> Hi,
> 
> after a prompt from stsp@ and florian@, reporting that newfs_msdos
> fails when given an $duid.i argument, I set down to see what could be
> going on. My test using an USB stick failed to reprdouce the problem.
> Then I started using a vnd, and that shows the issue (once in a
> while). The feeling is that any disk devcied created on the fly might
> show this issue.
> 
> What I have found out so far: sometimes, disk_subr.c:disk_map()
> fails, because the device in question does not have the DKF_LABELVALID
> flag set, so it is skipped in the duid search.
> 
> The only place where DKF_LABELVALID is set is in
> disk_attach_callback().
> 
> Often the disk_readlabel() call in this function succeeds and the flag
> is set, but also often it does not succeed. 
> 
> I have observed it failing setting the message to "cannot read disk
> label, 0xe32/0x2932, error 25", which seems to come from the
> DIOCGPDINFO NOTTY case in dev/vnc.c
> 
> This is how far I got.

Looking a bit more, I see that the attachment is done from a task. I
suppose that if the open happens before the task is run, we end up
with a flag not set. 


> 
> I am using the test script below. In my case, instrumenting
> subr_disk.c with printf made the problem occur less often, so I
> suspect a race between the disk attachment and the label becoming
> available.
> 
> The vnconfig triggers the disk attachment message (but only the first time a
> vnd is created, it seems, I am not seeing any messages after a vnconfig
> -u and re-running the script).
> 
> 
> === script ===
> set -e
> dd if=/dev/random of=/tmp/image$1 bs=1m count=100
> vnconfig vnd$1 /tmp/image$1
> fdisk -ig vnd$1
> echo "a\ni\n\n\n\nw\nl\nq\n" | disklabel -E vnd$1
> ==============
> 
> Try to run newfs_msdos on the duid.i reported. Note that you need to
> reboot to see a change a behaviour on a specific vnd. If it starts out
> OK, it keeps on being OK, if it fails, it keeps on failing. This
> corresponds to my observation that the disk_attach_callback() is
> called only once for a specifc vnd, even after unconfig and reconfig.
> 
> Hope somebody who knows how this all shoudl work can see where the
> issue is.
> 
>       -Otto
> 
> Index: subr_disk.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/subr_disk.c,v
> retrieving revision 1.248
> diff -u -p -r1.248 subr_disk.c
> --- subr_disk.c       2 Jan 2022 17:26:14 -0000       1.248
> +++ subr_disk.c       13 Jul 2022 12:02:05 -0000
> @@ -1118,8 +1118,10 @@ disk_attach_callback(void *xdat)
>       /* Read disklabel. */
>       if (disk_readlabel(&dl, dk->dk_devno, errbuf, sizeof(errbuf)) == NULL) {
>               enqueue_randomness(dl.d_checksum);
> +             printf("disk_attach_callback %s: seting DKF_LABELVALID\n", 
> dk->dk_name);
>               dk->dk_flags |= DKF_LABELVALID;
>       }
> +     else printf("disk_attach_callback %s: NOT seting DKF_LABELVALID 
> (%s)\n", dk->dk_name, errbuf);
>  
>  done:
>       dk->dk_flags |= DKF_OPENED;
> @@ -1785,6 +1787,7 @@ disk_map(char *path, char *mappath, int 
>  
>       mdk = NULL;
>       TAILQ_FOREACH(dk, &disklist, dk_link) {
> +             printf("YYY %p %s %x %p\n", dk, dk->dk_name, dk->dk_flags, 
> dk->dk_label);
>               if ((dk->dk_flags & DKF_LABELVALID) && dk->dk_label &&
>                   memcmp(dk->dk_label->d_uid, uid,
>                   sizeof(dk->dk_label->d_uid)) == 0) {
> @@ -1795,8 +1798,10 @@ disk_map(char *path, char *mappath, int 
>               }
>       }
>  
> -     if (mdk == NULL || mdk->dk_name == NULL)
> +     if (mdk == NULL || mdk->dk_name == NULL) {
> +             printf("XXX %p\n", mdk);
>               return -1;
> +     }
>  
>       snprintf(mappath, size, "/dev/%s%s%c",
>           (flags & DM_OPENBLCK) ? "" : "r", mdk->dk_name, part);
> 

Reply via email to