Here are two other possible approaches to fix the issue in question without
disrupting special "sector after the last one" handling logic. One
patches DIOCGDELETE  code to not accept requests clearly beyond the
provider's media boundary and the other one extends g_delete_data() to
convert bio_length == 0 after biowait() into an error. Both effectively
achieve the same result in my tests, but the g_delete_data() version has a
chance to disrupt other kernel users. I am leaning towards DIOCGDELETE
patch being the least evil. I would open phabricator request once I collect
some input.

Thanks!

-Max

On Wed, Nov 7, 2018 at 11:46 AM Warner Losh <i...@bsdimp.com> wrote:

>
>
> On Wed, Nov 7, 2018 at 9:07 AM Rodney W. Grimes <
> free...@pdx.rh.cn85.dnsmgr.net> wrote:
>
>> > On Tue, 2018-11-06 at 16:17 -0800, Rodney W. Grimes wrote:
>> > > >
>> > > > Rodney, this was actually my original intention, however then I
>> noticed in
>> > > > the GEOM code there is at least one case when BIO_FLUSH request is
>> being
>> > > > generated internally with bio_offset == mediasize and bio_lenth ==
>> 0, so I
>> > > > thought there might be some need to allow such requests through.
>> But I'd
>> > > > happily go with the stricter rule if it does no harm. I simply
>> don't know
>> > > > enough about the intended use and the logic behind zero-length
>> transfers to
>> > > > make that call.
>> > > I am not sure enough on if mediasize is 0 based or not,
>> > > if it is then the error case should be fixed, and the
>> > > code you show below should also be fixed as it is
>> > > technically making a request beyond the end of device.
>> > >
>> > > I am also murky on why we are even doing a 0 size
>> > > operation and end of device, is that to validate
>> > > we can access all the media???If so then this wrong
>> > > code and wrong error return should be fixed as it
>> > > is off by 1.
>> > >
>> > > >
>> > > >
>> > > > -Max
>> > > >
>> > > > int
>> > > > g_io_flush(struct g_consumer *cp)
>> > > > {
>> > > > ...
>> > > > ????????bp = g_alloc_bio();
>> > > > ????????bp->bio_cmd = BIO_FLUSH;
>> > > > ...
>> > > > ????????bp->bio_offset = cp->provider->mediasize;
>> > > The above should have a - 1 on it.
>> > >
>> >
>> > Unless offset > mediasize is specifically a signal to downstream code
>> > in some way about how the flush is to be performed.
>>
>> Could very well be, should be documented some place though.
>>
>> > Nearly identical code to create a BIO_FLUSH bio appears in ufs softdeps
>> > and in zfs. Before starting to arbitrarily change code that has worked
>> > since 2006, it might be a good idea to track down why these values are
>> > set the way they are. Unfortunately, there is no clue in the commit
>> > logs, but maybe the author (pjd@, cc'd) can englighten us.
>>
>> I agree with that take on the situation, and it is why I asked
>> for a revert and investigation, rather than trying to solve
>> why we suddenly fail some regression tests.
>>
>
> BIO_FLUSH is primarily done to force ordering points, since they are one
> of the only users of BIO_ORDERED in the system (the other is one place in
> UFS that shouldn't be doing BIO_ORDERED in theory, but in practice it's
> hard to fix. The exact values of the request don't really matter for the
> most part, though flushing past the end is seems ill defined. There's no
> way we'd force out blocks past the end. NVME, SCSI and ATA all implement
> BIO_FLUSH as either a NOP, or as simple command to flush all caches to
> stable media. Other drivers that I looked at appear to do something
> similar, though I've not looked at every single one of them in detail.
>
> Warner
>

Attachment: patch2.diff
Description: Binary data

Attachment: patch1.diff
Description: Binary data

_______________________________________________
freebsd-geom@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-geom
To unsubscribe, send any mail to "freebsd-geom-unsubscr...@freebsd.org"

Reply via email to