On Wed, 11 Oct 2023 08:42:20 -0400
Jason Andryuk <jandr...@gmail.com> wrote:

> Hi, Glenn,
> 
> On Tue, Oct 10, 2023 at 2:23 PM Glenn Washburn
> <developm...@efficientek.com> wrote:
> >
> > On Tue, 10 Oct 2023 12:24:24 -0400
> > Jason Andryuk <jandr...@gmail.com> wrote:
> >
> > > grub_gzio_read_real() uses grub_errno to identify decompression failures
> > > when the function completes.  Its callees in gzio.c are void-returning
> > > functions that set grub_errno on their exit paths.
> > >
> > > A Fedora 38 machine was observed to fail `multiboot2 /xen.gz` with
> > > "premature end of file".  xen.gz was well formed and could be
> > > successfully gunzip-ed in Linux.
> > >
> > > The TPM is flaky and produces errors when the verifier tries to extend
> > > PCRs with measurements.  Logs show "Unkown TPM error" and grub_errno is
> > > set to GRUB_ERR_UNKNOWN_DEVICE.  This pre-existing grub_errno causes an
> > > otherwise successful grub_gzio_read_real() call to return error.
> > >
> > > Clear grub_errno at the start of the function to avoid such errors.
> >
> > This seems to be a somewhat common theme. This is similar to the issue
> > that I have seen in grub_disk_read() (see the first patch in the "More
> > ls improvements" series). I think GRUB should have an explicit policy
> > about grub_errno usage. Vladimir has stated "We generally prefer to have
> > grub_print_error() (better) or resetting grib_errno after the error is
> > produced rather than blanketly reset grub_errno at the beginning". That
> > then suggests that its the callers responsibility to ensure that
> > grub_errno is cleared before the call. I don't particularly like that,
> > as it seems more economical for grub_errno to be cleared in the callee.
> > Since GRUB seems to try to abide by some POSIX/GNU libc semantics, it
> > seems reasonable to me to try to offer the same behavior as libc's
> > errno. So caller's can not expect grub_errno to be preserved across
> > library calls (ie all exported functions). And the callee is free to
> > modify grub_errno at will. If the caller wants to preserve grub_errno
> > across exported function calls, then the caller should save grub_errno
> > in a local.
> >
> > So I'm for this approach in principle.
> >
> > > Signed-off-by: Jason Andryuk <jandr...@gmail.com>
> > > ---
> > >  grub-core/io/gzio.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/grub-core/io/gzio.c b/grub-core/io/gzio.c
> > > index ca9355751..1e31a8255 100644
> > > --- a/grub-core/io/gzio.c
> > > +++ b/grub-core/io/gzio.c
> > > @@ -1294,6 +1294,14 @@ grub_gzio_read_real (grub_gzio_t gzio, grub_off_t 
> > > offset,
> > >  {
> > >    grub_ssize_t ret = 0;
> > >
> > > +  /* Avoid spurious failures on exit when grub_errno is already set. */
> > > +  if (grub_errno != GRUB_ERR_NONE)
> > > +    {
> > > +      grub_dprintf ("gzio", "%s: clearing pre-exising errmsg %s\n",
> > > +                  __func__, grub_errmsg);
> > > +      grub_errno = GRUB_ERR_NONE;
> > > +    }
> >
> > I think it makes sense to make this an inline called
> > "grub_print_uncleared_error" or something of the like to be used in
> > other places where this is appropriate. And the debug conditional
> > should be "errno" so we can see where this is happening in other areas
> > like grub_disk_read() to help in debugging. I suspect this type of
> > issue has caused some strange behavior I've seen in the past. Also
> > using a different debug conditional allows that type of issue to be
> > ignored without ignoring other debug messages from, in this case, gzio.
> >
> > And, I would have the "grub_errno = GRUB_ERR_NONE;" outside the if
> > block, which signals the expectation that grub_errno is cleared
> > unconditionally more clearly (if only slightly). I'm kinda ambivalent
> > about this though.
> 
> I think this all sounds good.  However, I think `grub_clear_errno` is
> a better name since that is the main purpose of the function.  Maybe
> `grub_reset_errno`.  The dprintf call I mainly added as a debugging
> aid and isn't the purpose of the function.  In the (hopefully small)

Yeah, I got that. Just having a one line change of "grub_errno =
GRUB_ERR_NONE;" is all that is needed. And if that were the change, I
wouldn't have suggested the inline function. For me the main purpose of
having the function is to have one line to be added to certain grub
functions that shows the debug statement for those debugging this kind
of issue. If its as Vladimir says and grub_errno should be cleared after
its produced, then we want to know when that expectation fails, as its
has been and I suspect will be the source of errors.

Another option is to have the function just be a macro called
"grub_debug_errno" defined as something like "if(grub_errno !=
GRUB_ERR_NONE) { grub_drpintf(...); }" and have always add two
statements to the start of exported functions, "grub_debug_errno();
grub_errno = GRUB_ERR_NONE;". This might be more clear to the casual
reader.

> chance that there was a stale, unprinted value in grub_errno, it seems
> better to print it out before clearing.

I think it depends on what the expectations are for grub_errno usage.
If it should be have like libc errno, then not really. Anyway, before
making the next revision, it would be good to hear from Daniel or
Vladimir.

Glenn

_______________________________________________
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel

Reply via email to