On Mon,  2 Dec 2013 10:19:37 -0500 Prarit Bhargava <pra...@redhat.com> wrote:

> Logging and tracking firmware bugs in the kernel has long been an issue
> for system administrators.  The current kernel does not have a good
> uniform method of reporting firmware bugs and the code in the kernel is a
> mix of printk's and WARN_ONs.  This causes problems for both system
> administrators and QA engineers who attempt to diagnose problems within
> the kernel.
> 
> Using printk's is somewhat effective but lacks information useful for
> reporting a bug such as the system vendor or model, BIOS revision, etc.
> Using WARN_ONs is also questionable because the data like the backtrace
> and the list of modules is usually unnecessary for firmware issues as the
> warning stems from one call path during system or driver initialization.
> We have heard many complaints from users about the excess verbosity and
> confusing stacktraces for these messages.
> 
> I'm proposing with this patch to do something similar to the WARN()
> mechanism that is currently implemented in the kernel.  This
> patchset introduces FW_INFO() and FW_INFO_DEV() which logs output like:
> 
> [  230.661137] [Firmware Info]: pci_bus 0000:00: at
> /home/prarit_modules/prarit.c:21 Your BIOS is broken because it is
> -ENOWORKY.
> [  230.671076] [Firmware Info]: Intel Corporation SandyBridge Platform/To
> be filled by O.E.M., BIOS RMLCRB.86I.R3.27.D685.1305151733 05/15/2013
> 
> instead of the verbose back traces we are currently seeing.  These messages
> can be easily gleaned from /var/log/messages, etc., by automatic bug
> reporting tools and system administrators to properly report bugs to
> hardware vendors.
> 
> I found an improperly classified FW_INFO in arch/x86/kernel/cpu/amd.c
> which that should be a FW_BUG.

A slight simplification:

> +static inline char *dump_hadware_arch_desc(void)
> +{
> +     return NULL;
> +}

        return "unavailable";

> +void warn_slowpath_fmt_dev(const char *file, int line,
> +                        struct device *dev, int level, const char *fmt, ...)
> +{
> +     struct slowpath_args args;
> +     static char fw_str[16] = "\0";
> +
> +     switch (level) {
> +     case 1:
> +             strcpy(fw_str, "[Firmware Info]");
> +             break;
> +     case 2:
> +             strcpy(fw_str, "[Firmware Warn]");
> +             break;
> +     case 3:
> +             strcpy(fw_str, "[Firmware Bug]");

What's With The Crazy Capitalization In This Code?  It's Illiterate!

> +             add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> +             break;
> +     default:
> +             strcpy(fw_str, "[Firmware Bug]");
> +             break;
> +     }
> +
> +     if (dev)
> +             pr_info("%s: %s %s ", fw_str,
> +                     dev_driver_string(dev), dev_name(dev));
> +     pr_info("%s: at %s:%d\n", fw_str, file, line);
> +
> +     args.fmt = fmt;
> +     va_start(args.args, fmt);
> +     printk(KERN_WARNING "%s: %pV", fw_str, &args);
> +     va_end(args.args);
> +
> +     if (dump_hardware_arch_desc())
> +             pr_info("%s: System Info: %s\n", fw_str,
> +                     dump_hardware_arch_desc());
> +     else
> +             pr_info("%s: System Info: Hardware Unidentified\n", fw_str);

        pr_info(""%s: system info: %s\n", fw_str, dump_hardware_arch_desc());

> +}
> +EXPORT_SYMBOL(warn_slowpath_fmt_dev);
> +
> +

> +char *dump_hardware_arch_desc(void)
> +{
> +     if (dump_stack_arch_desc_str[0] != '\0')
> +             return dump_stack_arch_desc_str;
> +     return NULL;

        return "unavaliable";

> +}


The general thrust of the patchset seems useful and important.  I do
agree with Joe's suggestion regarding the presentation, although it
isn't a show-stopper.

I do wonder if it all should be generalised a bit - it creates a bunch
of infrastructure which is specific to system firmware issues, but
what's so special about firmware?  Why can't I use this infrastructure
to log netdev errors or acpi errors or PM errors or...?  But I didn't
think about it much ;)


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to