On Fri, Jun 10, 2016 at 11:23:56PM -0400, David Kershner wrote:
> visorbus is currently located at drivers/staging/visorbus,
> this patch moves it to drivers/virt.
> 
> Signed-off-by: David Kershner <david.kersh...@unisys.com>
> Reviewed-by: Tim Sell <timothy.s...@unisys.com>
> ---
>  drivers/staging/unisys/Kconfig                                        | 3 +--
>  drivers/staging/unisys/Makefile                                       | 1 -
>  drivers/virt/Kconfig                                                  | 2 ++
>  drivers/virt/Makefile                                                 | 1 +
>  drivers/{staging/unisys => virt}/visorbus/Kconfig                     | 0
>  drivers/{staging/unisys => virt}/visorbus/Makefile                    | 0
>  drivers/{staging/unisys => virt}/visorbus/controlvmchannel.h          | 0
>  drivers/{staging/unisys => virt}/visorbus/controlvmcompletionstatus.h | 0
>  drivers/{staging/unisys => virt}/visorbus/iovmcall_gnuc.h             | 0
>  drivers/{staging/unisys => virt}/visorbus/vbuschannel.h               | 0
>  drivers/{staging/unisys => virt}/visorbus/vbusdeviceinfo.h            | 0
>  drivers/{staging/unisys => virt}/visorbus/vbushelper.h                | 0
>  drivers/{staging/unisys => virt}/visorbus/visorbus_main.c             | 0
>  drivers/{staging/unisys => virt}/visorbus/visorbus_private.h          | 0
>  drivers/{staging/unisys => virt}/visorbus/visorchannel.c              | 0
>  drivers/{staging/unisys => virt}/visorbus/visorchipset.c              | 0

I picked one random file here, this last one, and found a number of
"odd" things in it.

So, given that I can't really comment on the patch itself, I'm going to
include the file below, quote it, and then provide some comments, ok?


> /* visorchipset_main.c
> >  *
>  * Copyright (C) 2010 - 2015 UNISYS CORPORATION
>  * All rights reserved.
>  *
>  * This program is free software; you can redistribute it and/or modify it
>  * under the terms and conditions of the GNU General Public License,
>  * version 2, as published by the Free Software Foundation.
>  *
>  * This program is distributed in the hope that it will be useful, but
>  * WITHOUT ANY WARRANTY; without even the implied warranty of
>  * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
>  * NON INFRINGEMENT.  See the GNU General Public License for more
>  * details.
>  */
> 
> #include <linux/acpi.h>
> #include <linux/cdev.h>
> #include <linux/ctype.h>
> #include <linux/fs.h>
> #include <linux/mm.h>
> #include <linux/nls.h>
> #include <linux/netdevice.h>
> #include <linux/platform_device.h>
> #include <linux/uuid.h>
> #include <linux/crash_dump.h>
> 
> #include "channel_guid.h"
> #include "controlvmchannel.h"
> #include "controlvmcompletionstatus.h"
> #include "guestlinuxdebug.h"
> #include "version.h"

Why do you have this "version.h" file anyway?  It should be deleted, as
parts of it are obviously wrong :(

> #include "visorbus.h"
> #include "visorbus_private.h"
> #include "vmcallinterface.h"

That's loads of "private" .h files, are they all really needed?

> 
> #define CURRENT_FILE_PC VISOR_CHIPSET_PC_visorchipset_main_c

???

> 
> #define MAX_NAME_SIZE 128

name of what?  I don't think you even use this in the file!

> #define MAX_IP_SIZE   50

Huh?  You don't use this either?

> #define MAXOUTSTANDINGCHANNELCOMMAND 256

Here, have a '_', they are free.

But again, I don't see this being used.

> #define POLLJIFFIES_CONTROLVMCHANNEL_FAST   1
> #define POLLJIFFIES_CONTROLVMCHANNEL_SLOW 100

Right-hand alignment?  That's a glutton for punishment.

> 
> #define MAX_CONTROLVM_PAYLOAD_BYTES (1024 * 128)
> 
> #define VISORCHIPSET_MMAP_CONTROLCHANOFFSET   0x00000000
> 
> #define UNISYS_SPAR_LEAF_ID 0x40000000
> 
> /* The s-Par leaf ID returns "UnisysSpar64" encoded across ebx, ecx, edx */
> #define UNISYS_SPAR_ID_EBX 0x73696e55
> #define UNISYS_SPAR_ID_ECX 0x70537379
> #define UNISYS_SPAR_ID_EDX 0x34367261
> 
> /*
>  * Module parameters
>  */
> static int visorchipset_major;

major number should not be a module option, just grab a dynamic one and
run with it please.

> static int visorchipset_visorbusregwait = 1;  /* default is on */

Why even have this option?  Who is going to ever change it?  Why would
they?

> static unsigned long controlvm_payload_bytes_buffered;

Not a module option :(

> static u32 dump_vhba_bus;

Not an option, only ever set, never tested :(

> 
> static int
> visorchipset_open(struct inode *inode, struct file *file)
> {
>       unsigned int minor_number = iminor(inode);
> 
>       if (minor_number)
>               return -ENODEV;

What is this check for?  You only ever want one minor number?  If so,
why do you want a whole major number?

>       file->private_data = NULL;

Isn't this already true?

>       return 0;
> }
> 
> static int
> visorchipset_release(struct inode *inode, struct file *file)
> {
>       return 0;
> }

If you do nothing in a release function, then don't provide it at all.

> 
> /*
>  * When the controlvm channel is idle for at least MIN_IDLE_SECONDS,
>  * we switch to slow polling mode. As soon as we get a controlvm
>  * message, we switch back to fast polling mode.
>  */
> #define MIN_IDLE_SECONDS 10
> static unsigned long poll_jiffies = POLLJIFFIES_CONTROLVMCHANNEL_FAST;
> /* when we got our last controlvm message */
> static unsigned long most_recent_message_jiffies;
> 
> struct parser_context {
>       unsigned long allocbytes;
>       unsigned long param_bytes;
>       u8 *curr;
>       unsigned long bytes_remaining;
>       bool byte_stream;
>       char data[0];
> };
> 
> static struct delayed_work periodic_controlvm_work;
> 
> static struct cdev file_cdev;

Again, it looks like you only want 1 minor number, just use the
miscdevice interface please, not a "full cdev" interface.

> static struct visorchannel **file_controlvm_channel;

This is only ever set to the same thing, why have this at all?

> static struct controlvm_message_packet g_devicechangestate_packet;

Only ever set, never used :(

> static LIST_HEAD(bus_info_list);

Never used.

> static LIST_HEAD(dev_info_list);

Never used.

Why don't these throw build warnings?

> static struct visorchannel *controlvm_channel;

What is this for?  You never seem to set it.

> 
> /* Manages the request payload in the controlvm channel */
> struct visor_controlvm_payload_info {
>       u8 *ptr;                /* pointer to base address of payload pool */
>       u64 offset;             /*
>                                * offset from beginning of controlvm
>                                * channel to beginning of payload * pool
>                                */
>       u32 bytes;              /* number of bytes in payload pool */
> };
> 
> static struct visor_controlvm_payload_info controlvm_payload_info;

As this is already set to 0, no need to set it again way down below in
the file.

> 
> /*
>  * The following globals are used to handle the scenario where we are unable 
> to
>  * offload the payload from a controlvm message due to memory requirements. In
>  * this scenario, we simply stash the controlvm message, then attempt to
>  * process it again the next time controlvm_periodic_work() runs.
>  */
> static struct controlvm_message controlvm_pending_msg;
> static bool controlvm_pending_msg_valid;

These "global" variables seem odd to me, that they aren't "device
specific", but oh well...

> /*
>  * This identifies a data buffer that has been received via a controlvm 
> messages
>  * in a remote --> local CONTROLVM_TRANSMIT_FILE conversation.
>  */
> struct putfile_buffer_entry {
>       struct list_head next;  /* putfile_buffer_entry list */
>       struct parser_context *parser_ctx; /* points to input data buffer */
> };
> 
> /*
>  * List of struct putfile_request *, via next_putfile_request member.
>  * Each entry in this list identifies an outstanding TRANSMIT_FILE
>  * conversation.
>  */
> static LIST_HEAD(putfile_request_list);

Never used???  What am I missing here?

> /*
>  * This describes a buffer and its current state of transfer (e.g., how many
>  * bytes have already been supplied as putfile data, and how many bytes are
>  * remaining) for a putfile_request.
>  */
> struct putfile_active_buffer {
>       /* a payload from a controlvm message, containing a file data buffer */
>       struct parser_context *parser_ctx;
>       /* points within data area of parser_ctx to next byte of data */
>       size_t bytes_remaining;
> };
> 
> #define PUTFILE_REQUEST_SIG 0x0906101302281211

What is that signature for?

> /*
>  * This identifies a single remote --> local CONTROLVM_TRANSMIT_FILE
>  * conversation. Structs of this type are dynamically linked into
>  * <Putfile_request_list>.
>  */
> struct putfile_request {
>       u64 sig;                /* PUTFILE_REQUEST_SIG */
> 
>       /* header from original TransmitFile request */
>       struct controlvm_message_header controlvm_header;
> 
>       /* link to next struct putfile_request */
>       struct list_head next_putfile_request;
> 
>       /*
>        * head of putfile_buffer_entry list, which describes the data to be
>        * supplied as putfile data;
>        * - this list is added to when controlvm messages come in that supply
>        * file data
>        * - this list is removed from via the hotplug program that is actually
>        * consuming these buffers to write as file data
>        */
>       struct list_head input_buffer_list;
>       spinlock_t req_list_lock;       /* lock for input_buffer_list */
> 
>       /* waiters for input_buffer_list to go non-empty */
>       wait_queue_head_t input_buffer_wq;
> 
>       /* data not yet read within current putfile_buffer_entry */
>       struct putfile_active_buffer active_buf;
> 
>       /*
>        * <0 = failed, 0 = in-progress, >0 = successful;
>        * note that this must be set with req_list_lock, and if you set <0,
>        * it is your responsibility to also free up all of the other objects
>        * in this struct (like input_buffer_list, active_buf.parser_ctx)
>        * before releasing the lock
>        */
>       int completion_status;
> };
> 
> struct parahotplug_request {
>       struct list_head list;
>       int id;
>       unsigned long expiration;
>       struct controlvm_message msg;
> };
> 
> static LIST_HEAD(parahotplug_request_list);

Yeah, you use that one!

> static DEFINE_SPINLOCK(parahotplug_request_list_lock);        /* lock for 
> above */
> static void parahotplug_process_list(void);

definition not needed, you define it before you use it.

> 
> /* info for /dev/visorchipset */
> static dev_t major_dev = -1; /*< indicates major num for device */

No, it's not a major number, it's a dev_t, why set it?

And the fact that you are setting it to -1 is odd as you test it for 0
when you start up.

> 
> /* prototypes for attributes */
> static ssize_t toolaction_show(struct device *dev,
>                              struct device_attribute *attr, char *buf);
> static ssize_t toolaction_store(struct device *dev,
>                               struct device_attribute *attr,
>                               const char *buf, size_t count);

Both prototypes not needed.

> static DEVICE_ATTR_RW(toolaction);

Put this below the functions below.

> 
> static ssize_t boottotool_show(struct device *dev,
>                              struct device_attribute *attr, char *buf);
> static ssize_t boottotool_store(struct device *dev,
>                               struct device_attribute *attr, const char *buf,
>                               size_t count);
> static DEVICE_ATTR_RW(boottotool);
> 
> static ssize_t error_show(struct device *dev, struct device_attribute *attr,
>                         char *buf);
> static ssize_t error_store(struct device *dev, struct device_attribute *attr,
>                          const char *buf, size_t count);
> static DEVICE_ATTR_RW(error);
> 
> static ssize_t textid_show(struct device *dev, struct device_attribute *attr,
>                          char *buf);
> static ssize_t textid_store(struct device *dev, struct device_attribute *attr,
>                           const char *buf, size_t count);
> static DEVICE_ATTR_RW(textid);
> 
> static ssize_t remaining_steps_show(struct device *dev,
>                                   struct device_attribute *attr, char *buf);
> static ssize_t remaining_steps_store(struct device *dev,
>                                    struct device_attribute *attr,
>                                    const char *buf, size_t count);
> static DEVICE_ATTR_RW(remaining_steps);
> 
> static ssize_t devicedisabled_store(struct device *dev,
>                                   struct device_attribute *attr,
>                                   const char *buf, size_t count);
> static DEVICE_ATTR_WO(devicedisabled);
> 
> static ssize_t deviceenabled_store(struct device *dev,
>                                  struct device_attribute *attr,
>                                  const char *buf, size_t count);
> static DEVICE_ATTR_WO(deviceenabled);

Odds are all of these prototypes aren't needed, then:

> 
> static struct attribute *visorchipset_install_attrs[] = {
>       &dev_attr_toolaction.attr,
>       &dev_attr_boottotool.attr,
>       &dev_attr_error.attr,
>       &dev_attr_textid.attr,
>       &dev_attr_remaining_steps.attr,
>       NULL
> };

You can move that lower.

> 
> static struct attribute_group visorchipset_install_group = {
>       .name = "install",
>       .attrs = visorchipset_install_attrs
> };
> 
> static struct attribute *visorchipset_parahotplug_attrs[] = {
>       &dev_attr_devicedisabled.attr,
>       &dev_attr_deviceenabled.attr,
>       NULL
> };
> 
> static struct attribute_group visorchipset_parahotplug_group = {
>       .name = "parahotplug",
>       .attrs = visorchipset_parahotplug_attrs
> };
> 
> static const struct attribute_group *visorchipset_dev_groups[] = {
>       &visorchipset_install_group,
>       &visorchipset_parahotplug_group,
>       NULL
> };
> 
> static void visorchipset_dev_release(struct device *dev)
> {
> }

As per the documentation in the Linux kernel source tree, I get to
make fun of you here in public.  You are herby mocked and when I see
this I want to jump up and down and scream.

Do you think that the kernel was giving you a message of "no release
function" for grins and giggles?  No, that code was there to save your
from yourself.  By providing an empty function, did you think that you
were ok in that you outsmarted it?

NO!

I should just delete the whole tree...

ugh.

> 
> /* /sys/devices/platform/visorchipset */
> static struct platform_device visorchipset_platform_device = {
>       .name = "visorchipset",
>       .id = -1,
>       .dev.groups = visorchipset_dev_groups,
>       .dev.release = visorchipset_dev_release,
> };

A static platform device?  why?  This is a dynamic bus, why are you
messing with platform devices?  Damm I hate those things, people abuse
them in horrid ways...

Like this way.

ick.
> 
> /* Function prototypes */
> static void controlvm_respond(struct controlvm_message_header *msg_hdr,
>                             int response);
> static void controlvm_respond_chipset_init(
>               struct controlvm_message_header *msg_hdr, int response,
>               enum ultra_chipset_feature features);
> static void controlvm_respond_physdev_changestate(
>               struct controlvm_message_header *msg_hdr, int response,
>               struct spar_segment_state state);
> 
> static void parser_done(struct parser_context *ctx);

I doubt you need all of those prototypes.

> 
> static struct parser_context *
> parser_init_byte_stream(u64 addr, u32 bytes, bool local, bool *retry)
> {
>       int allocbytes = sizeof(struct parser_context) + bytes;
>       struct parser_context *ctx;
> 
>       if (retry)
>               *retry = false;

Why would retry be NULL?  Doesn't look like you ever have it set that
way, so why check it?

> 
>       /*
>        * alloc an 0 extra byte to ensure payload is
>        * '\0'-terminated
>        */
>       allocbytes++;
>       if ((controlvm_payload_bytes_buffered + bytes)
>           > MAX_CONTROLVM_PAYLOAD_BYTES) {
>               if (retry)
>                       *retry = true;
>               return NULL;
>       }
>       ctx = kzalloc(allocbytes, GFP_KERNEL | __GFP_NORETRY);

Why no retry?

>       if (!ctx) {
>               if (retry)
>                       *retry = true;
>               return NULL;
>       }
> 
>       ctx->allocbytes = allocbytes;
>       ctx->param_bytes = bytes;
>       ctx->curr = NULL;
>       ctx->bytes_remaining = 0;
>       ctx->byte_stream = false;
>       if (local) {
>               void *p;
> 
>               if (addr > virt_to_phys(high_memory - 1))
>                       goto err_finish_ctx;
>               p = __va((unsigned long)(addr));
>               memcpy(ctx->data, p, bytes);

What does "local" mean?

>       } else {
>               void *mapping = memremap(addr, bytes, MEMREMAP_WB);
> 
>               if (!mapping)
>                       goto err_finish_ctx;
>               memcpy(ctx->data, mapping, bytes);
>               memunmap(mapping);
>       }
> 
>       ctx->byte_stream = true;
>       controlvm_payload_bytes_buffered += ctx->param_bytes;
> 
>       return ctx;
> 
> err_finish_ctx:
>       parser_done(ctx);
>       return NULL;
> }
> 
> static uuid_le
> parser_id_get(struct parser_context *ctx)
> {
>       struct spar_controlvm_parameters_header *phdr = NULL;
> 
>       if (!ctx)
>               return NULL_UUID_LE;

How can that ever be true?

>       phdr = (struct spar_controlvm_parameters_header *)(ctx->data);

Why isn't data already this structure?

>       return phdr->id;
> }
> 
> /*
>  * Describes the state from the perspective of which controlvm messages have
>  * been received for a bus or device.
>  */
> 
> enum PARSER_WHICH_STRING {
>       PARSERSTRING_INITIATOR,
>       PARSERSTRING_TARGET,
>       PARSERSTRING_CONNECTION,
>       PARSERSTRING_NAME, /* TODO: only PARSERSTRING_NAME is used ? */
> };
> 
> static void
> parser_param_start(struct parser_context *ctx,
>                  enum PARSER_WHICH_STRING which_string)
> {
>       struct spar_controlvm_parameters_header *phdr = NULL;
> 
>       if (!ctx)
>               return;

How could that happen?

> 
>       phdr = (struct spar_controlvm_parameters_header *)(ctx->data);
>       switch (which_string) {
>       case PARSERSTRING_INITIATOR:
>               ctx->curr = ctx->data + phdr->initiator_offset;
>               ctx->bytes_remaining = phdr->initiator_length;
>               break;
>       case PARSERSTRING_TARGET:
>               ctx->curr = ctx->data + phdr->target_offset;
>               ctx->bytes_remaining = phdr->target_length;
>               break;
>       case PARSERSTRING_CONNECTION:
>               ctx->curr = ctx->data + phdr->connection_offset;
>               ctx->bytes_remaining = phdr->connection_length;
>               break;
>       case PARSERSTRING_NAME:
>               ctx->curr = ctx->data + phdr->name_offset;
>               ctx->bytes_remaining = phdr->name_length;
>               break;

Someone is validating that all of those values are "acceptable", right?

If so, where?  I missed it.  No overflows?  No underflows?  All correct
lengths?  Seems really trusting...

>       default:
>               break;
>       }
> }
> 
> static void parser_done(struct parser_context *ctx)
> {
>       if (!ctx)
>               return;

How can this happen?

>       controlvm_payload_bytes_buffered -= ctx->param_bytes;
>       kfree(ctx);
> }
> 
> static void *
> parser_string_get(struct parser_context *ctx)
> {
>       u8 *pscan;
>       unsigned long nscan;
>       int value_length = -1;
>       void *value = NULL;
>       int i;
> 
>       if (!ctx)
>               return NULL;

Again, how?

>       pscan = ctx->curr;
>       nscan = ctx->bytes_remaining;
>       if (nscan == 0)
>               return NULL;
>       if (!pscan)
>               return NULL;
>       for (i = 0, value_length = -1; i < nscan; i++)
>               if (pscan[i] == '\0') {
>                       value_length = i;
>                       break;
>               }
>       if (value_length < 0)   /* '\0' was not included in the length */
>               value_length = nscan;
>       value = kmalloc(value_length + 1, GFP_KERNEL | __GFP_NORETRY);
>       if (!value)
>               return NULL;
>       if (value_length > 0)
>               memcpy(value, pscan, value_length);
>       ((u8 *)(value))[value_length] = '\0';
>       return value;
> }
> 
> static ssize_t toolaction_show(struct device *dev,
>                              struct device_attribute *attr,
>                              char *buf)
> {
>       u8 tool_action = 0;
> 
>       visorchannel_read(controlvm_channel,
>                         offsetof(struct spar_controlvm_channel_protocol,
>                                  tool_action), &tool_action, sizeof(u8));
>       return scnprintf(buf, PAGE_SIZE, "%u\n", tool_action);

return sprintf(buf, "%u\n", tool_action);
i
You can't ever be larger than a page, no need to pretend you could be.

> }
> 
> static ssize_t toolaction_store(struct device *dev,
>                               struct device_attribute *attr,
>                               const char *buf, size_t count)
> {
>       u8 tool_action;
>       int ret;
> 
>       if (kstrtou8(buf, 10, &tool_action))
>               return -EINVAL;
> 
>       ret = visorchannel_write
>               (controlvm_channel,
>                offsetof(struct spar_controlvm_channel_protocol,
>                         tool_action),
>                &tool_action, sizeof(u8));
> 
>       if (ret)
>               return ret;
>       return count;
> }
> 
> static ssize_t boottotool_show(struct device *dev,
>                              struct device_attribute *attr,
>                              char *buf)
> {
>       struct efi_spar_indication efi_spar_indication;
> 
>       visorchannel_read(controlvm_channel,
>                         offsetof(struct spar_controlvm_channel_protocol,
>                                  efi_spar_ind), &efi_spar_indication,
>                         sizeof(struct efi_spar_indication));
>       return scnprintf(buf, PAGE_SIZE, "%u\n",
>                        efi_spar_indication.boot_to_tool);

same here with scnprintf.  Everywhere else too, I'm not going to point
them all out.

> }
> 
> static ssize_t boottotool_store(struct device *dev,
>                               struct device_attribute *attr,
>                               const char *buf, size_t count)
> {
>       int val, ret;
>       struct efi_spar_indication efi_spar_indication;
> 
>       if (kstrtoint(buf, 10, &val))
>               return -EINVAL;
> 
>       efi_spar_indication.boot_to_tool = val;
>       ret = visorchannel_write
>               (controlvm_channel,
>                offsetof(struct spar_controlvm_channel_protocol,
>                         efi_spar_ind), &(efi_spar_indication),
>                sizeof(struct efi_spar_indication));
> 
>       if (ret)
>               return ret;
>       return count;
> }
> 
> static ssize_t error_show(struct device *dev, struct device_attribute *attr,
>                         char *buf)
> {
>       u32 error = 0;
> 
>       visorchannel_read(controlvm_channel,
>                         offsetof(struct spar_controlvm_channel_protocol,
>                                  installation_error),
>                         &error, sizeof(u32));
>       return scnprintf(buf, PAGE_SIZE, "%i\n", error);
> }
> 
> static ssize_t error_store(struct device *dev, struct device_attribute *attr,
>                          const char *buf, size_t count)
> {
>       u32 error;
>       int ret;
> 
>       if (kstrtou32(buf, 10, &error))
>               return -EINVAL;
> 
>       ret = visorchannel_write
>               (controlvm_channel,
>                offsetof(struct spar_controlvm_channel_protocol,
>                         installation_error),
>                &error, sizeof(u32));
>       if (ret)
>               return ret;
>       return count;
> }
> 
> static ssize_t textid_show(struct device *dev, struct device_attribute *attr,
>                          char *buf)
> {
>       u32 text_id = 0;
> 
>       visorchannel_read
>               (controlvm_channel,
>                offsetof(struct spar_controlvm_channel_protocol,
>                         installation_text_id),
>                &text_id, sizeof(u32));
>       return scnprintf(buf, PAGE_SIZE, "%i\n", text_id);
> }
> 
> static ssize_t textid_store(struct device *dev, struct device_attribute *attr,
>                           const char *buf, size_t count)
> {
>       u32 text_id;
>       int ret;
> 
>       if (kstrtou32(buf, 10, &text_id))
>               return -EINVAL;
> 
>       ret = visorchannel_write
>               (controlvm_channel,
>                offsetof(struct spar_controlvm_channel_protocol,
>                         installation_text_id),
>                &text_id, sizeof(u32));
>       if (ret)
>               return ret;
>       return count;
> }
> 
> static ssize_t remaining_steps_show(struct device *dev,
>                                   struct device_attribute *attr, char *buf)
> {
>       u16 remaining_steps = 0;
> 
>       visorchannel_read(controlvm_channel,
>                         offsetof(struct spar_controlvm_channel_protocol,
>                                  installation_remaining_steps),
>                         &remaining_steps, sizeof(u16));
>       return scnprintf(buf, PAGE_SIZE, "%hu\n", remaining_steps);
> }
> 
> static ssize_t remaining_steps_store(struct device *dev,
>                                    struct device_attribute *attr,
>                                    const char *buf, size_t count)
> {
>       u16 remaining_steps;
>       int ret;
> 
>       if (kstrtou16(buf, 10, &remaining_steps))
>               return -EINVAL;
> 
>       ret = visorchannel_write
>               (controlvm_channel,
>                offsetof(struct spar_controlvm_channel_protocol,
>                         installation_remaining_steps),
>                &remaining_steps, sizeof(u16));
>       if (ret)
>               return ret;
>       return count;
> }
> 
> struct visor_busdev {
>       u32 bus_no;
>       u32 dev_no;
> };
> 
> static int match_visorbus_dev_by_id(struct device *dev, void *data)
> {
>       struct visor_device *vdev = to_visor_device(dev);
>       struct visor_busdev *id = data;
>       u32 bus_no = id->bus_no;
>       u32 dev_no = id->dev_no;
> 
>       if ((vdev->chipset_bus_no == bus_no) &&
>           (vdev->chipset_dev_no == dev_no))
>               return 1;
> 
>       return 0;
> }
> 
> struct visor_device *visorbus_get_device_by_id(u32 bus_no, u32 dev_no,
>                                              struct visor_device *from)
> {
>       struct device *dev;
>       struct device *dev_start = NULL;
>       struct visor_device *vdev = NULL;
>       struct visor_busdev id = {
>                       .bus_no = bus_no,
>                       .dev_no = dev_no
>               };
> 
>       if (from)
>               dev_start = &from->device;
>       dev = bus_find_device(&visorbus_type, dev_start, (void *)&id,
>                             match_visorbus_dev_by_id);
>       if (dev)
>               vdev = to_visor_device(dev);
>       return vdev;
> }
> 
> static void
> chipset_init(struct controlvm_message *inmsg)
> {
>       static int chipset_inited;
>       enum ultra_chipset_feature features = 0;
>       int rc = CONTROLVM_RESP_SUCCESS;

What's with the use of this odd return value in lots of places?

It should just be 0 or a -error number.  Don't mess with odd
non-standard values.  You do that a lot in this file.

> 
>       POSTCODE_LINUX_2(CHIPSET_INIT_ENTRY_PC, POSTCODE_SEVERITY_INFO);

Huh?  What's that macro for?  I doubt it's needed, or if so, why not _1
or _42?

Just use ftrace if you want to trace things, don't use odd stuff like
this.  You aren't special.

>       if (chipset_inited) {
>               rc = -CONTROLVM_RESP_ERROR_ALREADY_DONE;

See, odd errors :(

>               goto out_respond;
>       }
>       chipset_inited = 1;

where's the locking to keep this from being called twice and you
accidentally initializing things twice?

>       POSTCODE_LINUX_2(CHIPSET_INIT_EXIT_PC, POSTCODE_SEVERITY_INFO);
> 
>       /*
>        * Set features to indicate we support parahotplug (if Command
>        * also supports it).
>        */
>       features =
>           inmsg->cmd.init_chipset.
>           features & ULTRA_CHIPSET_FEATURE_PARA_HOTPLUG;

Worse
line
break
of
all
time

Come on...


> 
>       /*
>        * Set the "reply" bit so Command knows this is a
>        * features-aware driver.
>        */
>       features |= ULTRA_CHIPSET_FEATURE_REPLY;
> 
> out_respond:
>       if (inmsg->hdr.flags.response_expected)
>               controlvm_respond_chipset_init(&inmsg->hdr, rc, features);
> }
> 
> static void
> controlvm_init_response(struct controlvm_message *msg,
>                       struct controlvm_message_header *msg_hdr, int response)
> {
>       memset(msg, 0, sizeof(struct controlvm_message));
>       memcpy(&msg->hdr, msg_hdr, sizeof(struct controlvm_message_header));
>       msg->hdr.payload_bytes = 0;
>       msg->hdr.payload_vm_offset = 0;
>       msg->hdr.payload_max_bytes = 0;
>       if (response < 0) {
>               msg->hdr.flags.failed = 1;
>               msg->hdr.completion_status = (u32)(-response);
>       }
> }
> 
> static void
> Billy(struct controlvm_message_header *msg_hdr, int response)

Not John?  Bob?  Sally?  Alice?  Bernise?  Jean?  Molly?

> {
>       struct controlvm_message outmsg;
> 
>       controlvm_init_response(&outmsg, msg_hdr, response);
>       if (outmsg.hdr.flags.test_message == 1)
>               return;
> 
>       if (!visorchannel_signalinsert(controlvm_channel,
>                                      CONTROLVM_QUEUE_REQUEST, &outmsg)) {
>               return;
>       }
> }
> 
> static void
> controlvm_respond_chipset_init(struct controlvm_message_header *msg_hdr,
>                              int response,
>                              enum ultra_chipset_feature features)
> {
>       struct controlvm_message outmsg;
> 
>       controlvm_init_response(&outmsg, msg_hdr, response);
>       outmsg.cmd.init_chipset.features = features;
>       if (!visorchannel_signalinsert(controlvm_channel,
>                                      CONTROLVM_QUEUE_REQUEST, &outmsg)) {
>               return;
>       }

No {} needed.

> }
> 
> static void controlvm_respond_physdev_changestate(
>               struct controlvm_message_header *msg_hdr, int response,
>               struct spar_segment_state state)
> {
>       struct controlvm_message outmsg;
> 
>       controlvm_init_response(&outmsg, msg_hdr, response);
>       outmsg.cmd.device_change_state.state = state;
>       outmsg.cmd.device_change_state.flags.phys_device = 1;
>       if (!visorchannel_signalinsert(controlvm_channel,
>                                      CONTROLVM_QUEUE_REQUEST, &outmsg)) {
>               return;
>       }
> }

Why even check the return value?  No matter what, you return.

Which means any error handling was worthless, why even have that
function return an error?  Something's wrong here.

Same for the function above this one.

> 
> enum crash_obj_type {
>       CRASH_DEV,
>       CRASH_BUS,
> };

Any hints as to what these mean?  I want to ride on the crash_bus,
sounds fun...

> 
> static void
> save_crash_message(struct controlvm_message *msg, enum crash_obj_type typ)
> {
>       u32 local_crash_msg_offset;
>       u16 local_crash_msg_count;
> 
>       if (visorchannel_read(controlvm_channel,
>                             offsetof(struct spar_controlvm_channel_protocol,
>                                      saved_crash_message_count),
>                             &local_crash_msg_count, sizeof(u16)) < 0) {
>               POSTCODE_LINUX_2(CRASH_DEV_CTRL_RD_FAILURE_PC,
>                                POSTCODE_SEVERITY_ERR);

Again with the funky macro...

Damm this is a long file, we haven't even gotten to the code that
originally caused me to get upset at this file.  Lucky for you that I'm
stuck on a plane over the Atlantic with a power cable, no wifi, no
movies to watch, and a bottomless cup of coffee.

Or maybe that's unlucky, your choice...


>               return;
>       }
> 
>       if (local_crash_msg_count != CONTROLVM_CRASHMSG_MAX) {
>               POSTCODE_LINUX_3(CRASH_DEV_COUNT_FAILURE_PC,
>                                local_crash_msg_count,
>                                POSTCODE_SEVERITY_ERR);

Not _4?  What about _5?  :(

>               return;

No errors being returned despite this obviously being an error?

>       }
> 
>       if (visorchannel_read(controlvm_channel,
>                             offsetof(struct spar_controlvm_channel_protocol,
>                                      saved_crash_message_offset),
>                             &local_crash_msg_offset, sizeof(u32)) < 0) {
>               POSTCODE_LINUX_2(CRASH_DEV_CTRL_RD_FAILURE_PC,
>                                POSTCODE_SEVERITY_ERR);
>               return;
>       }
> 
>       if (typ == CRASH_BUS) {

here, have a 'e', it's cheap...


>               if (visorchannel_write(controlvm_channel,
>                                      local_crash_msg_offset,
>                                      msg,
>                                      sizeof(struct controlvm_message)) < 0) {
>                       POSTCODE_LINUX_2(SAVE_MSG_BUS_FAILURE_PC,
>                                        POSTCODE_SEVERITY_ERR);
>                       return;
>               }
>       } else {
>               local_crash_msg_offset += sizeof(struct controlvm_message);
>               if (visorchannel_write(controlvm_channel,
>                                      local_crash_msg_offset,
>                                      msg,
>                                      sizeof(struct controlvm_message)) < 0) {
>                       POSTCODE_LINUX_2(SAVE_MSG_DEV_FAILURE_PC,
>                                        POSTCODE_SEVERITY_ERR);
>                       return;
>               }

So an enum yet you don't check the other value?  case statement perhaps?

>       }

And your error handling sucks, or did I already say that?

> }
> 
> static void
> bus_responder(enum controlvm_id cmd_id,
>             struct controlvm_message_header *pending_msg_hdr,
>             int response)
> {
>       if (!pending_msg_hdr)
>               return;         /* no controlvm response needed */

No comments on the right please.

And how can this happen?  It's not an error?

> 
>       if (pending_msg_hdr->id != (u32)cmd_id)
>               return;
> 
>       controlvm_respond(pending_msg_hdr, response);
> }

Ah, no error handling anyway, so who cares about checking anything?

> 
> static void
> device_changestate_responder(enum controlvm_id cmd_id,
>                            struct visor_device *p, int response,
>                            struct spar_segment_state response_state)
> {
>       struct controlvm_message outmsg;
>       u32 bus_no = p->chipset_bus_no;
>       u32 dev_no = p->chipset_dev_no;
> 
>       if (!p->pending_msg_hdr)
>               return;         /* no controlvm response needed */
>       if (p->pending_msg_hdr->id != cmd_id)
>               return;

again with the right comments and no errors :(

> 
>       controlvm_init_response(&outmsg, p->pending_msg_hdr, response);
> 
>       outmsg.cmd.device_change_state.bus_no = bus_no;
>       outmsg.cmd.device_change_state.dev_no = dev_no;
>       outmsg.cmd.device_change_state.state = response_state;
> 
>       if (!visorchannel_signalinsert(controlvm_channel,
>                                      CONTROLVM_QUEUE_REQUEST, &outmsg))
>               return;

Yet if we don't error, we return anyway?  {sigh}

> }
> 
> static void
> device_responder(enum controlvm_id cmd_id,
>                struct controlvm_message_header *pending_msg_hdr,
>                int response)
> {
>       if (!pending_msg_hdr)
>               return;         /* no controlvm response needed */
> 
>       if (pending_msg_hdr->id != (u32)cmd_id)
>               return;
> 
>       controlvm_respond(pending_msg_hdr, response);
> }

Same as above.

> 
> static void
> bus_epilog(struct visor_device *bus_info,
>          u32 cmd, struct controlvm_message_header *msg_hdr,
>          int response, bool need_response)
> {
>       struct controlvm_message_header *pmsg_hdr = NULL;
> 
>       if (!bus_info) {
>               /*
>                * relying on a valid passed in response code
>                * be lazy and re-use msg_hdr for this failure, is this ok??
>                */

If you don't know then I will say NO!

Fix it.

>               pmsg_hdr = msg_hdr;
>               goto out_respond;
>       }
> 
>       if (bus_info->pending_msg_hdr) {
>               /* only non-NULL if dev is still waiting on a response */
>               response = -CONTROLVM_RESP_ERROR_MESSAGE_ID_INVALID_FOR_CLIENT;

I_LIKE_LONG_ERROR_MESSAGE_DEFINES_AND_I_CAN_NOT_LIE

Ok, I lie...

>               pmsg_hdr = bus_info->pending_msg_hdr;
>               goto out_respond;
>       }
> 
>       if (need_response) {
>               pmsg_hdr = kzalloc(sizeof(*pmsg_hdr), GFP_KERNEL);
>               if (!pmsg_hdr) {
>                       POSTCODE_LINUX_4(MALLOC_FAILURE_PC, cmd,
>                                        bus_info->chipset_bus_no,
>                                        POSTCODE_SEVERITY_ERR);
>                       return;

Wait!  You just ran out of memory!  And you don't care???

Your poor users, they really liked that data that you just dropped on
the floor, too bad they will never see it again...

>               }
> 
>               memcpy(pmsg_hdr, msg_hdr,
>                      sizeof(struct controlvm_message_header));
>               bus_info->pending_msg_hdr = pmsg_hdr;
>       }
> 
>       if (response == CONTROLVM_RESP_SUCCESS) {
>               switch (cmd) {
>               case CONTROLVM_BUS_CREATE:
>                       chipset_bus_create(bus_info);
>                       break;
>               case CONTROLVM_BUS_DESTROY:
>                       chipset_bus_destroy(bus_info);
>                       break;

default?

>               }
>       }
> 
> out_respond:
>       bus_responder(cmd, pmsg_hdr, response);

error handling?  I'm a broken record, I know.  But really, your lack of
robustness is amazing for something that is supposed to be relied on.

> }
> 
> static void
> device_epilog(struct visor_device *dev_info,
>             struct spar_segment_state state, u32 cmd,
>             struct controlvm_message_header *msg_hdr, int response,
>             bool need_response, bool for_visorbus)
> {
>       struct controlvm_message_header *pmsg_hdr = NULL;
> 
>       if (!dev_info) {
>               /*
>                * relying on a valid passed in response code
>                * be lazy and re-use msg_hdr for this failure, is this ok??

Again NO!!

Or yes.

You pick, don't be indecisive.  Or do be, but fake it...

>                */
>               pmsg_hdr = msg_hdr;
>               goto out_respond;
>       }
> 
>       if (dev_info->pending_msg_hdr) {
>               /* only non-NULL if dev is still waiting on a response */
>               response = -CONTROLVM_RESP_ERROR_MESSAGE_ID_INVALID_FOR_CLIENT;
>               pmsg_hdr = dev_info->pending_msg_hdr;
>               goto out_respond;
>       }
> 
>       if (need_response) {
>               pmsg_hdr = kzalloc(sizeof(*pmsg_hdr), GFP_KERNEL);
>               if (!pmsg_hdr) {
>                       response = -CONTROLVM_RESP_ERROR_KMALLOC_FAILED;

-ENOMEM.

See, saved you a bunch of characters.  You're welcome.

>                       goto out_respond;
>               }
> 
>               memcpy(pmsg_hdr, msg_hdr,
>                      sizeof(struct controlvm_message_header));
>               dev_info->pending_msg_hdr = pmsg_hdr;
>       }
> 
>       if (response >= 0) {
>               switch (cmd) {
>               case CONTROLVM_DEVICE_CREATE:
>                       chipset_device_create(dev_info);
>                       break;
>               case CONTROLVM_DEVICE_CHANGESTATE:
>                       /* ServerReady / ServerRunning / SegmentStateRunning */
>                       if (state.alive == segment_state_running.alive &&
>                           state.operating ==
>                               segment_state_running.operating) {
>                               chipset_device_resume(dev_info);
>                       }
>                       /* ServerNotReady / ServerLost / SegmentStateStandby */
>                       else if (state.alive == segment_state_standby.alive &&
>                                state.operating ==
>                                segment_state_standby.operating) {
>                               /*
>                                * technically this is standby case
>                                * where server is lost
>                                */
>                               chipset_device_pause(dev_info);
>                       }
>                       break;
>               case CONTROLVM_DEVICE_DESTROY:
>                       chipset_device_destroy(dev_info);
>                       break;
>               }
>       }
> 
> out_respond:
>       device_responder(cmd, pmsg_hdr, response);

Wait!  We ran out of memory!  but we didn't do anything about it :(

{sniff}

> }
> 
> static void
> bus_create(struct controlvm_message *inmsg)
> {
>       struct controlvm_message_packet *cmd = &inmsg->cmd;
>       u32 bus_no = cmd->create_bus.bus_no;
>       int rc = CONTROLVM_RESP_SUCCESS;
>       struct visor_device *bus_info;
>       struct visorchannel *visorchannel;
> 
>       bus_info = visorbus_get_device_by_id(bus_no, BUS_ROOT_DEVICE, NULL);
>       if (bus_info && (bus_info->state.created == 1)) {
>               POSTCODE_LINUX_3(BUS_CREATE_FAILURE_PC, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_ALREADY_DONE;
>               goto out_bus_epilog;
>       }
>       bus_info = kzalloc(sizeof(*bus_info), GFP_KERNEL);
>       if (!bus_info) {
>               POSTCODE_LINUX_3(BUS_CREATE_FAILURE_PC, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_KMALLOC_FAILED;
>               goto out_bus_epilog;

Same as above, at least you are consistant, that should count for
something...


>       }
> 
>       INIT_LIST_HEAD(&bus_info->list_all);
>       bus_info->chipset_bus_no = bus_no;
>       bus_info->chipset_dev_no = BUS_ROOT_DEVICE;
> 
>       POSTCODE_LINUX_3(BUS_CREATE_ENTRY_PC, bus_no, POSTCODE_SEVERITY_INFO);
> 
>       visorchannel = visorchannel_create(cmd->create_bus.channel_addr,
>                                          cmd->create_bus.channel_bytes,
>                                          GFP_KERNEL,
>                                          cmd->create_bus.bus_data_type_uuid);
> 
>       if (!visorchannel) {
>               POSTCODE_LINUX_3(BUS_CREATE_FAILURE_PC, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_KMALLOC_FAILED;
>               kfree(bus_info);
>               bus_info = NULL;
>               goto out_bus_epilog;
>       }
>       bus_info->visorchannel = visorchannel;
>       if (uuid_le_cmp(cmd->create_bus.bus_inst_uuid, spar_siovm_uuid) == 0) {
>               dump_vhba_bus = bus_no;
>               save_crash_message(inmsg, CRASH_BUS);
>       }
> 
>       POSTCODE_LINUX_3(BUS_CREATE_EXIT_PC, bus_no, POSTCODE_SEVERITY_INFO);
> 
> out_bus_epilog:
>       bus_epilog(bus_info, CONTROLVM_BUS_CREATE, &inmsg->hdr,
>                  rc, inmsg->hdr.flags.response_expected == 1);
> }
> 
> static void
> bus_destroy(struct controlvm_message *inmsg)
> {
>       struct controlvm_message_packet *cmd = &inmsg->cmd;
>       u32 bus_no = cmd->destroy_bus.bus_no;
>       struct visor_device *bus_info;
>       int rc = CONTROLVM_RESP_SUCCESS;
> 
>       bus_info = visorbus_get_device_by_id(bus_no, BUS_ROOT_DEVICE, NULL);
>       if (!bus_info)
>               rc = -CONTROLVM_RESP_ERROR_BUS_INVALID;

How can this happen?

And again with the funky errors, just delete them all, this is the
kernel, use the ones we have, don't make driver/subsystem-specific ones.


>       else if (bus_info->state.created == 0)
>               rc = -CONTROLVM_RESP_ERROR_ALREADY_DONE;
> 
>       bus_epilog(bus_info, CONTROLVM_BUS_DESTROY, &inmsg->hdr,
>                  rc, inmsg->hdr.flags.response_expected == 1);
> 
>       /* bus_info is freed as part of the busdevice_release function */

Really?  See, you do know what a release function is for.  Why fake it
up above?  Perhaps you were doing something you shouldn't have been
doing, nice try...

> }
> 
> static void
> bus_configure(struct controlvm_message *inmsg,
>             struct parser_context *parser_ctx)
> {
>       struct controlvm_message_packet *cmd = &inmsg->cmd;
>       u32 bus_no;
>       struct visor_device *bus_info;
>       int rc = CONTROLVM_RESP_SUCCESS;
> 
>       bus_no = cmd->configure_bus.bus_no;
>       POSTCODE_LINUX_3(BUS_CONFIGURE_ENTRY_PC, bus_no,
>                        POSTCODE_SEVERITY_INFO);

Ick...

> 
>       bus_info = visorbus_get_device_by_id(bus_no, BUS_ROOT_DEVICE, NULL);
>       if (!bus_info) {
>               POSTCODE_LINUX_3(BUS_CONFIGURE_FAILURE_PC, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_BUS_INVALID;
>       } else if (bus_info->state.created == 0) {
>               POSTCODE_LINUX_3(BUS_CONFIGURE_FAILURE_PC, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_BUS_INVALID;
>       } else if (bus_info->pending_msg_hdr) {
>               POSTCODE_LINUX_3(BUS_CONFIGURE_FAILURE_PC, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_MESSAGE_ID_INVALID_FOR_CLIENT;

This is just a bad dream, right?

>       } else {
>               visorchannel_set_clientpartition
>                       (bus_info->visorchannel,
>                        cmd->configure_bus.guest_handle);
>               bus_info->partition_uuid = parser_id_get(parser_ctx);
>               parser_param_start(parser_ctx, PARSERSTRING_NAME);
>               bus_info->name = parser_string_get(parser_ctx);
> 
>               POSTCODE_LINUX_3(BUS_CONFIGURE_EXIT_PC, bus_no,
>                                POSTCODE_SEVERITY_INFO);

This is the "real" code, right?  If os, then you should have errored out
above, this should be shifted one tab left, and then it will be
"obvious" what is happening.  right now this looks like the final error
case, which is horrid logic.


>       }
>       bus_epilog(bus_info, CONTROLVM_BUS_CONFIGURE, &inmsg->hdr,
>                  rc, inmsg->hdr.flags.response_expected == 1);

I like errors, too bad you ate them all.

> }
> 
> static void
> my_device_create(struct controlvm_message *inmsg)
> {
>       struct controlvm_message_packet *cmd = &inmsg->cmd;
>       u32 bus_no = cmd->create_device.bus_no;
>       u32 dev_no = cmd->create_device.dev_no;
>       struct visor_device *dev_info = NULL;
>       struct visor_device *bus_info;
>       struct visorchannel *visorchannel;
>       int rc = CONTROLVM_RESP_SUCCESS;
> 
>       bus_info = visorbus_get_device_by_id(bus_no, BUS_ROOT_DEVICE, NULL);
>       if (!bus_info) {
>               POSTCODE_LINUX_4(DEVICE_CREATE_FAILURE_PC, dev_no, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_BUS_INVALID;
>               goto out_respond;
>       }
> 
>       if (bus_info->state.created == 0) {
>               POSTCODE_LINUX_4(DEVICE_CREATE_FAILURE_PC, dev_no, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_BUS_INVALID;
>               goto out_respond;

See, this is better, fix the function above to look a bit like this (fix
the error codes, and the looney tracing mess.

>       }
> 
>       dev_info = visorbus_get_device_by_id(bus_no, dev_no, NULL);
>       if (dev_info && (dev_info->state.created == 1)) {
>               POSTCODE_LINUX_4(DEVICE_CREATE_FAILURE_PC, dev_no, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_ALREADY_DONE;
>               goto out_respond;
>       }
> 
>       dev_info = kzalloc(sizeof(*dev_info), GFP_KERNEL);
>       if (!dev_info) {
>               POSTCODE_LINUX_4(DEVICE_CREATE_FAILURE_PC, dev_no, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_KMALLOC_FAILED;
>               goto out_respond;

But it's an ERROR!!!  {sigh}

>       }
> 
>       dev_info->chipset_bus_no = bus_no;
>       dev_info->chipset_dev_no = dev_no;
>       dev_info->inst = cmd->create_device.dev_inst_uuid;
> 
>       /* not sure where the best place to set the 'parent' */
>       dev_info->device.parent = &bus_info->device;
> 
>       POSTCODE_LINUX_4(DEVICE_CREATE_ENTRY_PC, dev_no, bus_no,
>                        POSTCODE_SEVERITY_INFO);
> 
>       visorchannel =
>              visorchannel_create_with_lock(cmd->create_device.channel_addr,
>                                            cmd->create_device.channel_bytes,
>                                            GFP_KERNEL,
>                                            cmd->create_device.data_type_uuid);
> 
>       if (!visorchannel) {
>               POSTCODE_LINUX_4(DEVICE_CREATE_FAILURE_PC, dev_no, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_KMALLOC_FAILED;
>               kfree(dev_info);
>               dev_info = NULL;
>               goto out_respond;
>       }
>       dev_info->visorchannel = visorchannel;
>       dev_info->channel_type_guid = cmd->create_device.data_type_uuid;
>       if (uuid_le_cmp(cmd->create_device.data_type_uuid,
>                       spar_vhba_channel_protocol_uuid) == 0)
>               save_crash_message(inmsg, CRASH_DEV);
> 
>       POSTCODE_LINUX_4(DEVICE_CREATE_EXIT_PC, dev_no, bus_no,
>                        POSTCODE_SEVERITY_INFO);
> out_respond:
>       device_epilog(dev_info, segment_state_running,
>                     CONTROLVM_DEVICE_CREATE, &inmsg->hdr, rc,
>                     inmsg->hdr.flags.response_expected == 1, 1);
> }
> 
> static void

You really don't like returning errors, should I just stop now?  I'm
just over half way done with the file, and still I'm not at the
functions that made me want to respond to this code...

Oh look, 3 more hours left in my flight. I guess it's either this code
review or rewatching the Lego movie for the zillionth time.  Tempting...

> my_device_changestate(struct controlvm_message *inmsg)
> {
>       struct controlvm_message_packet *cmd = &inmsg->cmd;
>       u32 bus_no = cmd->device_change_state.bus_no;
>       u32 dev_no = cmd->device_change_state.dev_no;
>       struct spar_segment_state state = cmd->device_change_state.state;
>       struct visor_device *dev_info;
>       int rc = CONTROLVM_RESP_SUCCESS;
> 
>       dev_info = visorbus_get_device_by_id(bus_no, dev_no, NULL);
>       if (!dev_info) {
>               POSTCODE_LINUX_4(DEVICE_CHANGESTATE_FAILURE_PC, dev_no, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_DEVICE_INVALID;
>       } else if (dev_info->state.created == 0) {
>               POSTCODE_LINUX_4(DEVICE_CHANGESTATE_FAILURE_PC, dev_no, bus_no,
>                                POSTCODE_SEVERITY_ERR);
>               rc = -CONTROLVM_RESP_ERROR_DEVICE_INVALID;
>       }
>       if ((rc >= CONTROLVM_RESP_SUCCESS) && dev_info)
>               device_epilog(dev_info, state,
>                             CONTROLVM_DEVICE_CHANGESTATE, &inmsg->hdr, rc,
>                             inmsg->hdr.flags.response_expected == 1, 1);

You know that I'm going to say...

> }
> 
> static void
> my_device_destroy(struct controlvm_message *inmsg)
> {
>       struct controlvm_message_packet *cmd = &inmsg->cmd;
>       u32 bus_no = cmd->destroy_device.bus_no;
>       u32 dev_no = cmd->destroy_device.dev_no;
>       struct visor_device *dev_info;
>       int rc = CONTROLVM_RESP_SUCCESS;
> 
>       dev_info = visorbus_get_device_by_id(bus_no, dev_no, NULL);
>       if (!dev_info)
>               rc = -CONTROLVM_RESP_ERROR_DEVICE_INVALID;
>       else if (dev_info->state.created == 0)
>               rc = -CONTROLVM_RESP_ERROR_ALREADY_DONE;
> 
>       if ((rc >= CONTROLVM_RESP_SUCCESS) && dev_info)
>               device_epilog(dev_info, segment_state_running,
>                             CONTROLVM_DEVICE_DESTROY, &inmsg->hdr, rc,
>                             inmsg->hdr.flags.response_expected == 1, 1);

same here.

> }
> 
> /**
>  * initialize_controlvm_payload_info() - init controlvm_payload_info struct

kerneldoc for a static function?  Ick, no.

>  * @phys_addr: the physical address of controlvm channel
>  * @offset:    the offset to payload
>  * @bytes:     the size of the payload in bytes
>  * @info:      the returning valid struct
>  *
>  * When provided with the physical address of the controlvm channel
>  * (phys_addr), the offset to the payload area we need to manage
>  * (offset), and the size of this payload area (bytes), fills in the
>  * controlvm_payload_info struct.
>  *
>  * Return: CONTROLVM_RESP_SUCCESS for success or a negative for failure
>  */
> static int

Yeah, you DO know that an error should be returned.

Too bad you overloaded '0' with a crazy #define, but I guess it's better
than the other functions above...

> initialize_controlvm_payload_info(u64 phys_addr, u64 offset, u32 bytes,
>                                 struct visor_controlvm_payload_info *info)
> {
>       u8 *payload = NULL;
> 
>       if (!info)
>               return -CONTROLVM_RESP_ERROR_PAYLOAD_INVALID;

How can this be true?

> 
>       memset(info, 0, sizeof(struct visor_controlvm_payload_info));
>       if ((offset == 0) || (bytes == 0))
>               return -CONTROLVM_RESP_ERROR_PAYLOAD_INVALID;

why not check before setting everything to 0?

> 
>       payload = memremap(phys_addr + offset, bytes, MEMREMAP_WB);

You don't test for too big numbers?  That seems ripe for abuse, I guess
you like CVEs being assigned to your driver?  I personally hate them...

>       if (!payload)
>               return -CONTROLVM_RESP_ERROR_IOREMAP_FAILED;

Again with the funky error codes, I'll just stop saying them now, please
delete all of them and use the standard kernel errors in all of the
visorbus codebase.

> 
>       info->offset = offset;
>       info->bytes = bytes;
>       info->ptr = payload;
> 
>       return CONTROLVM_RESP_SUCCESS;

0  {sniff}

> }
> 
> static void
> destroy_controlvm_payload_info(struct visor_controlvm_payload_info *info)
> {
>       if (info->ptr) {

How can that happen?

>               memunmap(info->ptr);
>               info->ptr = NULL;
>       }
>       memset(info, 0, sizeof(struct visor_controlvm_payload_info));

Why 0 everything out?

> }
> 
> static void
> initialize_controlvm_payload(void)
> {
>       u64 phys_addr = visorchannel_get_physaddr(controlvm_channel);
>       u64 payload_offset = 0;
>       u32 payload_bytes = 0;
> 
>       if (visorchannel_read(controlvm_channel,
>                             offsetof(struct spar_controlvm_channel_protocol,
>                                      request_payload_offset),
>                             &payload_offset, sizeof(payload_offset)) < 0) {
>               POSTCODE_LINUX_2(CONTROLVM_INIT_FAILURE_PC,
>                                POSTCODE_SEVERITY_ERR);
>               return;

No errors :(

>       }
>       if (visorchannel_read(controlvm_channel,
>                             offsetof(struct spar_controlvm_channel_protocol,
>                                      request_payload_bytes),
>                             &payload_bytes, sizeof(payload_bytes)) < 0) {
>               POSTCODE_LINUX_2(CONTROLVM_INIT_FAILURE_PC,
>                                POSTCODE_SEVERITY_ERR);
>               return;
>       }
>       initialize_controlvm_payload_info(phys_addr,
>                                         payload_offset, payload_bytes,
>                                         &controlvm_payload_info);
> }
> 
> /**
>  * visorchipset_chipset_ready() - sends chipset_ready action
>  *
>  * Send ACTION=online for DEVPATH=/sys/devices/platform/visorchipset.
>  *
>  * Return: CONTROLVM_RESP_SUCCESS
>  */
> static int
> visorchipset_chipset_ready(void)
> {
>       kobject_uevent(&visorchipset_platform_device.dev.kobj, KOBJ_ONLINE);
>       return CONTROLVM_RESP_SUCCESS;

Ah, finally!  Here we are at the code I was wondering about.

Why?  Why KOBJ_ONLINE?  What does this tell userspace?

> }
> 
> static int
> visorchipset_chipset_selftest(void)
> {
>       char env_selftest[20];
>       char *envp[] = { env_selftest, NULL };
> 
>       sprintf(env_selftest, "SPARSP_SELFTEST=%d", 1);
>       kobject_uevent_env(&visorchipset_platform_device.dev.kobj, KOBJ_CHANGE,
>                          envp);

What changed?  a selftest caused a sysfs file change?  What file?  Who
wants this?  This looks really odd.

>       return CONTROLVM_RESP_SUCCESS;

And 0, really.  You should have noticed that the whole kernel does this,
you aren't special.  Well, you are, we all are special and unique, just
like everyone else...

> }
> 
> /**
>  * visorchipset_chipset_notready() - sends chipset_notready action
>  *
>  * Send ACTION=offline for DEVPATH=/sys/devices/platform/visorchipset.
>  *
>  * Return: CONTROLVM_RESP_SUCCESS

kerndoc for a static function? No.

>  */
> static int
> visorchipset_chipset_notready(void)
> {
>       kobject_uevent(&visorchipset_platform_device.dev.kobj, KOBJ_OFFLINE);

And the kerndoc is toally wrong compared to the code :(

You were better off not documenting it, it would have at least done what
you said it would do...

Again, why OFFLINE?  Who wants this?  Who needs this?  What just went
on/off line?

>       return CONTROLVM_RESP_SUCCESS;

0.  Come on...

> }
> 
> static void
> chipset_ready(struct controlvm_message_header *msg_hdr)
> {
>       int rc = visorchipset_chipset_ready();
> 
>       if (rc != CONTROLVM_RESP_SUCCESS)
>               rc = -rc;

That's a sign you did something wrong in your error handling if you have
to negate a number...

>       if (msg_hdr->flags.response_expected)
>               controlvm_respond(msg_hdr, rc);

But you don't propagate it up?  I guess you are always ready, even if
your hardware said it wasn't.

And your kobject function names stink, make them mean what they do.

> }
> 
> static void
> chipset_selftest(struct controlvm_message_header *msg_hdr)
> {
>       int rc = visorchipset_chipset_selftest();

No, you didn't test anything, you just told userspace that something
changed.

> 
>       if (rc != CONTROLVM_RESP_SUCCESS)
>               rc = -rc;

Again...

>       if (msg_hdr->flags.response_expected)
>               controlvm_respond(msg_hdr, rc);

and you ignore it...
> }
> 
> static void
> chipset_notready(struct controlvm_message_header *msg_hdr)
> {
>       int rc = visorchipset_chipset_notready();

It's not?  You just told userspace something, not the chipset :(

> 
>       if (rc != CONTROLVM_RESP_SUCCESS)
>               rc = -rc;
>       if (msg_hdr->flags.response_expected)
>               controlvm_respond(msg_hdr, rc);

again...

> }
> 
> /**
>  * read_controlvm_event() - retreives the next message from the
>  *                          CONTROLVM_QUEUE_EVENT queue in the controlvm
>  *                          channel
>  * @msg: pointer to the retrieved message
>  *
>  * Return: true if a valid message was retrieved or false otherwise
>  */

kerndoc :(

> static bool
> read_controlvm_event(struct controlvm_message *msg)
> {
>       if (visorchannel_signalremove(controlvm_channel,
>                                     CONTROLVM_QUEUE_EVENT, msg)) {
>               /* got a message */
>               if (msg->hdr.flags.test_message == 1)
>                       return false;
>               return true;
>       }
>       return false;

I can't tell what is supposed to happen here.  Make it obvious what the
"good" codepath is, don't bound it by errors that look like all was
good.


> }
> 
> /*
>  * The general parahotplug flow works as follows. The visorchipset
>  * driver receives a DEVICE_CHANGESTATE message from Command
>  * specifying a physical device to enable or disable. The CONTROLVM
>  * message handler calls parahotplug_process_message, which then adds
>  * the message to a global list and kicks off a udev event which
>  * causes a user level script to enable or disable the specified
>  * device. The udev script then writes to
>  * /proc/visorchipset/parahotplug, which causes parahotplug_proc_write
>  * to get called, at which point the appropriate CONTROLVM message is
>  * retrieved from the list and responded to.
>  */
> 
> #define PARAHOTPLUG_TIMEOUT_MS 2000
> 
> /**
>  * parahotplug_next_id() - generate unique int to match an outstanding 
> CONTROLVM
>  *                         message with a udev script /proc response

"udev script /proc response"?

What does that mean?

udev hates /proc, and so should you...

>  *
>  * Return: a unique integer value
>  */
> static int
> parahotplug_next_id(void)
> {
>       static atomic_t id = ATOMIC_INIT(0);
> 
>       return atomic_inc_return(&id);

Hm, where's your locking?

And why not an idr?  this is just going to count up?  Why?

> }
> 
> /**
>  * parahotplug_next_expiration() - returns the time (in jiffies) when a
>  *                                 CONTROLVM message on the list should expire
>  *                                 -- PARAHOTPLUG_TIMEOUT_MS in the future
>  *
>  * Return: expected expiration time (in jiffies)
>  */
> static unsigned long
> parahotplug_next_expiration(void)
> {
>       return jiffies + msecs_to_jiffies(PARAHOTPLUG_TIMEOUT_MS);

I feel like this is not going to do good things if you wrap jiffies, are
you sure it's ok?

> }
> 
> /**
>  * parahotplug_request_create() - create a parahotplug_request, which is
>  *                                basically a wrapper for a CONTROLVM_MESSAGE
>  *                                that we can stick on a list
>  * @msg: the message to insert in the request
>  *
>  * Return: the request containing the provided message
>  */
> static struct parahotplug_request *
> parahotplug_request_create(struct controlvm_message *msg)
> {
>       struct parahotplug_request *req;
> 
>       req = kmalloc(sizeof(*req), GFP_KERNEL | __GFP_NORETRY);

Why not retry?

>       if (!req)
>               return NULL;
> 
>       req->id = parahotplug_next_id();
>       req->expiration = parahotplug_next_expiration();
>       req->msg = *msg;
> 
>       return req;
> }
> 
> /**
>  * parahotplug_request_destroy() - free a parahotplug_request
>  * @req: the request to deallocate
>  */
> static void
> parahotplug_request_destroy(struct parahotplug_request *req)
> {
>       kfree(req);
> }
> 
> /**
>  * parahotplug_request_kickoff() - initiate parahotplug request
>  * @req: the request to initiate
>  *
>  * Cause uevent to run the user level script to do the disable/enable 
> specified
>  * in the parahotplug_request.
>  */
> static void
> parahotplug_request_kickoff(struct parahotplug_request *req)
> {
>       struct controlvm_message_packet *cmd = &req->msg.cmd;
>       char env_cmd[40], env_id[40], env_state[40], env_bus[40], env_dev[40],
>           env_func[40];
>       char *envp[] = {
>               env_cmd, env_id, env_state, env_bus, env_dev, env_func, NULL
>       };
> 
>       sprintf(env_cmd, "SPAR_PARAHOTPLUG=1");

Why have a env variable that is always the same thing?  Kind of
pointless don't you think?

>       sprintf(env_id, "SPAR_PARAHOTPLUG_ID=%d", req->id);
>       sprintf(env_state, "SPAR_PARAHOTPLUG_STATE=%d",
>               cmd->device_change_state.state.active);
>       sprintf(env_bus, "SPAR_PARAHOTPLUG_BUS=%d",
>               cmd->device_change_state.bus_no);
>       sprintf(env_dev, "SPAR_PARAHOTPLUG_DEVICE=%d",
>               cmd->device_change_state.dev_no >> 3);
>       sprintf(env_func, "SPAR_PARAHOTPLUG_FUNCTION=%d",
>               cmd->device_change_state.dev_no & 0x7);
> 
>       kobject_uevent_env(&visorchipset_platform_device.dev.kobj, KOBJ_CHANGE,
>                          envp);

But did it really change?  Are you abusing sysfs here for odd things?
Who is listening to this?  For what?

> }
> 
> /**
>  * parahotplug_process_list() - remove any request from the list that's been 
> on
>  *                              there too long and respond with an error
>  */
> static void
> parahotplug_process_list(void)
> {
>       struct list_head *pos;
>       struct list_head *tmp;
> 
>       spin_lock(&parahotplug_request_list_lock);
> 
>       list_for_each_safe(pos, tmp, &parahotplug_request_list) {
>               struct parahotplug_request *req =
>                   list_entry(pos, struct parahotplug_request, list);
> 
>               if (!time_after_eq(jiffies, req->expiration))
>                       continue;
> 
>               list_del(pos);
>               if (req->msg.hdr.flags.response_expected)
>                       controlvm_respond_physdev_changestate(
>                               &req->msg.hdr,
>                               CONTROLVM_RESP_ERROR_DEVICE_UDEV_TIMEOUT,
>                               req->msg.cmd.device_change_state.state);
>               parahotplug_request_destroy(req);
>       }
> 
>       spin_unlock(&parahotplug_request_list_lock);
> }
> 
> /**
>  * parahotplug_request_complete() - mark request as complete
>  * @id:     the id of the request
>  * @active: indicates whether the request is assigned to active partition
>  *
>  * Called from the /proc handler, which means the user script has
>  * finished the enable/disable. Find the matching identifier, and
>  * respond to the CONTROLVM message with success.
>  *
>  * Return: 0 on success or -EINVAL on failure
>  */
> static int
> parahotplug_request_complete(int id, u16 active)
> {
>       struct list_head *pos;
>       struct list_head *tmp;
> 
>       spin_lock(&parahotplug_request_list_lock);
> 
>       /* Look for a request matching "id". */
>       list_for_each_safe(pos, tmp, &parahotplug_request_list) {
>               struct parahotplug_request *req =
>                   list_entry(pos, struct parahotplug_request, list);
>               if (req->id == id) {
>                       /*
>                        * Found a match. Remove it from the list and
>                        * respond.
>                        */
>                       list_del(pos);
>                       spin_unlock(&parahotplug_request_list_lock);
>                       req->msg.cmd.device_change_state.state.active = active;
>                       if (req->msg.hdr.flags.response_expected)
>                               controlvm_respond_physdev_changestate(
>                                       &req->msg.hdr, CONTROLVM_RESP_SUCCESS,
>                                       req->msg.cmd.device_change_state.state);
>                       parahotplug_request_destroy(req);
>                       return 0;
>               }
>       }
> 
>       spin_unlock(&parahotplug_request_list_lock);
>       return -EINVAL;
> }
> 
> /**
>  * parahotplug_process_message() - enables or disables a PCI device by kicking
>  *                                 off a udev script
>  * @inmsg: the message indicating whether to enable or disable
>  */
> static void
> parahotplug_process_message(struct controlvm_message *inmsg)
> {
>       struct parahotplug_request *req;
> 
>       req = parahotplug_request_create(inmsg);
> 

No blank line needed.

>       if (!req)
>               return;

No error?

> 
>       if (inmsg->cmd.device_change_state.state.active) {
>               /*
>                * For enable messages, just respond with success
>                * right away. This is a bit of a hack, but there are
>                * issues with the early enable messages we get (with
>                * either the udev script not detecting that the device
>                * is up, or not getting called at all). Fortunately
>                * the messages that get lost don't matter anyway, as
>                *
>                * devices are automatically enabled at
>                * initialization.

Blank line in the comment?

And don't have a hack, fix it right please.

>               */
>               parahotplug_request_kickoff(req);
>               controlvm_respond_physdev_changestate
>                       (&inmsg->hdr,
>                        CONTROLVM_RESP_SUCCESS,
>                        inmsg->cmd.device_change_state.state);
>               parahotplug_request_destroy(req);
>       } else {
>               /*
>                * For disable messages, add the request to the
>                * request list before kicking off the udev script. It
>                * won't get responded to until the script has
>                * indicated it's done.
>                */
>               spin_lock(&parahotplug_request_list_lock);
>               list_add_tail(&req->list, &parahotplug_request_list);
>               spin_unlock(&parahotplug_request_list_lock);
> 
>               parahotplug_request_kickoff(req);
>       }
> }
> 
> /**
>  * handle_command() - process a controlvm message
>  * @inmsg:        the message to process
>  * @channel_addr: address of the controlvm channel
>  *
>  * Return:
>  *    false - this function will return false only in the case where the
>  *            controlvm message was NOT processed, but processing must be
>  *            retried before reading the next controlvm message; a
>  *            scenario where this can occur is when we need to throttle
>  *            the allocation of memory in which to copy out controlvm
>  *            payload data
>  *    true  - processing of the controlvm message completed,
>  *            either successfully or with an error
>  */
> static bool
> handle_command(struct controlvm_message inmsg, u64 channel_addr)
> {
>       struct controlvm_message_packet *cmd = &inmsg.cmd;
>       u64 parm_addr;
>       u32 parm_bytes;
>       struct parser_context *parser_ctx = NULL;
>       bool local_addr;
>       struct controlvm_message ackmsg;
> 
>       /* create parsing context if necessary */
>       local_addr = (inmsg.hdr.flags.test_message == 1);
>       if (channel_addr == 0)
>               return true;

Why not check this before setting local_addr?

>       parm_addr = channel_addr + inmsg.hdr.payload_vm_offset;
>       parm_bytes = inmsg.hdr.payload_bytes;
> 
>       /*
>        * Parameter and channel addresses within test messages actually lie
>        * within our OS-controlled memory. We need to know that, because it
>        * makes a difference in how we compute the virtual address.
>        */
>       if (parm_addr && parm_bytes) {
>               bool retry = false;
> 
>               parser_ctx =
>                   parser_init_byte_stream(parm_addr, parm_bytes,
>                                           local_addr, &retry);
>               if (!parser_ctx && retry)
>                       return false;
>       }
> 
>       if (!local_addr) {
>               controlvm_init_response(&ackmsg, &inmsg.hdr,
>                                       CONTROLVM_RESP_SUCCESS);
>               if (controlvm_channel)
>                       visorchannel_signalinsert(controlvm_channel,
>                                                 CONTROLVM_QUEUE_ACK,
>                                                 &ackmsg);
>       }
>       switch (inmsg.hdr.id) {
>       case CONTROLVM_CHIPSET_INIT:
>               chipset_init(&inmsg);
>               break;
>       case CONTROLVM_BUS_CREATE:
>               bus_create(&inmsg);
>               break;
>       case CONTROLVM_BUS_DESTROY:
>               bus_destroy(&inmsg);
>               break;
>       case CONTROLVM_BUS_CONFIGURE:
>               bus_configure(&inmsg, parser_ctx);
>               break;
>       case CONTROLVM_DEVICE_CREATE:
>               my_device_create(&inmsg);
>               break;
>       case CONTROLVM_DEVICE_CHANGESTATE:
>               if (cmd->device_change_state.flags.phys_device) {
>                       parahotplug_process_message(&inmsg);
>               } else {
>                       /*
>                        * save the hdr and cmd structures for later use
>                        * when sending back the response to Command
>                        */
>                       my_device_changestate(&inmsg);
>                       g_devicechangestate_packet = inmsg.cmd;
>                       break;
>               }
>               break;
>       case CONTROLVM_DEVICE_DESTROY:
>               my_device_destroy(&inmsg);
>               break;
>       case CONTROLVM_DEVICE_CONFIGURE:
>               /* no op for now, just send a respond that we passed */
>               if (inmsg.hdr.flags.response_expected)
>                       controlvm_respond(&inmsg.hdr, CONTROLVM_RESP_SUCCESS);
>               break;
>       case CONTROLVM_CHIPSET_READY:
>               chipset_ready(&inmsg.hdr);
>               break;
>       case CONTROLVM_CHIPSET_SELFTEST:
>               chipset_selftest(&inmsg.hdr);
>               break;
>       case CONTROLVM_CHIPSET_STOP:
>               chipset_notready(&inmsg.hdr);
>               break;
>       default:
>               if (inmsg.hdr.flags.response_expected)
>                       controlvm_respond
>                               (&inmsg.hdr,
>                                -CONTROLVM_RESP_ERROR_MESSAGE_ID_UNKNOWN);
>               break;
>       }
> 
>       if (parser_ctx) {
>               parser_done(parser_ctx);
>               parser_ctx = NULL;
>       }
>       return true;
> }
> 
> static inline unsigned int

Don't use inline in a .c file, gcc is smarter than we are, trust it
please.

> issue_vmcall_io_controlvm_addr(u64 *control_addr, u32 *control_bytes)
> {
>       struct vmcall_io_controlvm_addr_params params;
>       int result = VMCALL_SUCCESS;
>       u64 physaddr;
> 
>       physaddr = virt_to_phys(&params);
>       ISSUE_IO_VMCALL(VMCALL_IO_CONTROLVM_ADDR, physaddr, result);

WHY SHOUT AT THE COMPILER?

and no error handling?

>       if (VMCALL_SUCCESSFUL(result)) {

how can result change?  Is that not a function call above?  If not, ick,
that's bad...

>               *control_addr = params.address;
>               *control_bytes = params.channel_bytes;
>       }
>       return result;
> }
> 
> static u64 controlvm_get_channel_address(void)
> {
>       u64 addr = 0;
>       u32 size = 0;
> 
>       if (!VMCALL_SUCCESSFUL(issue_vmcall_io_controlvm_addr(&addr, &size)))
>               return 0;
> 
>       return addr;
> }
> 
> static void
> controlvm_periodic_work(struct work_struct *work)
> {
>       struct controlvm_message inmsg;
>       bool got_command = false;
>       bool handle_command_failed = false;
> 
>       while (visorchannel_signalremove(controlvm_channel,
>                                        CONTROLVM_QUEUE_RESPONSE,
>                                        &inmsg))
>               ;

I like to sit and spin, don't you?  CPU cycles are fun to just burn away
with no end in sight...

Always have a timeout, someday your hardware will break and your users
will thank you.


>       if (!got_command) {
>               if (controlvm_pending_msg_valid) {
>                       /*
>                        * we throttled processing of a prior
>                        * msg, so try to process it again
>                        * rather than reading a new one
>                        */
>                       inmsg = controlvm_pending_msg;
>                       controlvm_pending_msg_valid = false;
>                       got_command = true;
>               } else {
>                       got_command = read_controlvm_event(&inmsg);
>               }
>       }
> 
>       handle_command_failed = false;
>       while (got_command && (!handle_command_failed)) {
>               most_recent_message_jiffies = jiffies;
>               if (handle_command(inmsg,
>                                  visorchannel_get_physaddr
>                                  (controlvm_channel)))
>                       got_command = read_controlvm_event(&inmsg);
>               else {
>                       /*
>                        * this is a scenario where throttling
>                        * is required, but probably NOT an
>                        * error...; we stash the current
>                        * controlvm msg so we will attempt to
>                        * reprocess it on our next loop
>                        */
>                       handle_command_failed = true;
>                       controlvm_pending_msg = inmsg;
>                       controlvm_pending_msg_valid = true;
>               }
>       }
> 
>       /* parahotplug_worker */
>       parahotplug_process_list();
> 
>       if (time_after(jiffies,
>                      most_recent_message_jiffies + (HZ * MIN_IDLE_SECONDS))) {
>               /*
>                * it's been longer than MIN_IDLE_SECONDS since we
>                * processed our last controlvm message; slow down the
>                * polling
>                */
>               if (poll_jiffies != POLLJIFFIES_CONTROLVMCHANNEL_SLOW)
>                       poll_jiffies = POLLJIFFIES_CONTROLVMCHANNEL_SLOW;
>       } else {
>               if (poll_jiffies != POLLJIFFIES_CONTROLVMCHANNEL_FAST)
>                       poll_jiffies = POLLJIFFIES_CONTROLVMCHANNEL_FAST;
>       }

So newer hardware will run faster which will always cause the fast mode
to happen?  I don't understand the logic here, what are you trying to
throttle?

> 
>       schedule_delayed_work(&periodic_controlvm_work, poll_jiffies);
> }
> 
> static void
> setup_crash_devices_work_queue(struct work_struct *work)
> {
>       struct controlvm_message local_crash_bus_msg;
>       struct controlvm_message local_crash_dev_msg;
>       struct controlvm_message msg;
>       u32 local_crash_msg_offset;
>       u16 local_crash_msg_count;
> 
>       POSTCODE_LINUX_2(CRASH_DEV_ENTRY_PC, POSTCODE_SEVERITY_INFO);

{sigh}

> 
>       /* send init chipset msg */
>       msg.hdr.id = CONTROLVM_CHIPSET_INIT;
>       msg.cmd.init_chipset.bus_count = 23;
>       msg.cmd.init_chipset.switch_count = 0;
> 
>       chipset_init(&msg);
> 
>       /* get saved message count */
>       if (visorchannel_read(controlvm_channel,
>                             offsetof(struct spar_controlvm_channel_protocol,
>                                      saved_crash_message_count),
>                             &local_crash_msg_count, sizeof(u16)) < 0) {
>               POSTCODE_LINUX_2(CRASH_DEV_CTRL_RD_FAILURE_PC,
>                                POSTCODE_SEVERITY_ERR);
>               return;

No error handling.

>       }
> 
>       if (local_crash_msg_count != CONTROLVM_CRASHMSG_MAX) {
>               POSTCODE_LINUX_3(CRASH_DEV_COUNT_FAILURE_PC,
>                                local_crash_msg_count,
>                                POSTCODE_SEVERITY_ERR);
>               return;
>       }
> 
>       /* get saved crash message offset */
>       if (visorchannel_read(controlvm_channel,
>                             offsetof(struct spar_controlvm_channel_protocol,
>                                      saved_crash_message_offset),
>                             &local_crash_msg_offset, sizeof(u32)) < 0) {
>               POSTCODE_LINUX_2(CRASH_DEV_CTRL_RD_FAILURE_PC,
>                                POSTCODE_SEVERITY_ERR);
>               return;
>       }
> 
>       /* read create device message for storage bus offset */
>       if (visorchannel_read(controlvm_channel,
>                             local_crash_msg_offset,
>                             &local_crash_bus_msg,
>                             sizeof(struct controlvm_message)) < 0) {
>               POSTCODE_LINUX_2(CRASH_DEV_RD_BUS_FAIULRE_PC,
>                                POSTCODE_SEVERITY_ERR);
>               return;
>       }
> 
>       /* read create device message for storage device */
>       if (visorchannel_read(controlvm_channel,
>                             local_crash_msg_offset +
>                             sizeof(struct controlvm_message),
>                             &local_crash_dev_msg,
>                             sizeof(struct controlvm_message)) < 0) {
>               POSTCODE_LINUX_2(CRASH_DEV_RD_DEV_FAIULRE_PC,
>                                POSTCODE_SEVERITY_ERR);
>               return;
>       }
> 
>       /* reuse IOVM create bus message */
>       if (local_crash_bus_msg.cmd.create_bus.channel_addr) {
>               bus_create(&local_crash_bus_msg);
>       } else {
>               POSTCODE_LINUX_2(CRASH_DEV_BUS_NULL_FAILURE_PC,
>                                POSTCODE_SEVERITY_ERR);
>               return;
>       }
> 
>       /* reuse create device message for storage device */
>       if (local_crash_dev_msg.cmd.create_device.channel_addr) {
>               my_device_create(&local_crash_dev_msg);
>       } else {
>               POSTCODE_LINUX_2(CRASH_DEV_DEV_NULL_FAILURE_PC,
>                                POSTCODE_SEVERITY_ERR);
>               return;
>       }
>       POSTCODE_LINUX_2(CRASH_DEV_EXIT_PC, POSTCODE_SEVERITY_INFO);
> }

85%, almost there, I can make it...

> 
> void
> bus_create_response(struct visor_device *bus_info, int response)
> {
>       if (response >= 0)
>               bus_info->state.created = 1;

Why would response be 0?

And that's a horrid global function name, please use your own namespace.

> 
>       bus_responder(CONTROLVM_BUS_CREATE, bus_info->pending_msg_hdr,
>                     response);
> 
>       kfree(bus_info->pending_msg_hdr);
>       bus_info->pending_msg_hdr = NULL;

It always works?

> }
> 
> void
> bus_destroy_response(struct visor_device *bus_info, int response)

same here, use your own namespace.

> {
>       bus_responder(CONTROLVM_BUS_DESTROY, bus_info->pending_msg_hdr,
>                     response);
> 
>       kfree(bus_info->pending_msg_hdr);
>       bus_info->pending_msg_hdr = NULL;
> }
> 
> void
> device_create_response(struct visor_device *dev_info, int response)

same namespace issue.

> {
>       if (response >= 0)
>               dev_info->state.created = 1;
> 
>       device_responder(CONTROLVM_DEVICE_CREATE, dev_info->pending_msg_hdr,
>                        response);
> 
>       kfree(dev_info->pending_msg_hdr);
>       dev_info->pending_msg_hdr = NULL;

No errors?

> }
> 
> void
> device_destroy_response(struct visor_device *dev_info, int response)

namespace...

> {
>       device_responder(CONTROLVM_DEVICE_DESTROY, dev_info->pending_msg_hdr,
>                        response);
> 
>       kfree(dev_info->pending_msg_hdr);
>       dev_info->pending_msg_hdr = NULL;
> }
> 
> void
> device_pause_response(struct visor_device *dev_info,
>                     int response)

namespace...

> {
>       device_changestate_responder(CONTROLVM_DEVICE_CHANGESTATE,
>                                    dev_info, response,
>                                    segment_state_standby);
> 
>       kfree(dev_info->pending_msg_hdr);
>       dev_info->pending_msg_hdr = NULL;
> }
> 
> void
> device_resume_response(struct visor_device *dev_info, int response)

namespace, errors, etc...

> {
>       device_changestate_responder(CONTROLVM_DEVICE_CHANGESTATE,
>                                    dev_info, response,
>                                    segment_state_running);
> 
>       kfree(dev_info->pending_msg_hdr);
>       dev_info->pending_msg_hdr = NULL;
> }
> 
> /**
>  * devicedisabled_store() - disables the hotplug device
>  * @dev:   sysfs interface variable not utilized in this function
>  * @attr:  sysfs interface variable not utilized in this function
>  * @buf:   buffer containing the device id
>  * @count: the size of the buffer
>  *
>  * The parahotplug/devicedisabled interface gets called by our support script
>  * when an SR-IOV device has been shut down. The ID is passed to the script
>  * and then passed back when the device has been removed.
>  *
>  * Return: the size of the buffer for success or negative for error
>  */
> static ssize_t devicedisabled_store(struct device *dev,
>                                   struct device_attribute *attr,
>                                   const char *buf, size_t count)

device_disabled_store()?


> {
>       unsigned int id;
>       int err;
> 
>       if (kstrtouint(buf, 10, &id))
>               return -EINVAL;
> 
>       err = parahotplug_request_complete(id, 0);
>       if (err < 0)
>               return err;
>       return count;
> }
> 
> /**
>  * deviceenabled_store() - enables the hotplug device
>  * @dev:   sysfs interface variable not utilized in this function
>  * @attr:  sysfs interface variable not utilized in this function
>  * @buf:   buffer containing the device id
>  * @count: the size of the buffer
>  *
>  * The parahotplug/deviceenabled interface gets called by our support script
>  * when an SR-IOV device has been recovered. The ID is passed to the script
>  * and then passed back when the device has been brought back up.
>  *
>  * Return: the size of the buffer for success or negative for error
>  */
> static ssize_t deviceenabled_store(struct device *dev,
>                                  struct device_attribute *attr,
>                                  const char *buf, size_t count)

device_enabled_store()?

> {
>       unsigned int id;
> 
>       if (kstrtouint(buf, 10, &id))
>               return -EINVAL;
> 
>       parahotplug_request_complete(id, 1);
>       return count;
> }
> 
> static int
> visorchipset_mmap(struct file *file, struct vm_area_struct *vma)
> {
>       unsigned long physaddr = 0;
>       unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
>       u64 addr = 0;
> 
>       /* sv_enable_dfp(); */
>       if (offset & (PAGE_SIZE - 1))
>               return -ENXIO;  /* need aligned offsets */
> 
>       switch (offset) {
>       case VISORCHIPSET_MMAP_CONTROLCHANOFFSET:

offsets into mmap do different things?  Tricky...  Are you sure you want
to do this?

Oh wait, this is just 0x00.  So you don't want any offset at all.  Make
that frickin obvious, it sure isn't.

So, no offset, you just want to read/write starting at 0?  This feels
really odd...

>               vma->vm_flags |= VM_IO;
>               if (!*file_controlvm_channel)
>                       return -ENXIO;
> 
>               visorchannel_read
>                       (*file_controlvm_channel,
>                        offsetof(struct spar_controlvm_channel_protocol,
>                                 gp_control_channel),
>                        &addr, sizeof(addr));
>               if (!addr)
>                       return -ENXIO;
> 
>               physaddr = (unsigned long)addr;
>               if (remap_pfn_range(vma, vma->vm_start,
>                                   physaddr >> PAGE_SHIFT,
>                                   vma->vm_end - vma->vm_start,
>                                   /*pgprot_noncached */
>                                   (vma->vm_page_prot))) {
>                       return -EAGAIN;
>               }
>               break;
>       default:
>               return -ENXIO;
>       }
>       return 0;
> }
> 
> static inline s64 issue_vmcall_query_guest_virtual_time_offset(void)
> {
>       u64 result = VMCALL_SUCCESS;
>       u64 physaddr = 0;
> 
>       ISSUE_IO_VMCALL(VMCALL_QUERY_GUEST_VIRTUAL_TIME_OFFSET, physaddr,
>                       result);
>       return result;

Look, an error value!  There is hope!

Well, almost, should have been an 'int' not 's64', right?

> }
> 
> static inline int issue_vmcall_update_physical_time(u64 adjustment)
> {
>       int result = VMCALL_SUCCESS;
> 
>       ISSUE_IO_VMCALL(VMCALL_UPDATE_PHYSICAL_TIME, adjustment, result);
>       return result;

Almost!!!  (why preinit result?)

> }
> 
> static long visorchipset_ioctl(struct file *file, unsigned int cmd,
>                              unsigned long arg)
> {
>       u64 adjustment;

Shouldn't this be __u64?

>       s64 vrtc_offset;

shouldn't this be __s64?

And what about endian issues?

> 
>       switch (cmd) {
>       case VMCALL_QUERY_GUEST_VIRTUAL_TIME_OFFSET:
>               /* get the physical rtc offset */
>               vrtc_offset = issue_vmcall_query_guest_virtual_time_offset();
>               if (copy_to_user((void __user *)arg, &vrtc_offset,
>                                sizeof(vrtc_offset))) {
>                       return -EFAULT;
>               }
>               return 0;
>       case VMCALL_UPDATE_PHYSICAL_TIME:
>               if (copy_from_user(&adjustment, (void __user *)arg,
>                                  sizeof(adjustment))) {
>                       return -EFAULT;
>               }
>               return issue_vmcall_update_physical_time(adjustment);
>       default:
>               return -EFAULT;
>       }
> }
> 
> static const struct file_operations visorchipset_fops = {
>       .owner = THIS_MODULE,
>       .open = visorchipset_open,
>       .read = NULL,
>       .write = NULL,

No need to set something to NULL, it's implicit here, right?

>       .unlocked_ioctl = visorchipset_ioctl,
>       .release = visorchipset_release,
>       .mmap = visorchipset_mmap,
> };
> 
> static int
> visorchipset_file_init(dev_t major_dev, struct visorchannel 
> **controlvm_channel)
> {
>       int rc = 0;
> 
>       file_controlvm_channel = controlvm_channel;
>       cdev_init(&file_cdev, &visorchipset_fops);
>       file_cdev.owner = THIS_MODULE;
>       if (MAJOR(major_dev) == 0) {
>               rc = alloc_chrdev_region(&major_dev, 0, 1, "visorchipset");
>               /* dynamic major device number registration required */
>               if (rc < 0)
>                       return rc;
>       } else {
>               /* static major device number registration required */
>               rc = register_chrdev_region(major_dev, 1, "visorchipset");
>               if (rc < 0)
>                       return rc;
>       }
>       rc = cdev_add(&file_cdev, MKDEV(MAJOR(major_dev), 0), 1);
>       if (rc < 0) {
>               unregister_chrdev_region(major_dev, 1);
>               return rc;
>       }
>       return 0;

Just use the misc device interface please.  Much simpler, and you don't
burn a whole major number for no reason.

> }
> 
> static void
> visorchipset_file_cleanup(dev_t major_dev)
> {
>       if (file_cdev.ops)
>               cdev_del(&file_cdev);

really?  How could this not always happen?

>       file_cdev.ops = NULL;

why?

>       unregister_chrdev_region(major_dev, 1);

Again, misc device please.

> }
> 
> static int
> visorchipset_init(struct acpi_device *acpi_device)
> {
>       int err = -ENODEV;
>       u64 addr;
>       uuid_le uuid = SPAR_CONTROLVM_CHANNEL_PROTOCOL_UUID;
> 
>       addr = controlvm_get_channel_address();
>       if (!addr)
>               goto error;
> 
>       memset(&controlvm_payload_info, 0, sizeof(controlvm_payload_info));
> 
>       controlvm_channel = visorchannel_create_with_lock(addr, 0,
>                                                         GFP_KERNEL, uuid);
>       if (!controlvm_channel)
>               goto error;
> 
>       if (SPAR_CONTROLVM_CHANNEL_OK_CLIENT(
>                   visorchannel_get_header(controlvm_channel))) {

Fix the ALL_SHOUTING_FUNCTIONS_THAT_DO_UNKNOWN_THINGS_TO_PARAMETERS()

>               initialize_controlvm_payload();

No error checking?

>       } else {
>               goto error_destroy_channel;
>       }
> 
>       major_dev = MKDEV(visorchipset_major, 0);

Another device node?

>       err = visorchipset_file_init(major_dev, &controlvm_channel);
>       if (err < 0)
>               goto error_destroy_payload;
> 
>       /* if booting in a crash kernel */
>       if (is_kdump_kernel())
>               INIT_DELAYED_WORK(&periodic_controlvm_work,
>                                 setup_crash_devices_work_queue);
>       else
>               INIT_DELAYED_WORK(&periodic_controlvm_work,
>                                 controlvm_periodic_work);
> 
>       most_recent_message_jiffies = jiffies;
>       poll_jiffies = POLLJIFFIES_CONTROLVMCHANNEL_FAST;
>       schedule_delayed_work(&periodic_controlvm_work, poll_jiffies);
> 
>       visorchipset_platform_device.dev.devt = major_dev;

Hm, no. Don't abuse a platform device like this, you can't ever have a
static 'struct device' or bad things will happen (as you found out and
hacked around...)

>       if (platform_device_register(&visorchipset_platform_device) < 0) {
>               POSTCODE_LINUX_2(DEVICE_REGISTER_FAILURE_PC, DIAG_SEVERITY_ERR);
>               err = -ENODEV;

Why throw away the error that the platform code gave you?

Ah, nevermind, I don't care, don't use a platform device and I'll be
happier.

>               goto error_cancel_work;
>       }
>       POSTCODE_LINUX_2(CHIPSET_INIT_SUCCESS_PC, POSTCODE_SEVERITY_INFO);

No...

> 
>       err = visorbus_init();
>       if (err < 0)
>               goto error_unregister;
> 
>       return 0;
> 
> error_unregister:
>       platform_device_unregister(&visorchipset_platform_device);
> 
> error_cancel_work:
>       cancel_delayed_work_sync(&periodic_controlvm_work);
>       visorchipset_file_cleanup(major_dev);
> 
> error_destroy_payload:
>       destroy_controlvm_payload_info(&controlvm_payload_info);
> 
> error_destroy_channel:
>       visorchannel_destroy(controlvm_channel);
> 
> error:
>       POSTCODE_LINUX_3(CHIPSET_INIT_FAILURE_PC, err, POSTCODE_SEVERITY_ERR);
>       return err;
> }
> 
> static int
> visorchipset_exit(struct acpi_device *acpi_device)
> {
>       POSTCODE_LINUX_2(DRIVER_EXIT_PC, POSTCODE_SEVERITY_INFO);
> 
>       visorbus_exit();
> 
>       cancel_delayed_work_sync(&periodic_controlvm_work);
>       destroy_controlvm_payload_info(&controlvm_payload_info);
> 
>       visorchannel_destroy(controlvm_channel);
> 
>       visorchipset_file_cleanup(visorchipset_platform_device.dev.devt);
>       platform_device_unregister(&visorchipset_platform_device);
>       POSTCODE_LINUX_2(DRIVER_EXIT_PC, POSTCODE_SEVERITY_INFO);
> 
>       return 0;
> }
> 
> static const struct acpi_device_id unisys_device_ids[] = {
>       {"PNP0A07", 0},
>       {"", 0},
> };
> 
> static struct acpi_driver unisys_acpi_driver = {
>       .name = "unisys_acpi",
>       .class = "unisys_acpi_class",
>       .owner = THIS_MODULE,
>       .ids = unisys_device_ids,
>       .ops = {
>               .add = visorchipset_init,
>               .remove = visorchipset_exit,
>               },

Indent one tab left please.

> };
> 
> MODULE_DEVICE_TABLE(acpi, unisys_device_ids);

Look, you are an acpi device, use that, not a platform device.

> 
> static __init uint32_t visorutil_spar_detect(void)

unit32_t?  This isn't userspace.

"int" love it.  It loves you...  Ok, maybe that was too much coffee...

> {
>       unsigned int eax, ebx, ecx, edx;
> 
>       if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
>               /* check the ID */
>               cpuid(UNISYS_SPAR_LEAF_ID, &eax, &ebx, &ecx, &edx);
>               return  (ebx == UNISYS_SPAR_ID_EBX) &&
>                       (ecx == UNISYS_SPAR_ID_ECX) &&
>                       (edx == UNISYS_SPAR_ID_EDX);
>       } else {
>               return 0;
>       }
> }
> 
> static int init_unisys(void)
> {
>       int result;
> 
>       if (!visorutil_spar_detect())
>               return -ENODEV;

return the error given you.

> 
>       result = acpi_bus_register_driver(&unisys_acpi_driver);
>       if (result)
>               return -ENODEV;

return what was given you.

> 
>       pr_info("Unisys Visorchipset Driver Loaded.\n");

Don't be noisy.  Be quiet.  If all is good with your driver and
hardware, no one will know.  That's the goal.  Don't splat all over
syslogs just because you like to see your driver name in there, you are
better than that.

>       return 0;
> };
> 
> static void exit_unisys(void)
> {
>       acpi_bus_unregister_driver(&unisys_acpi_driver);
> }
> 
> module_param_named(major, visorchipset_major, int, S_IRUGO);
> MODULE_PARM_DESC(visorchipset_major,
>                "major device number to use for the device node");
> module_param_named(visorbusregwait, visorchipset_visorbusregwait, int, 
> S_IRUGO);
> MODULE_PARM_DESC(visorchipset_visorbusregwait,
>                "1 to have the module wait for the visor bus to register");

As said way above, 3000 lines ago, just delete these.

> 
> module_init(init_unisys);
> module_exit(exit_unisys);
> 
> MODULE_AUTHOR("Unisys");
> MODULE_LICENSE("GPL");
> MODULE_DESCRIPTION("Supervisor chipset driver for service partition: ver "
>                  VERSION);
> MODULE_VERSION(VERSION);

You don't care about the version the kernel has a version, embrace that.
Your driver version means nothing anymore, it has been swallowed up by
the overgrowth of the kernel tarball.  Welcome to the hive.



Ok, that was a lot, sorry about that.  But for a driver that supposedly
was ready to be moved to the main part of the kernel I had a lot of
questions.

Now either I got really "unlucky" and just picked the one random file
that deserved this much criticism, or the whole driver is this bad.
Honestly I don't want to find out, I'm stopping here, I think that's
enough.

Or I'm totally wrong about all of this review, if so, please point out
my mistakes, I'm low on sleep, and probably oxygen, and odds are, got
lots of things wrong.

If not, please work on fixing up these issues, propagate those fixes to
the whole driver subsystem, and then we can revisit the "it's good
enough to be out of staging" or not.

Damn, 2.5 hours left till I land, ugh, this means I need to go review
serial driver patches, that's even worse than staging patches...

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to