On Fri, 7 Jul 2017 14:12:19 +0100
Mark Cave-Ayland <mark.cave-ayl...@ilande.co.uk> wrote:

> On 07/07/17 12:33, Igor Mammedov wrote:
> 
> > On Tue, 4 Jul 2017 19:08:44 +0100
> > Mark Cave-Ayland <mark.cave-ayl...@ilande.co.uk> wrote:
> >   
> >> On 03/07/17 10:39, Igor Mammedov wrote:
> >>  
> >>> On Thu, 29 Jun 2017 15:07:19 +0100
> >>> Mark Cave-Ayland <mark.cave-ayl...@ilande.co.uk> wrote:
> >>>     
> >>>> When looking to instantiate a TYPE_FW_CFG_MEM or TYPE_FW_CFG_IO device 
> >>>> to be
> >>>> able to wire it up differently, it is much more convenient for the 
> >>>> caller to
> >>>> instantiate the device and have the fw_cfg default files already 
> >>>> preloaded
> >>>> during realize.
> >>>>
> >>>> Move fw_cfg_init1() to the end of both the fw_cfg_mem_realize() and
> >>>> fw_cfg_io_realize() functions so it no longer needs to be called manually
> >>>> when instantiating the device, and also rename it to 
> >>>> fw_cfg_common_realize()
> >>>> which better describes its new purpose.
> >>>>
> >>>> Since it is now the responsibility of the machine to wire up the fw_cfg 
> >>>> device
> >>>> it is necessary to introduce a object_property_add_child() call into
> >>>> fw_cfg_init_io() and fw_cfg_init_mem() to link the fw_cfg device to the 
> >>>> root
> >>>> machine object as before.
> >>>>
> >>>> Finally we can now convert the asserts() preventing multiple fw_cfg 
> >>>> devices
> >>>> and unparented fw_cfg devices being instantiated and replace them with 
> >>>> proper
> >>>> error reporting at realize time. This allows us to remove FW_CFG_NAME and
> >>>> FW_CFG_PATH since they are no longer required.
> >>>>
> >>>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayl...@ilande.co.uk>
> >>>> ---
> >>>>  hw/nvram/fw_cfg.c |   41 +++++++++++++++++++++++++++++------------
> >>>>  1 file changed, 29 insertions(+), 12 deletions(-)
> >>>>
> >>>> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> >>>> index 2291121..31029ac 100644
> >>>> --- a/hw/nvram/fw_cfg.c
> >>>> +++ b/hw/nvram/fw_cfg.c
> >>>> @@ -37,9 +37,6 @@
> >>>>  
> >>>>  #define FW_CFG_FILE_SLOTS_DFLT 0x20
> >>>>  
> >>>> -#define FW_CFG_NAME "fw_cfg"
> >>>> -#define FW_CFG_PATH "/machine/" FW_CFG_NAME
> >>>> -
> >>>>  #define TYPE_FW_CFG     "fw_cfg"
> >>>>  #define TYPE_FW_CFG_IO  "fw_cfg_io"
> >>>>  #define TYPE_FW_CFG_MEM "fw_cfg_mem"
> >>>> @@ -920,19 +917,22 @@ static int fw_cfg_unattached_at_realize(void)
> >>>>  }
> >>>>  
> >>>>  
> >>>> -static void fw_cfg_init1(DeviceState *dev)
> >>>> +static void fw_cfg_common_realize(DeviceState *dev, Error **errp)
> >>>>  {
> >>>>      FWCfgState *s = FW_CFG(dev);
> >>>>      MachineState *machine = MACHINE(qdev_get_machine());
> >>>>      uint32_t version = FW_CFG_VERSION;
> >>>>  
> >>>> -    assert(!object_resolve_path(FW_CFG_PATH, NULL));
> >>>> -
> >>>> -    object_property_add_child(OBJECT(machine), FW_CFG_NAME, OBJECT(s), 
> >>>> NULL);
> >>>> -
> >>>> -    qdev_init_nofail(dev);
> >>>> +    if (!fw_cfg_find()) {    
> >>> maybe add comment that here, that fw_cfg_find() will return NULL if more 
> >>> than
> >>> 1 device is found.    
> >>
> >> This should be the same behaviour as before, i.e. a NULL means that the
> >> fw_cfg device couldn't be found. It seems intuitive to me from the name
> >> of the function exactly what it does, so I don't find the lack of
> >> comment too confusing. Does anyone else have any thoughts here?  
> >
> > intuitive meaning from the function name would be return NULL if nothing 
> > were found,
> > however it's not the case here.  
> 
> The function with its current name has always existed, so all I'm doing
> here is reusing it as per your comment on an earlier patchset.
that's why I've asked for comment explaining what's going on here


> > taking in account that fwcfg in not user creatable device how about:
> > 
> > diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> > index 316fca9..8f6eef8 100644
> > --- a/hw/nvram/fw_cfg.c
> > +++ b/hw/nvram/fw_cfg.c
> > @@ -1014,7 +1014,10 @@ FWCfgState *fw_cfg_init_mem(hwaddr ctl_addr, hwaddr 
> > data_addr)
> >  
> >  FWCfgState *fw_cfg_find(void)
> >  {
> > -    return FW_CFG(object_resolve_path(FW_CFG_PATH, NULL));
> > +    bool ambig = false;
> > +    FWCfgState *f = FW_CFG(object_resolve_path_type("", TYPE_FW_CFG, 
> > &ambig));
> > +    assert(!ambig);
> > +    return f;
> >  }
> > 
> > or if we must to print user friendly error and fail realize gracefully
> > (not sure why) just add errp argument to function so it could report
> > error back to caller, then places that do not care much about graceful
> > exit would use error_abort as argument and realize would use
> > its local_error/errp argument.  
> 
> My understanding from the thread was that we should only use assert()s
> where there is no other choice so that any failures can be handled in a
> more friendly manner.
the rule I use to decide assert vs nice error handling:
1: try to avoid crash on hotplug path
2: if error could be induced by end user on startup, try to print nice error
   before dying
3: when error should not happen just assert or use error_abort
   which would print nice error message before dying.

> Now as Laszlo pointed out, fw_cfg_find() is used externally to locate
> the fw_cfg device in other parts of the QEMU codebase. Yes I agree that
> it is possible to change the way in which it returns, however I would
> argue that changing those semantics are outside of the scope of this patch.
I'd just kill qemu in fw_cfg case, which is small not intrusive change.

[...]

> >>>>  
> >>>> -    assert(!fw_cfg_unattached_at_realize());
> >>>> +    if (fw_cfg_unattached_at_realize()) {    
> >>> as I pointed out in v6, this condition will always be false,
> >>> I suggest to drop 4/6 patch and this hunk here so it won't to confuse
> >>> readers with assumption that condition might succeed.    
> >>
> >> Definitely look more closely at the fw_cfg_unattached_at_realize()
> >> implementation in patch 4. You'll see that the check to determine if the
> >> device is attached does not check obj->parent but checks to see if the
> >> device exists under /machine/unattached which is what the
> >> device_set_realised() does if the device doesn't have a parent.  
> >
> > looking more fw_cfg_unattached_at_realize(),
> >  returns true if fwcfg is direct child of/machine/unattached
> > meaning implicit parent assignment.
> > 
> > As result, above condition essentially forbids having fwcfg under
> > /machine/unattached and forces user to explicitly set parent
> > outside of /machine/unattached or be a child of some other device.
> > 
> > Is this your intent (why)?  
> 
> Yes that is entirely correct. All current fw_cfg users setup the device
> using fw_cfg_init_io() and fw_cfg_init_mem() which is fine for those
> cases because these functions attach the fw_cfg device directly to the
> machine at /machine/fw_cfg. This makes it trivial to determine whether
> or not an existing fw_cfg has been instantiated and prevent any more
> instances, which Laszlo has stated is an underlying assumption for
> fw_cfg_find().
> 
> In my particular use case for SPARC64, I need to move the fw_cfg device
> behind a PCI bridge. Therefore in order to allow the QOM tree to reflect
> the actual hardware DT then the fw_cfg device needs to be attached to
> the PCI bridge and not the machine. Hence the check for an existing
> device at /machine/fw_cfg is no longer good enough to determine if a
> fw_cfg device already exists since if they do, they can be in several
> different locations in the QOM tree.
> 
> This explains the change to fw_cfg_find() to make sure that we find any
> other fw_cfg instances, no matter where they are in the QOM tree.
without using ambiguous argument object_resolve_path_type() isn't
returning NULL in case of duplicates in different leafs of tree.

for reason, see 
https://www.mail-archive.com/qemu-devel@nongnu.org/msg460692.html
or look at object_resolve_partial_path() impl.

> However this causes us a problem: if you instantiate the fw_cfg yourself
> with qdev_create()...qdev_init_nofail() then you can potentially access
> the underlying MemoryRegions directly and wire up the device without
> attaching it to the QOM tree. This is an invalid configuration but
> wouldn't be detected with fw_cfg_find().
device_set_realized() makes sure that it's attached to QOM tree,

it's up to board to decide if default /machine/unattached parent
is sufficient or fwcfg should be placed explicitly under another
parent. it's not upto device to decide on it's location though.


> The correct way to handle this is to wire up the device yourself with
> object_property_add_child() but then you find the situation whereby the
> people who know that you have to call object_property_add_child() when
> adding the fw_cfg device are the ones who don't need the error message.
> 
> Therefore the solution was to enforce that the fw_cfg device has been
> added to the QOM tree before realize because it solves all the problems:
there is no need to enforce that as device_set_realized() guaranties
that device's a child of some QOM object (explicitly or implictly) by the time
leaf realize is called.


> 1) The fw_cfg device can now be placed anywhere in the QOM tree, meaning
> that the QOM tree can be a topologically correct representation of the
> machine
with this series it can't be placed under /machine/unattached
I don't see any reason to disable it.

> 2) By enforcing that the fw_cfg device has been parented, we guarantee
> that the fw_cfg_find() check is correct and it is impossible to access a
> fw_cfg device that hasn't been wired up to the machine
fw_cfg_find() check at realize or later time is always correct if you use
ambiguous argument so object_resolve_path_type could properly detect duplicates.


> 3) Since these checks are done at realize time, we can provide nice
> friendly messages back to the developer to tell them what needs to be fixed
error_set(error_abort, ...) should work nice for fwcfg usecase,
you get error message and a stack trace in core file pointing to a place
of the error.


> Finally for reference here is the current version of the code in my
> outstanding sun4u patchset which wires up the fw_cfg device behind a PCI
> bridge in hw/sparc64/sun4u:
> 
>   dev = qdev_create(NULL, TYPE_FW_CFG_IO);
>   qdev_prop_set_bit(dev, "dma_enabled", false);
>   object_property_add_child(OBJECT(ebus), TYPE_FW_CFG, OBJECT(dev),
>                             NULL);
>   qdev_init_nofail(dev);
>   memory_region_add_subregion(pci_address_space_io(ebus),
>                               BIOS_CFG_IOPORT,
>                               &FW_CFG_IO(dev)->comb_iomem);
looks fine,

so what I'd do is:
 * drop 4/6
 * make fw_cfg_find() use ambiguous argument and error_abort if ambiguous == 
true
 * from fw_cfg_common_realize() just call

     // if fw_cfg_find() returns NULL it means that device isn't in QOM tree
     // which shouldn't ever happen, fw_cfg_find() will abort itself if
     // another instance of device present in QOM tree.
     assert(fw_cfg_find());


> ATB,
> 
> Mark.
> 


Reply via email to