Quoting Bharata B Rao (2014-09-04 10:08:20) > On Thu, Sep 4, 2014 at 4:33 AM, Michael Roth <mdr...@linux.vnet.ibm.com> > wrote: > >> > +static int spapr_device_hotplug_add(DeviceState *qdev, PCIDevice *dev) > >> > +{ > >> > + sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(qdev); > >> > + sPAPRDrcEntry *drc_entry, *drc_entry_slot; > >> > + sPAPRConfigureConnectorState *ccs; > >> > + int slot = PCI_SLOT(dev->devfn); > >> > + int offset, ret; > >> > + void *fdt_orig, *fdt; > >> > + char nodename[512]; > >> > + uint32_t encoded = ENCODE_DRC_STATE(INDICATOR_ENTITY_SENSE_PRESENT, > >> > + INDICATOR_ENTITY_SENSE_MASK, > >> > + INDICATOR_ENTITY_SENSE_SHIFT); > >> > + > >> > >> I am building on this infrastructure of yours and adding CPU hotplug > >> support to sPAPR guests. > >> > >> So we start with dr state of UNUSABLE and change it to PRESENT like > >> above when performing hotplug operation. But after this, in case of > >> CPU hotplug, the CPU hotplug code in the kernel > >> (arch/powerpc/platforms/pseries/dlpar.c:dlpar_acquire_drc()) does > >> get-sensor-state rtas call and expects the dr state to be UNUSABLE. Is > >> the guest kernel right in expecting dr state to be in UNUSABLE state > >> like this ? I have in fact disabled this check in the guest kernel to > >> get a CPU hotplugged successfully, but wanted to understand the state > >> changes and the expectations from the guest kernel correctly. > > > > According to PAPR+ 2.7 13.5.3.3, > > > > PRESENT (1): > > > > Returned for logical and physical DR entities when the DR connector is > > allocated to the OS and the DR entity is present. For physical DR > > entities, > > this indicates that the DR connector actually has a DR entity plugged into > > it. For DR connectors of physical DR entities, the DR connector must be > > allocated to the OS to return this value, otherwise a status of -3, no > > such > > sensor implemented, will be returned from the get-sensor-state RTAS call. > > For > > DR connectors of logical DR entities, the DR connector must be allocated > > to > > the OS to return this value, otherwise a sensor value of 2 or 3 will be > > returned. > > > > UNUSABLE (2): > > > > Returned for logical DR entities when the DR entity is not currently > > available to the OS, but may possibly be made available to the OS by > > calling > > set-indicator with the allocation-state indicator, setting that indicator > > to > > usable. > > > > So it seems 'PRESENT' is in fact the right value immediately after PCI > > hotplug, but it doesn't seem clear from the documentation whether 'PRESENT' > > or 'UNUSABLE' is more correct immediately after CPU hotplug. What does > > seem clear as that UNUSABLE does not have any use in the case of PCI > > devices: just PRESENT/EMPTY(0). > > > > But we actually 0-initialize the sensor field for DRCEntrys corresponding > > to PCI devices, which corresponds to 'EMPTY' (0), so the handling seems > > correct for PCI devices... > > Thanks Michael for the above information on PRESENT and USABLE states. > > > > > And since we already initialize PHB sensors to UNUSABLE in the top-level > > DRC list, I'm not sure why adjacent CPU entries would be affected by what > > we do later for PCI devices? > > Sorry if I wasn't clear enough in my previous mail. CPU hotplug isn't > affected by what you do for PCI devices, but... > > > It seems like you'd just need to do the > > equivalent of what we do for PHBs during realize: > > when I try to do the same state changes for CPU hotplug, things don't > work as expected. > > > > > spapr_add_phb_to_drc_table(sphb->buid, 2 /* Unusable */); > > > > So I'm not sure where the need for guest kernel changes is coming from for > > CPU hotplug. > > When the resource is hotplugged, you change the state from UNUSABLE to > PRESENT in QEMU before signalling the guest (via check exception irq). > But the same state change in CPU hotplug case isn't as per guest > kernel's expectation. > > > Do you have a snippet of what the initialize/hot_add hooks > > like in your case? > > I am talking about this piece of code in the the kernel in > arch/powerpc/platforms/pseries/dlpar.c > > int dlpar_acquire_drc(u32 drc_index) > { > int dr_status, rc; > > rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status, > DR_ENTITY_SENSE, drc_index); > if (rc || dr_status != DR_ENTITY_UNUSABLE) > return -1; > ... > } > > I have circumvented this problem by not setting the state to PRESENT > in my current hotplug patch. You can refer to the same in > spapr_cpu_hotplug_add() routine that's part of my patch 14/15 > (https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00757.html)
Yah, that's what I was getting at: at least just to get things working for testing, just avoid the PRESENT bits in your hot_add_cpu hook rather than patching the guest. Unfortunately the documentation isn't particularly clear about which of these approaches is more correct as far as CPUs go. But looking at it again: UNUSABLE (2): Returned for logical DR entities when the DR entity is not currently available to the OS, but may possibly be made available to the OS by calling set-indicator with the allocation-state indicator, setting that indicator to usable. That 'usable' indicator setting is documented for set-indicator as (1), which happens to correspond to PRESENT (1). So my read would be that for 'physical' hotplug (like PCI), the firmware changes the indicator state to PRESENT/USABLE immediately after hotplug, whereas with 'logical' hotplug (like PHB/CPU), the guest OS signals this transition to USABLE through set-indicator calls for the 9003 sensor/allocation state after hotplug (which also seems to match up with the kernel code). This seems to correspond with the dlpar_acquire_drc() function, but I'm a little confused why that's not also called in the PHB path...I think maybe in that case it's handled by drmgr in userspace... will take another look to confirm. > > Regards, > Bharata.