Re: [PATCH 13/15] hotplug: introduce qdev_unplug_complete() to remove device from views

2012-08-13 Thread Marcelo Tosatti
On Thu, Aug 09, 2012 at 10:00:16AM +0200, Paolo Bonzini wrote:
 Il 09/08/2012 09:28, liu ping fan ha scritto:
   VCPU threadI/O thread
   =
   get MMIO request
   rcu_read_lock()
   walk memory map
  qdev_unmap()
  lock_devtree()
  ...
  unlock_devtree
  unref dev - refcnt=0, free enqueued
   ref()
  No ref() for dev here, while we have ref to flatview+radix in my patches.
  I use rcu to protect radix+flatview+mr refered. As to dev, its ref has
  inc when it is added into mem view -- that is
  memory_region_add_subregion - memory_region_get() {
  if(atomic_add_and_return()) dev-ref++  }.
  So not until reclaimer of mem view, the dev's ref is hold by mem view.
  In a short word, rcu protect mem view, while device is protected by refcnt.

The idea, written on that plan, was:

- RCU protects memory maps.
- Object reference protects device in the window between 4. and 5.

The unplug/remove path should:

1) Lock memmap_lock for write (if not using RCU).
2) Remove any memmap entries (which is possible due to write lock on
memmap_lock. Alternatively wait for an RCU grace period). Device should
not be visible after that.
3) Lock dev-lock.
4) Wait until references are removed (no new references can be made
since device is not visible).
5) Remove device. 

So its a combination of both dev-lock and reference counter.

Note: a first step can be only parallel execution of MMIO lookups
(actually that is a very good first target). dev-lock above would be 
qemu_big_lock in that first stage, then _only devices which are 
performance sensitive need to be converted_.

 But the RCU critical section should not include the whole processing of
 MMIO, only the walk of the memory map.

Yes.

 And in general I think this is a bit too tricky... I understand not
 adding refcounting to all of bottom halves, timers, etc., but if you are
 using a device you should have explicit ref/unref pairs.
 
 Paolo
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/15] hotplug: introduce qdev_unplug_complete() to remove device from views

2012-08-13 Thread Marcelo Tosatti
On Fri, Aug 10, 2012 at 02:42:58PM +0800, liu ping fan wrote:
 On Thu, Aug 9, 2012 at 4:00 PM, Paolo Bonzini pbonz...@redhat.com wrote:
  Il 09/08/2012 09:28, liu ping fan ha scritto:
   VCPU threadI/O thread
   =
   get MMIO request
   rcu_read_lock()
   walk memory map
  qdev_unmap()
  lock_devtree()
  ...
  unlock_devtree
  unref dev - refcnt=0, free enqueued
   ref()
  No ref() for dev here, while we have ref to flatview+radix in my patches.
  I use rcu to protect radix+flatview+mr refered. As to dev, its ref has
  inc when it is added into mem view -- that is
  memory_region_add_subregion - memory_region_get() {
  if(atomic_add_and_return()) dev-ref++  }.
  So not until reclaimer of mem view, the dev's ref is hold by mem view.
  In a short word, rcu protect mem view, while device is protected by refcnt.
 
  But the RCU critical section should not include the whole processing of
  MMIO, only the walk of the memory map.
 
 Yes, you are right.  And I think cur_map_get() can be broken into the
 style lock,  ref++, phys_page_find(); unlock.  easily.
 
  And in general I think this is a bit too tricky... I understand not
  adding refcounting to all of bottom halves, timers, etc., but if you are
  using a device you should have explicit ref/unref pairs.
 
 Actually, there are pairs -- when dev enter mem view, inc ref; and
 when it leave, dec ref.
 But as Avi has pointed out, the mr-refcnt introduce complicate and no
 gain. So I will discard this design

The plan that you refer that has been relatively well thought out, IIRC. 
Instead of designing something, i would try to understand/improve on
that.

 Thanks and regards,
 pingfan
 
  Paolo
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/15] hotplug: introduce qdev_unplug_complete() to remove device from views

2012-08-10 Thread liu ping fan
On Thu, Aug 9, 2012 at 4:00 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 09/08/2012 09:28, liu ping fan ha scritto:
  VCPU threadI/O thread
  =
  get MMIO request
  rcu_read_lock()
  walk memory map
 qdev_unmap()
 lock_devtree()
 ...
 unlock_devtree
 unref dev - refcnt=0, free enqueued
  ref()
 No ref() for dev here, while we have ref to flatview+radix in my patches.
 I use rcu to protect radix+flatview+mr refered. As to dev, its ref has
 inc when it is added into mem view -- that is
 memory_region_add_subregion - memory_region_get() {
 if(atomic_add_and_return()) dev-ref++  }.
 So not until reclaimer of mem view, the dev's ref is hold by mem view.
 In a short word, rcu protect mem view, while device is protected by refcnt.

 But the RCU critical section should not include the whole processing of
 MMIO, only the walk of the memory map.

Yes, you are right.  And I think cur_map_get() can be broken into the
style lock,  ref++, phys_page_find(); unlock.  easily.

 And in general I think this is a bit too tricky... I understand not
 adding refcounting to all of bottom halves, timers, etc., but if you are
 using a device you should have explicit ref/unref pairs.

Actually, there are pairs -- when dev enter mem view, inc ref; and
when it leave, dec ref.
But as Avi has pointed out, the mr-refcnt introduce complicate and no
gain. So I will discard this design

Thanks and regards,
pingfan

 Paolo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/15] hotplug: introduce qdev_unplug_complete() to remove device from views

2012-08-09 Thread liu ping fan
On Wed, Aug 8, 2012 at 5:52 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 08/08/2012 08:25, Liu Ping Fan ha scritto:
 +void qdev_unplug_complete(DeviceState *dev, Error **errp)
 +{
 +/* isolate from mem view */
 +qdev_unmap(dev);
 +qemu_lock_devtree();
 +/* isolate from device tree */
 +qdev_unset_parent(dev);
 +qemu_unlock_devtree();
 +object_unref(OBJECT(dev));

 Rather than deferring the free, you should defer the unref.  Otherwise
 the following can happen when you have real RCU access to the memory
 map on the read-side:

 VCPU threadI/O thread
 =
 get MMIO request
 rcu_read_lock()
 walk memory map
qdev_unmap()
lock_devtree()
...
unlock_devtree
unref dev - refcnt=0, free enqueued
 ref()

No ref() for dev here, while we have ref to flatview+radix in my patches.
I use rcu to protect radix+flatview+mr refered. As to dev, its ref has
inc when it is added into mem view -- that is
memory_region_add_subregion - memory_region_get() {
if(atomic_add_and_return()) dev-ref++  }.
So not until reclaimer of mem view, the dev's ref is hold by mem view.
In a short word, rcu protect mem view, while device is protected by refcnt.

 rcu_read_unlock()
free()
 dangling pointer!

 If you defer the unref, you have instead

 VCPU threadI/O thread
 =
 get MMIO request
 rcu_read_lock()
 walk memory map
qdev_unmap()
lock_devtree()
...
unlock_devtree
unref is enqueued
 ref() - refcnt = 2
 rcu_read_unlock()
unref() - refcnt=1
 unref() - refcnt = 1

 So this also makes patch 14 unnecessary.

 Paolo

 +}


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/15] hotplug: introduce qdev_unplug_complete() to remove device from views

2012-08-09 Thread Paolo Bonzini
Il 09/08/2012 09:28, liu ping fan ha scritto:
  VCPU threadI/O thread
  =
  get MMIO request
  rcu_read_lock()
  walk memory map
 qdev_unmap()
 lock_devtree()
 ...
 unlock_devtree
 unref dev - refcnt=0, free enqueued
  ref()
 No ref() for dev here, while we have ref to flatview+radix in my patches.
 I use rcu to protect radix+flatview+mr refered. As to dev, its ref has
 inc when it is added into mem view -- that is
 memory_region_add_subregion - memory_region_get() {
 if(atomic_add_and_return()) dev-ref++  }.
 So not until reclaimer of mem view, the dev's ref is hold by mem view.
 In a short word, rcu protect mem view, while device is protected by refcnt.

But the RCU critical section should not include the whole processing of
MMIO, only the walk of the memory map.

And in general I think this is a bit too tricky... I understand not
adding refcounting to all of bottom halves, timers, etc., but if you are
using a device you should have explicit ref/unref pairs.

Paolo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 13/15] hotplug: introduce qdev_unplug_complete() to remove device from views

2012-08-08 Thread Liu Ping Fan
From: Liu Ping Fan pingf...@linux.vnet.ibm.com

When guest confirm the removal of device, we should
--unmap from MemoryRegion view
--isolated from device tree view

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 hw/acpi_piix4.c |4 ++--
 hw/pci.c|   13 -
 hw/pci.h|2 ++
 hw/qdev.c   |   28 
 hw/qdev.h   |3 ++-
 5 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 0aace60..c209ff7 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -305,8 +305,8 @@ static void acpi_piix_eject_slot(PIIX4PMState *s, unsigned 
slots)
 if (pc-no_hotplug) {
 slot_free = false;
 } else {
-object_unparent(OBJECT(dev));
-qdev_free(qdev);
+/* refcnt will be decreased */
+qdev_unplug_complete(qdev, NULL);
 }
 }
 }
diff --git a/hw/pci.c b/hw/pci.c
index 99a4304..2095abf 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -856,12 +856,22 @@ static int pci_unregister_device(DeviceState *dev)
 if (ret)
 return ret;
 
-pci_unregister_io_regions(pci_dev);
 pci_del_option_rom(pci_dev);
 do_pci_unregister_device(pci_dev);
 return 0;
 }
 
+static void pci_unmap_device(DeviceState *dev)
+{
+PCIDevice *pci_dev = PCI_DEVICE(dev);
+PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(pci_dev);
+
+pci_unregister_io_regions(pci_dev);
+if (pc-unmap) {
+pc-unmap(pci_dev);
+}
+}
+
 void pci_register_bar(PCIDevice *pci_dev, int region_num,
   uint8_t type, MemoryRegion *memory)
 {
@@ -2022,6 +2032,7 @@ static void pci_device_class_init(ObjectClass *klass, 
void *data)
 DeviceClass *k = DEVICE_CLASS(klass);
 k-init = pci_qdev_init;
 k-unplug = pci_unplug_device;
+k-unmap = pci_unmap_device;
 k-exit = pci_unregister_device;
 k-bus_type = TYPE_PCI_BUS;
 k-props = pci_props;
diff --git a/hw/pci.h b/hw/pci.h
index 79d38fd..1c5b909 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -145,6 +145,8 @@ typedef struct PCIDeviceClass {
 DeviceClass parent_class;
 
 int (*init)(PCIDevice *dev);
+void (*unmap)(PCIDevice *dev);
+
 PCIUnregisterFunc *exit;
 PCIConfigReadFunc *config_read;
 PCIConfigWriteFunc *config_write;
diff --git a/hw/qdev.c b/hw/qdev.c
index 17525fe..530eabe 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -104,6 +104,14 @@ void qdev_set_parent_bus(DeviceState *dev, BusState *bus)
 bus_add_child(bus, dev);
 }
 
+static void qdev_unset_parent(DeviceState *dev)
+{
+BusState *b = dev-parent_bus;
+
+object_unparent(OBJECT(dev));
+bus_remove_child(b, dev);
+}
+
 /* Create a new device.  This only initializes the device state structure
and allows properties to be set.  qdev_init should be called to
initialize the actual device emulation.  */
@@ -194,6 +202,26 @@ void qdev_set_legacy_instance_id(DeviceState *dev, int 
alias_id,
 dev-alias_required_for_version = required_for_version;
 }
 
+static int qdev_unmap(DeviceState *dev)
+{
+DeviceClass *dc =  DEVICE_GET_CLASS(dev);
+if (dc-unmap) {
+dc-unmap(dev);
+}
+return 0;
+}
+
+void qdev_unplug_complete(DeviceState *dev, Error **errp)
+{
+/* isolate from mem view */
+qdev_unmap(dev);
+qemu_lock_devtree();
+/* isolate from device tree */
+qdev_unset_parent(dev);
+qemu_unlock_devtree();
+object_unref(OBJECT(dev));
+}
+
 void qdev_unplug(DeviceState *dev, Error **errp)
 {
 DeviceClass *dc = DEVICE_GET_CLASS(dev);
diff --git a/hw/qdev.h b/hw/qdev.h
index f4683dc..705635a 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -47,7 +47,7 @@ typedef struct DeviceClass {
 
 /* callbacks */
 void (*reset)(DeviceState *dev);
-
+void (*unmap)(DeviceState *dev);
 /* device state */
 const VMStateDescription *vmsd;
 
@@ -162,6 +162,7 @@ void qdev_init_nofail(DeviceState *dev);
 void qdev_set_legacy_instance_id(DeviceState *dev, int alias_id,
  int required_for_version);
 void qdev_unplug(DeviceState *dev, Error **errp);
+void qdev_unplug_complete(DeviceState *dev, Error **errp);
 void qdev_free(DeviceState *dev);
 int qdev_simple_unplug_cb(DeviceState *dev);
 void qdev_machine_creation_done(void);
-- 
1.7.4.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/15] hotplug: introduce qdev_unplug_complete() to remove device from views

2012-08-08 Thread Paolo Bonzini
Il 08/08/2012 08:25, Liu Ping Fan ha scritto:
 +void qdev_unplug_complete(DeviceState *dev, Error **errp)
 +{
 +/* isolate from mem view */
 +qdev_unmap(dev);
 +qemu_lock_devtree();
 +/* isolate from device tree */
 +qdev_unset_parent(dev);
 +qemu_unlock_devtree();
 +object_unref(OBJECT(dev));

Rather than deferring the free, you should defer the unref.  Otherwise
the following can happen when you have real RCU access to the memory
map on the read-side:

VCPU threadI/O thread
=
get MMIO request
rcu_read_lock()
walk memory map
   qdev_unmap()
   lock_devtree()
   ...
   unlock_devtree
   unref dev - refcnt=0, free enqueued
ref()
rcu_read_unlock()
   free()
dangling pointer!

If you defer the unref, you have instead

VCPU threadI/O thread
=
get MMIO request
rcu_read_lock()
walk memory map
   qdev_unmap()
   lock_devtree()
   ...
   unlock_devtree
   unref is enqueued
ref() - refcnt = 2
rcu_read_unlock()
   unref() - refcnt=1
unref() - refcnt = 1

So this also makes patch 14 unnecessary.

Paolo

 +}


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/15] hotplug: introduce qdev_unplug_complete() to remove device from views

2012-08-08 Thread Avi Kivity
On 08/08/2012 12:52 PM, Paolo Bonzini wrote:
 Il 08/08/2012 08:25, Liu Ping Fan ha scritto:
 +void qdev_unplug_complete(DeviceState *dev, Error **errp)
 +{
 +/* isolate from mem view */
 +qdev_unmap(dev);
 +qemu_lock_devtree();
 +/* isolate from device tree */
 +qdev_unset_parent(dev);
 +qemu_unlock_devtree();
 +object_unref(OBJECT(dev));
 
 Rather than deferring the free, you should defer the unref.  Otherwise
 the following can happen when you have real RCU access to the memory
 map on the read-side:
 
 VCPU threadI/O thread
 =
 get MMIO request
 rcu_read_lock()
 walk memory map
qdev_unmap()
lock_devtree()
...
unlock_devtree
unref dev - refcnt=0, free enqueued
 ref()
 rcu_read_unlock()
free()
 dangling pointer!

unref should follow either synchronize_rcu(), or be in a call_rcu()
callback (deferring the unref).  IMO synchronize_rcu() is sufficient
here, unplug is a truly slow path, esp. on real hardware.


-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html