Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-21 Thread Isaku Yamahata
On Thu, Dec 16, 2010 at 10:36:08AM +0200, Michael S. Tsirkin wrote:
 On Thu, Dec 16, 2010 at 04:08:16PM +0900, Isaku Yamahata wrote:
  On Wed, Dec 15, 2010 at 08:27:49AM -0700, Alex Williamson wrote:
   On Wed, 2010-12-15 at 11:56 +0200, Michael S. Tsirkin wrote:
On Tue, Dec 14, 2010 at 11:34:53AM -0700, Alex Williamson wrote:
 On Tue, 2010-12-14 at 14:26 +0200, Michael S. Tsirkin wrote:
  On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:
   
   I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
   assigned device, so I'm pretty sure we're not going to hurt 
   migration,
   but the code is clearly wrong and I'd like to make sure we don't 
   trip on
   a migration failure for a minor device config space change.
  
  Which reminds me: maybe just mark nested bridges as non-migratable
  for now?  Care writing such a patch?
 
 Hmm, this is trickier than it sounds.

Hmm, since 0 is put in the path instead of the bridge number,
will the correct bridge be restored?

  We're really only broken wrt
 migration if a device under a bridge calls qemu_ram_alloc.

I guess there's more broken-ness. What exactly breaks qemu_ram_alloc?
   
   You're right, it's more broken than that.  Anything that calls
   get_dev_path is broken for migration of bridges since the path is
   determined before the guest updates bus numbers.  That includes
   qemu_ram_alloc and vmstate.  I was only looking at the qemu_ram_alloc
   side.  So perhaps the right answer, for the moment, is to block
   migration if there's a p2p bridge.
  
  Eww. That's bad. Anyway, I agree to disable it for the moment.
 
 Patch?

Although I'm willing to create a patch, how should the migration be disabled?
As far as I know, Alex Williamson tried to push the patch series
"Save state error handling (kill off no_migrate)", but it hasn't been
merged.
Once it is merged, it seems quite easy to disable the migration.

Anyway, register_device_unmigratable() seems to need some cleanup
for DeviceInfo::vmsd. Any ideas?
-- 
yamahata



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-21 Thread Michael S. Tsirkin
On Tue, Dec 21, 2010 at 07:13:57PM +0900, Isaku Yamahata wrote:
 On Thu, Dec 16, 2010 at 10:36:08AM +0200, Michael S. Tsirkin wrote:
  On Thu, Dec 16, 2010 at 04:08:16PM +0900, Isaku Yamahata wrote:
   On Wed, Dec 15, 2010 at 08:27:49AM -0700, Alex Williamson wrote:
On Wed, 2010-12-15 at 11:56 +0200, Michael S. Tsirkin wrote:
 On Tue, Dec 14, 2010 at 11:34:53AM -0700, Alex Williamson wrote:
  On Tue, 2010-12-14 at 14:26 +0200, Michael S. Tsirkin wrote:
   On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:

I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
assigned device, so I'm pretty sure we're not going to hurt 
migration,
but the code is clearly wrong and I'd like to make sure we 
don't trip on
a migration failure for a minor device config space change.
   
   Which reminds me: maybe just mark nested bridges as 
   non-migratable
   for now?  Care writing such a patch?
  
  Hmm, this is trickier than it sounds.
 
 Hmm, since 0 is put in the path instead of the bridge number,
 will the correct bridge be restored?
 
   We're really only broken wrt
  migration if a device under a bridge calls qemu_ram_alloc.
 
 I guess there's more broken-ness. What exactly breaks qemu_ram_alloc?

You're right, it's more broken than that.  Anything that calls
get_dev_path is broken for migration of bridges since the path is
determined before the guest updates bus numbers.  That includes
qemu_ram_alloc and vmstate.  I was only looking at the qemu_ram_alloc
side.  So perhaps the right answer, for the moment, is to block
migration if there's a p2p bridge.
   
   Eww. That's bad. Anyway, I agree to disable it for the moment.
  
  Patch?
 
 Although I'm willing to create a patch, how should the migration be disabled?

Set the no_migrate flag in SaveStateEntry.

 As far as I know, Alex Williamson tried to push the patch series
 "Save state error handling (kill off no_migrate)", but it hasn't been
 merged.
 Once it is merged, it seems quite easy to disable the migration.

These look unlikely to be merged: they seem to go
contrary to the table-driven approach to migration that we are trying to
take.

 Anyway, register_device_unmigratable() seems to need some cleanup
 for DeviceInfo::vmsd. Any ideas?

So clean it up if you like.

 -- 
 yamahata



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-16 Thread Michael S. Tsirkin
On Thu, Dec 16, 2010 at 04:08:16PM +0900, Isaku Yamahata wrote:
 On Wed, Dec 15, 2010 at 08:27:49AM -0700, Alex Williamson wrote:
  On Wed, 2010-12-15 at 11:56 +0200, Michael S. Tsirkin wrote:
   On Tue, Dec 14, 2010 at 11:34:53AM -0700, Alex Williamson wrote:
On Tue, 2010-12-14 at 14:26 +0200, Michael S. Tsirkin wrote:
 On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:
  
  I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
  assigned device, so I'm pretty sure we're not going to hurt 
  migration,
  but the code is clearly wrong and I'd like to make sure we don't 
  trip on
  a migration failure for a minor device config space change.
 
 Which reminds me: maybe just mark nested bridges as non-migratable
 for now?  Care writing such a patch?

Hmm, this is trickier than it sounds.
   
   Hmm, since 0 is put in the path instead of the bridge number,
   will the correct bridge be restored?
   
 We're really only broken wrt
migration if a device under a bridge calls qemu_ram_alloc.
   
   I guess there's more broken-ness. What exactly breaks qemu_ram_alloc?
  
  You're right, it's more broken than that.  Anything that calls
  get_dev_path is broken for migration of bridges since the path is
  determined before the guest updates bus numbers.  That includes
  qemu_ram_alloc and vmstate.  I was only looking at the qemu_ram_alloc
  side.  So perhaps the right answer, for the moment, is to block
  migration if there's a p2p bridge.
 
 Eww. That's bad. Anyway, I agree to disable it for the moment.

Patch?

 Let's fix it after 0.14 release.
 -- 
 yamahata



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-15 Thread Michael S. Tsirkin
On Tue, Dec 14, 2010 at 11:34:53AM -0700, Alex Williamson wrote:
 On Tue, 2010-12-14 at 14:26 +0200, Michael S. Tsirkin wrote:
  On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:
   
   I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
   assigned device, so I'm pretty sure we're not going to hurt migration,
   but the code is clearly wrong and I'd like to make sure we don't trip on
   a migration failure for a minor device config space change.
  
  Which reminds me: maybe just mark nested bridges as non-migratable
  for now?  Care writing such a patch?
 
 Hmm, this is trickier than it sounds.

Hmm, since 0 is put in the path instead of the bridge number,
will the correct bridge be restored?

  We're really only broken wrt
 migration if a device under a bridge calls qemu_ram_alloc.

I guess there's more broken-ness. What exactly breaks qemu_ram_alloc?

  Any device
 is free to do this, but typically it only happens via
 pci_add_option_rom() (not counting vga as typical).  So maybe the better
 approach for now is to prevent the problem by disallowing option ROMs
 for devices below a bridge.  We obviously risk devices coming along that
 allocate RAM on their own, but we could still allow the most common
 issue with almost no lost functionality (assuming no one wants to boot
 off that nested device).  Thoughts?  Thanks,
 
 Alex



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-15 Thread Alex Williamson
On Wed, 2010-12-15 at 11:56 +0200, Michael S. Tsirkin wrote:
 On Tue, Dec 14, 2010 at 11:34:53AM -0700, Alex Williamson wrote:
  On Tue, 2010-12-14 at 14:26 +0200, Michael S. Tsirkin wrote:
   On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:

I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
assigned device, so I'm pretty sure we're not going to hurt migration,
but the code is clearly wrong and I'd like to make sure we don't trip on
a migration failure for a minor device config space change.
   
   Which reminds me: maybe just mark nested bridges as non-migratable
   for now?  Care writing such a patch?
  
  Hmm, this is trickier than it sounds.
 
 Hmm, since 0 is put in the path instead of the bridge number,
 will the correct bridge be restored?
 
   We're really only broken wrt
  migration if a device under a bridge calls qemu_ram_alloc.
 
 I guess there's more broken-ness. What exactly breaks qemu_ram_alloc?

You're right, it's more broken than that.  Anything that calls
get_dev_path is broken for migration of bridges since the path is
determined before the guest updates bus numbers.  That includes
qemu_ram_alloc and vmstate.  I was only looking at the qemu_ram_alloc
side.  So perhaps the right answer, for the moment, is to block
migration if there's a p2p bridge.

Alex

   Any device
  is free to do this, but typically it only happens via
  pci_add_option_rom() (not counting vga as typical).  So maybe the better
  approach for now is to prevent the problem by disallowing option ROMs
  for devices below a bridge.  We obviously risk devices coming along that
  allocate RAM on their own, but we could still allow the most common
  issue with almost no lost functionality (assuming no one wants to boot
  off that nested device).  Thoughts?  Thanks,
  
  Alex






Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-15 Thread Isaku Yamahata
On Wed, Dec 15, 2010 at 08:27:49AM -0700, Alex Williamson wrote:
 On Wed, 2010-12-15 at 11:56 +0200, Michael S. Tsirkin wrote:
  On Tue, Dec 14, 2010 at 11:34:53AM -0700, Alex Williamson wrote:
   On Tue, 2010-12-14 at 14:26 +0200, Michael S. Tsirkin wrote:
On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:
 
 I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
 assigned device, so I'm pretty sure we're not going to hurt migration,
 but the code is clearly wrong and I'd like to make sure we don't trip 
 on
 a migration failure for a minor device config space change.

Which reminds me: maybe just mark nested bridges as non-migratable
for now?  Care writing such a patch?
   
   Hmm, this is trickier than it sounds.
  
  Hmm, since 0 is put in the path instead of the bridge number,
  will the correct bridge be restored?
  
We're really only broken wrt
   migration if a device under a bridge calls qemu_ram_alloc.
  
  I guess there's more broken-ness. What exactly breaks qemu_ram_alloc?
 
 You're right, it's more broken than that.  Anything that calls
 get_dev_path is broken for migration of bridges since the path is
 determined before the guest updates bus numbers.  That includes
 qemu_ram_alloc and vmstate.  I was only looking at the qemu_ram_alloc
 side.  So perhaps the right answer, for the moment, is to block
 migration if there's a p2p bridge.

Eww. That's bad. Anyway, I agree to disable it for the moment.
Let's fix it after 0.14 release.
-- 
yamahata



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-14 Thread Michael S. Tsirkin
On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:
 On Tue, 2010-12-14 at 06:57 +0200, Michael S. Tsirkin wrote:
  On Mon, Dec 13, 2010 at 09:49:21PM -0700, Alex Williamson wrote:
   On Tue, 2010-12-14 at 06:46 +0200, Michael S. Tsirkin wrote:
On Mon, Dec 13, 2010 at 01:04:23PM -0700, Alex Williamson wrote:
 On Mon, 2010-11-08 at 13:22 +0200, Michael S. Tsirkin wrote:
  On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote:
   pcibus_dev_print() was erroneously retrieving the device bus
   number from the secondary bus number offset of the device
    instead of the bridge above the device.  This ends up landing
   in the 2nd byte of the 3rd BAR for devices, which thankfully
   is usually zero.  pcibus_get_dev_path() copied this code,
   inheriting the same bug.  pcibus_get_dev_path() is used for
    ramblock naming, so changing it can affect migration.  However,
   I've only seen this byte be non-zero for an assigned device,
   which can't migrate anyway, so hopefully we won't run into
   any issues.
   
   Signed-off-by: Alex Williamson alex.william...@redhat.com
  
  Good catch. Applied.
 
 Um... submitted vs applied:
 
   PCI: Bus number from the bridge, not the device
   
  @@ -6,20 +8,28 @@
   number from the secondary bus number offset of the device
   instead of the bridge above the device.  This ends up landing
   in the 2nd byte of the 3rd BAR for devices, which thankfully
  -is usually zero.  pcibus_get_dev_path() copied this code,
  +is usually zero.
  +
  +Note: pcibus_get_dev_path() copied this code,
   inheriting the same bug.  pcibus_get_dev_path() is used for
   ramblock naming, so changing it can affect migration.  However,
   I've only seen this byte be non-zero for an assigned device,
   which can't migrate anyway, so hopefully we won't run into
   any issues.
   
  +This patch does not touch pcibus_get_dev_path, as
  +bus number is guest assigned for nested buses,
  +so using it for migration is broken anyway.
  +Fix it properly later.
  +
   Signed-off-by: Alex Williamson alex.william...@redhat.com
  +Signed-off-by: Michael S. Tsirkin m...@redhat.com
   
   diff --git a/hw/pci.c b/hw/pci.c
  -index 6d0934d..15416dd 100644
  +index 962886e..8f6fcf8 100644
   --- a/hw/pci.c
   +++ b/hw/pci.c
  -@@ -1940,8 +1940,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
  +@@ -1806,8 +1806,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
   
       monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
                      "pci id %04x:%04x (sub %04x:%04x)\n",
  @@ -29,14 +39,3 @@
                      PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
                      pci_get_word(d->config + PCI_VENDOR_ID),
                      pci_get_word(d->config + PCI_DEVICE_ID),
  -@@ -1965,7 +1964,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
  -     char path[16];
  - 
  -     snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
  --             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
  -+             pci_find_domain(d->bus), pci_bus_num(d->bus),
  -              PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
  - 
  -     return strdup(path);
  -
  -
 
 So the chunk that fixed the part that I was actually interested in got
 dropped even though the existing code is clearly wrong.  Yes, we still
 have issues with nested bridges (not that we have many of those), but
 until the "Fix it properly later" part comes along, can we please
 include the obvious bug fix?  Thanks,
 
 Alex

We can stick 0 in there - would that help?  I would much rather not
create a version where we put the bus number there.
   
   Yep, 0 is good enough until we solve the nested bridge problem.  Thanks,
   
   Alex
  
  I'm surprised you see that it matters in practice, but ok.
  Like this?
 
 I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
 assigned device, so I'm pretty sure we're not going to hurt migration,
 but the code is clearly wrong and I'd like to make sure we don't trip on
 a migration failure for a minor device config space change.

Which reminds me: maybe just mark nested bridges as non-migratable
for now?  Care writing such a patch?

  diff --git a/hw/pci.c b/hw/pci.c
  index 254647b..81231c5 100644
  --- a/hw/pci.c
  +++ b/hw/pci.c
  @@ -1952,7 +1952,10 @@ static char *pcibus_get_dev_path(DeviceState *dev)
   char path[16];
   
   snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
  - pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
  + pci_find_domain(d->bus),
  + 0 /* TODO: need a persistent path for nested buses.
  + 

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-14 Thread Alex Williamson
On Tue, 2010-12-14 at 14:26 +0200, Michael S. Tsirkin wrote:
 On Mon, Dec 13, 2010 at 10:04:24PM -0700, Alex Williamson wrote:
  
  I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
  assigned device, so I'm pretty sure we're not going to hurt migration,
  but the code is clearly wrong and I'd like to make sure we don't trip on
  a migration failure for a minor device config space change.
 
 Which reminds me: maybe just mark nested bridges as non-migratable
 for now?  Care writing such a patch?

Hmm, this is trickier than it sounds.  We're really only broken wrt
migration if a device under a bridge calls qemu_ram_alloc.  Any device
is free to do this, but typically it only happens via
pci_add_option_rom() (not counting vga as typical).  So maybe the better
approach for now is to prevent the problem by disallowing option ROMs
for devices below a bridge.  We obviously risk devices coming along that
allocate RAM on their own, but we could still allow the most common
issue with almost no lost functionality (assuming no one wants to boot
off that nested device).  Thoughts?  Thanks,

Alex




Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-13 Thread Alex Williamson
On Mon, 2010-11-08 at 13:22 +0200, Michael S. Tsirkin wrote:
 On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote:
  pcibus_dev_print() was erroneously retrieving the device bus
  number from the secondary bus number offset of the device
   instead of the bridge above the device.  This ends up landing
  in the 2nd byte of the 3rd BAR for devices, which thankfully
  is usually zero.  pcibus_get_dev_path() copied this code,
  inheriting the same bug.  pcibus_get_dev_path() is used for
   ramblock naming, so changing it can affect migration.  However,
  I've only seen this byte be non-zero for an assigned device,
  which can't migrate anyway, so hopefully we won't run into
  any issues.
  
  Signed-off-by: Alex Williamson alex.william...@redhat.com
 
 Good catch. Applied.

Um... submitted vs applied:

 PCI: Bus number from the bridge, not the device
 
@@ -6,20 +8,28 @@
 number from the secondary bus number offset of the device
 instead of the bridge above the device.  This ends up landing
 in the 2nd byte of the 3rd BAR for devices, which thankfully
-is usually zero.  pcibus_get_dev_path() copied this code,
+is usually zero.
+
+Note: pcibus_get_dev_path() copied this code,
 inheriting the same bug.  pcibus_get_dev_path() is used for
 ramblock naming, so changing it can affect migration.  However,
 I've only seen this byte be non-zero for an assigned device,
 which can't migrate anyway, so hopefully we won't run into
 any issues.
 
+This patch does not touch pcibus_get_dev_path, as
+bus number is guest assigned for nested buses,
+so using it for migration is broken anyway.
+Fix it properly later.
+
 Signed-off-by: Alex Williamson alex.william...@redhat.com
+Signed-off-by: Michael S. Tsirkin m...@redhat.com
 
 diff --git a/hw/pci.c b/hw/pci.c
-index 6d0934d..15416dd 100644
+index 962886e..8f6fcf8 100644
 --- a/hw/pci.c
 +++ b/hw/pci.c
-@@ -1940,8 +1940,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
+@@ -1806,8 +1806,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
 
     monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
                    "pci id %04x:%04x (sub %04x:%04x)\n",
@@ -29,14 +39,3 @@
                    PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
                    pci_get_word(d->config + PCI_VENDOR_ID),
                    pci_get_word(d->config + PCI_DEVICE_ID),
-@@ -1965,7 +1964,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
-     char path[16];
- 
-     snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
--             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
-+             pci_find_domain(d->bus), pci_bus_num(d->bus),
-              PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
- 
-     return strdup(path);
-
-

So the chunk that fixed the part that I was actually interested in got
dropped even though the existing code is clearly wrong.  Yes, we still
have issues with nested bridges (not that we have many of those), but
until the "Fix it properly later" part comes along, can we please
include the obvious bug fix?  Thanks,

Alex




Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-13 Thread Michael S. Tsirkin
On Mon, Dec 13, 2010 at 01:04:23PM -0700, Alex Williamson wrote:
 On Mon, 2010-11-08 at 13:22 +0200, Michael S. Tsirkin wrote:
  On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote:
   pcibus_dev_print() was erroneously retrieving the device bus
   number from the secondary bus number offset of the device
    instead of the bridge above the device.  This ends up landing
   in the 2nd byte of the 3rd BAR for devices, which thankfully
   is usually zero.  pcibus_get_dev_path() copied this code,
   inheriting the same bug.  pcibus_get_dev_path() is used for
    ramblock naming, so changing it can affect migration.  However,
   I've only seen this byte be non-zero for an assigned device,
   which can't migrate anyway, so hopefully we won't run into
   any issues.
   
   Signed-off-by: Alex Williamson alex.william...@redhat.com
  
  Good catch. Applied.
 
 Um... submitted vs applied:
 
  PCI: Bus number from the bridge, not the device
  
 @@ -6,20 +8,28 @@
  number from the secondary bus number offset of the device
  instead of the bridge above the device.  This ends up landing
  in the 2nd byte of the 3rd BAR for devices, which thankfully
 -is usually zero.  pcibus_get_dev_path() copied this code,
 +is usually zero.
 +
 +Note: pcibus_get_dev_path() copied this code,
  inheriting the same bug.  pcibus_get_dev_path() is used for
  ramblock naming, so changing it can affect migration.  However,
  I've only seen this byte be non-zero for an assigned device,
  which can't migrate anyway, so hopefully we won't run into
  any issues.
  
 +This patch does not touch pcibus_get_dev_path, as
 +bus number is guest assigned for nested buses,
 +so using it for migration is broken anyway.
 +Fix it properly later.
 +
  Signed-off-by: Alex Williamson alex.william...@redhat.com
 +Signed-off-by: Michael S. Tsirkin m...@redhat.com
  
  diff --git a/hw/pci.c b/hw/pci.c
 -index 6d0934d..15416dd 100644
 +index 962886e..8f6fcf8 100644
  --- a/hw/pci.c
  +++ b/hw/pci.c
 -@@ -1940,8 +1940,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
 +@@ -1806,8 +1806,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
  
      monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
                     "pci id %04x:%04x (sub %04x:%04x)\n",
 @@ -29,14 +39,3 @@
                     PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
                     pci_get_word(d->config + PCI_VENDOR_ID),
                     pci_get_word(d->config + PCI_DEVICE_ID),
 -@@ -1965,7 +1964,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
 -     char path[16];
 - 
 -     snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
 --             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
 -+             pci_find_domain(d->bus), pci_bus_num(d->bus),
 -              PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
 - 
 -     return strdup(path);
 -
 -
 
 So the chunk that fixed the part that I was actually interested in got
 dropped even though the existing code is clearly wrong.  Yes, we still
 have issues with nested bridges (not that we have many of those), but
 until the "Fix it properly later" part comes along, can we please
 include the obvious bug fix?  Thanks,
 
 Alex

We can stick 0 in there - would that help?  I would much rather not
create a version where we put the bus number there.

-- 
MST



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-13 Thread Alex Williamson
On Tue, 2010-12-14 at 06:46 +0200, Michael S. Tsirkin wrote:
 On Mon, Dec 13, 2010 at 01:04:23PM -0700, Alex Williamson wrote:
  On Mon, 2010-11-08 at 13:22 +0200, Michael S. Tsirkin wrote:
   On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote:
pcibus_dev_print() was erroneously retrieving the device bus
number from the secondary bus number offset of the device
 instead of the bridge above the device.  This ends up landing
in the 2nd byte of the 3rd BAR for devices, which thankfully
is usually zero.  pcibus_get_dev_path() copied this code,
inheriting the same bug.  pcibus_get_dev_path() is used for
 ramblock naming, so changing it can affect migration.  However,
I've only seen this byte be non-zero for an assigned device,
which can't migrate anyway, so hopefully we won't run into
any issues.

Signed-off-by: Alex Williamson alex.william...@redhat.com
   
   Good catch. Applied.
  
  Um... submitted vs applied:
  
    PCI: Bus number from the bridge, not the device
    
   @@ -6,20 +8,28 @@
    number from the secondary bus number offset of the device
    instead of the bridge above the device.  This ends up landing
    in the 2nd byte of the 3rd BAR for devices, which thankfully
   -is usually zero.  pcibus_get_dev_path() copied this code,
   +is usually zero.
   +
   +Note: pcibus_get_dev_path() copied this code,
    inheriting the same bug.  pcibus_get_dev_path() is used for
    ramblock naming, so changing it can affect migration.  However,
    I've only seen this byte be non-zero for an assigned device,
    which can't migrate anyway, so hopefully we won't run into
    any issues.
    
   +This patch does not touch pcibus_get_dev_path, as
   +bus number is guest assigned for nested buses,
   +so using it for migration is broken anyway.
   +Fix it properly later.
   +
    Signed-off-by: Alex Williamson alex.william...@redhat.com
   +Signed-off-by: Michael S. Tsirkin m...@redhat.com
    
    diff --git a/hw/pci.c b/hw/pci.c
   -index 6d0934d..15416dd 100644
   +index 962886e..8f6fcf8 100644
    --- a/hw/pci.c
    +++ b/hw/pci.c
   -@@ -1940,8 +1940,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
   +@@ -1806,8 +1806,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
    
        monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
                       "pci id %04x:%04x (sub %04x:%04x)\n",
   @@ -29,14 +39,3 @@
                       PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
                       pci_get_word(d->config + PCI_VENDOR_ID),
                       pci_get_word(d->config + PCI_DEVICE_ID),
   -@@ -1965,7 +1964,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
   -     char path[16];
   - 
   -     snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
   --             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
   -+             pci_find_domain(d->bus), pci_bus_num(d->bus),
   -              PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
   - 
   -     return strdup(path);
   -
   -
  
  So the chunk that fixed the part that I was actually interested in got
  dropped even though the existing code is clearly wrong.  Yes, we still
  have issues with nested bridges (not that we have many of those), but
   until the "Fix it properly later" part comes along, can we please
  include the obvious bug fix?  Thanks,
  
  Alex
 
 We can stick 0 in there - would that help?  I would much rather not
 create a version where we put the bus number there.

Yep, 0 is good enough until we solve the nested bridge problem.  Thanks,

Alex




Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-13 Thread Michael S. Tsirkin
On Mon, Dec 13, 2010 at 09:49:21PM -0700, Alex Williamson wrote:
 On Tue, 2010-12-14 at 06:46 +0200, Michael S. Tsirkin wrote:
  On Mon, Dec 13, 2010 at 01:04:23PM -0700, Alex Williamson wrote:
   On Mon, 2010-11-08 at 13:22 +0200, Michael S. Tsirkin wrote:
On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote:
 pcibus_dev_print() was erroneously retrieving the device bus
 number from the secondary bus number offset of the device
 instead of the bridge above the device.  This ends up landing
 in the 2nd byte of the 3rd BAR for devices, which thankfully
 is usually zero.  pcibus_get_dev_path() copied this code,
 inheriting the same bug.  pcibus_get_dev_path() is used for
 ramblock naming, so changing it can affect migration.  However,
 I've only seen this byte be non-zero for an assigned device,
 which can't migrate anyway, so hopefully we won't run into
 any issues.
 
 Signed-off-by: Alex Williamson alex.william...@redhat.com

Good catch. Applied.
   
   Um... submitted vs applied:
   
 PCI: Bus number from the bridge, not the device
 
@@ -6,20 +8,28 @@
 number from the secondary bus number offset of the device
 instead of the bridge above the device.  This ends up landing
 in the 2nd byte of the 3rd BAR for devices, which thankfully
-is usually zero.  pcibus_get_dev_path() copied this code,
+is usually zero.
+
+Note: pcibus_get_dev_path() copied this code,
 inheriting the same bug.  pcibus_get_dev_path() is used for
 ramblock naming, so changing it can affect migration.  However,
 I've only seen this byte be non-zero for an assigned device,
 which can't migrate anyway, so hopefully we won't run into
 any issues.
 
+This patch does not touch pcibus_get_dev_path, as
+bus number is guest assigned for nested buses,
+so using it for migration is broken anyway.
+Fix it properly later.
+
 Signed-off-by: Alex Williamson alex.william...@redhat.com
+Signed-off-by: Michael S. Tsirkin m...@redhat.com
 
 diff --git a/hw/pci.c b/hw/pci.c
-index 6d0934d..15416dd 100644
+index 962886e..8f6fcf8 100644
 --- a/hw/pci.c
 +++ b/hw/pci.c
-@@ -1940,8 +1940,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
+@@ -1806,8 +1806,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
 
     monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
                    "pci id %04x:%04x (sub %04x:%04x)\n",
@@ -29,14 +39,3 @@
                    PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
                    pci_get_word(d->config + PCI_VENDOR_ID),
                    pci_get_word(d->config + PCI_DEVICE_ID),
-@@ -1965,7 +1964,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
-     char path[16];
- 
-     snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
--             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
-+             pci_find_domain(d->bus), pci_bus_num(d->bus),
-              PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
- 
-     return strdup(path);
-
-
   
   So the chunk that fixed the part that I was actually interested in got
   dropped even though the existing code is clearly wrong.  Yes, we still
   have issues with nested bridges (not that we have many of those), but
    until the "Fix it properly later" part comes along, can we please
   include the obvious bug fix?  Thanks,
   
   Alex
  
  We can stick 0 in there - would that help?  I would much rather not
  create a version where we put the bus number there.
 
 Yep, 0 is good enough until we solve the nested bridge problem.  Thanks,
 
 Alex

I'm surprised you see that it matters in practice, but ok.
Like this?

diff --git a/hw/pci.c b/hw/pci.c
index 254647b..81231c5 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1952,7 +1952,10 @@ static char *pcibus_get_dev_path(DeviceState *dev)
     char path[16];
 
     snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
-             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
+             pci_find_domain(d->bus),
+             0 /* TODO: need a persistent path for nested buses.
+                * Note: pci_bus_num(d->bus) is not right as it's guest
+                * assigned. */,
              PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
 
     return strdup(path);



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-12-13 Thread Alex Williamson
On Tue, 2010-12-14 at 06:57 +0200, Michael S. Tsirkin wrote:
 On Mon, Dec 13, 2010 at 09:49:21PM -0700, Alex Williamson wrote:
  On Tue, 2010-12-14 at 06:46 +0200, Michael S. Tsirkin wrote:
   On Mon, Dec 13, 2010 at 01:04:23PM -0700, Alex Williamson wrote:
On Mon, 2010-11-08 at 13:22 +0200, Michael S. Tsirkin wrote:
 On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote:
  pcibus_dev_print() was erroneously retrieving the device bus
  number from the secondary bus number offset of the device
   instead of the bridge above the device.  This ends up landing
  in the 2nd byte of the 3rd BAR for devices, which thankfully
  is usually zero.  pcibus_get_dev_path() copied this code,
  inheriting the same bug.  pcibus_get_dev_path() is used for
   ramblock naming, so changing it can affect migration.  However,
  I've only seen this byte be non-zero for an assigned device,
  which can't migrate anyway, so hopefully we won't run into
  any issues.
  
  Signed-off-by: Alex Williamson alex.william...@redhat.com
 
 Good catch. Applied.

Um... submitted vs applied:

 PCI: Bus number from the bridge, not the device
 
@@ -6,20 +8,28 @@
 number from the secondary bus number offset of the device
 instead of the bridge above the device.  This ends up landing
 in the 2nd byte of the 3rd BAR for devices, which thankfully
-is usually zero.  pcibus_get_dev_path() copied this code,
+is usually zero.
+
+Note: pcibus_get_dev_path() copied this code,
 inheriting the same bug.  pcibus_get_dev_path() is used for
 ramblock naming, so changing it can affect migration.  However,
 I've only seen this byte be non-zero for an assigned device,
 which can't migrate anyway, so hopefully we won't run into
 any issues.
 
+This patch does not touch pcibus_get_dev_path, as
+bus number is guest assigned for nested buses,
+so using it for migration is broken anyway.
+Fix it properly later.
+
 Signed-off-by: Alex Williamson alex.william...@redhat.com
+Signed-off-by: Michael S. Tsirkin m...@redhat.com
 
 diff --git a/hw/pci.c b/hw/pci.c
-index 6d0934d..15416dd 100644
+index 962886e..8f6fcf8 100644
 --- a/hw/pci.c
 +++ b/hw/pci.c
-@@ -1940,8 +1940,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
+@@ -1806,8 +1806,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
 
 monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
"pci id %04x:%04x (sub %04x:%04x)\n",
@@ -29,14 +39,3 @@
 PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
 pci_get_word(d->config + PCI_VENDOR_ID),
 pci_get_word(d->config + PCI_DEVICE_ID),
-@@ -1965,7 +1964,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
- char path[16];
- 
- snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
-- pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
-+ pci_find_domain(d->bus), pci_bus_num(d->bus),
-  PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
- 
- return strdup(path);
-
-

So the chunk that fixed the part that I was actually interested in got
dropped even though the existing code is clearly wrong.  Yes, we still
have issues with nested bridges (not that we have many of those), but
until the "Fix it properly later" part comes along, can we please
include the obvious bug fix?  Thanks,

Alex
   
   We can stick 0 in there - would that help?  I would much rather not
   create a version where we put the bus number there.
  
  Yep, 0 is good enough until we solve the nested bridge problem.  Thanks,
  
  Alex
 
 I'm surprised you see that it matters in practice, but ok.
 Like this?

I've only ever seen config[PCI_SECONDARY_BUS] be non-zero for an
assigned device, so I'm pretty sure we're not going to hurt migration,
but the code is clearly wrong and I'd like to make sure we don't trip on
a migration failure for a minor device config space change.

 diff --git a/hw/pci.c b/hw/pci.c
 index 254647b..81231c5 100644
 --- a/hw/pci.c
 +++ b/hw/pci.c
 @@ -1952,7 +1952,10 @@ static char *pcibus_get_dev_path(DeviceState *dev)
  char path[16];
  
  snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
 - pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
 + pci_find_domain(d->bus),
 + 0 /* TODO: need a persistent path for nested buses.
 +    * Note: pci_bus_num(d->bus) is not right as it's guest
 +    * assigned. */,
    PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
  
  return strdup(path);

Sure, that's fine.

Acked-by: Alex Williamson alex.william...@redhat.com

Thanks,

Alex




Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-22 Thread Michael S. Tsirkin
On Mon, Nov 22, 2010 at 09:37:07AM +0200, Gleb Natapov wrote:
It's probably required to make them stable anyway.

   Why?
  
  To avoid bus renumbering on reboot after you add a pci-to-pci bridge.
  
 Why should qemu care?

Stable bus numbering is a feature *users* care about, because
some Guest OSes get confused when a card gets moved to another
bus.

 --
   Gleb.



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-22 Thread Gleb Natapov
On Mon, Nov 22, 2010 at 10:16:57AM +0200, Michael S. Tsirkin wrote:
 On Mon, Nov 22, 2010 at 09:37:07AM +0200, Gleb Natapov wrote:
 It's probably required to make them stable anyway.
 
Why?
   
   To avoid bus renumbering on reboot after you add a pci-to-pci bridge.
   
  Why should qemu care?
 
 Stable bus numbering is a feature *users* care about, because
 some Guest OSes get confused when a card gets moved to another
 bus.
 
So if the user cares about it, he should not change the HW configuration of QEMU.
I guess those OSes know how to handle hot-pluggable equipment; otherwise
they would get confused on real HW too. Why should QEMU care to preserve
something in the face of a configuration change?

--
Gleb.



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-22 Thread Michael S. Tsirkin
On Mon, Nov 22, 2010 at 04:52:32PM +0200, Gleb Natapov wrote:
 On Mon, Nov 22, 2010 at 04:50:29PM +0200, Michael S. Tsirkin wrote:
  On Mon, Nov 22, 2010 at 03:04:51PM +0200, Gleb Natapov wrote:
   On Mon, Nov 22, 2010 at 10:16:57AM +0200, Michael S. Tsirkin wrote:
On Mon, Nov 22, 2010 at 09:37:07AM +0200, Gleb Natapov wrote:
It's probably required to make them stable anyway.

   Why?
  
  To avoid bus renumbering on reboot after you add a pci-to-pci 
  bridge.
  
 Why should qemu care?

Stable bus numbering is a feature *users* care about, because
some Guest OSes get confused when a card gets moved to another
bus.

   So if user cares about it it should not change HW configuration of QEMU.
   I guess those OSes knows how to handle hot-pluggable equipment otherwise
   they will get confused on real HW too. Why QEMU should care to preserve
   something in a face of configuration change?
   
   --
 Gleb.
  
  We've been there, weren't we? See
  http://fedoraproject.org/wiki/Features/KVM_Stable_PCI_Addresses#KVM_Stable_PCI_Addresses
  
 This is about stable HW configuration.

Exactly. We have the same need for nested bridges and devices behind
them.

 --
   Gleb.



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-22 Thread Gleb Natapov
On Mon, Nov 22, 2010 at 04:56:16PM +0200, Michael S. Tsirkin wrote:
 On Mon, Nov 22, 2010 at 04:52:32PM +0200, Gleb Natapov wrote:
  On Mon, Nov 22, 2010 at 04:50:29PM +0200, Michael S. Tsirkin wrote:
   On Mon, Nov 22, 2010 at 03:04:51PM +0200, Gleb Natapov wrote:
On Mon, Nov 22, 2010 at 10:16:57AM +0200, Michael S. Tsirkin wrote:
 On Mon, Nov 22, 2010 at 09:37:07AM +0200, Gleb Natapov wrote:
 It's probably required to make them stable anyway.
 
Why?
   
   To avoid bus renumbering on reboot after you add a pci-to-pci 
   bridge.
   
  Why should qemu care?
 
 Stable bus numbering is a feature *users* care about, because
 some Guest OSes get confused when a card gets moved to another
 bus.
 
So if user cares about it it should not change HW configuration of QEMU.
I guess those OSes knows how to handle hot-pluggable equipment otherwise
they will get confused on real HW too. Why QEMU should care to preserve
something in a face of configuration change?

--
Gleb.
   
   We've been there, weren't we? See
   http://fedoraproject.org/wiki/Features/KVM_Stable_PCI_Addresses#KVM_Stable_PCI_Addresses
   
  This is about stable HW configuration.
 
 Exactly. We have the same need for nested bridges and devices behind
 them.
 
And now you are talking about topology, not guest assigned bus numbers.
So suddenly you are on my side? I don't even get what are you arguing
about at this point.

--
Gleb.



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-22 Thread Michael S. Tsirkin
On Mon, Nov 22, 2010 at 04:58:11PM +0200, Gleb Natapov wrote:
 On Mon, Nov 22, 2010 at 04:56:16PM +0200, Michael S. Tsirkin wrote:
  On Mon, Nov 22, 2010 at 04:52:32PM +0200, Gleb Natapov wrote:
   On Mon, Nov 22, 2010 at 04:50:29PM +0200, Michael S. Tsirkin wrote:
On Mon, Nov 22, 2010 at 03:04:51PM +0200, Gleb Natapov wrote:
 On Mon, Nov 22, 2010 at 10:16:57AM +0200, Michael S. Tsirkin wrote:
  On Mon, Nov 22, 2010 at 09:37:07AM +0200, Gleb Natapov wrote:
  It's probably required to make them stable anyway.
  
 Why?

To avoid bus renumbering on reboot after you add a pci-to-pci 
bridge.

   Why should qemu care?
  
  Stable bus numbering is a feature *users* care about, because
  some Guest OSes get confused when a card gets moved to another
  bus.
  
 So if user cares about it it should not change HW configuration of 
 QEMU.
 I guess those OSes knows how to handle hot-pluggable equipment 
 otherwise
 they will get confused on real HW too. Why QEMU should care to 
 preserve
 something in a face of configuration change?
 
 --
   Gleb.

We've been there, weren't we? See
http://fedoraproject.org/wiki/Features/KVM_Stable_PCI_Addresses#KVM_Stable_PCI_Addresses

   This is about stable HW configuration.
  
  Exactly. We have the same need for nested bridges and devices behind
  them.
  
 And now you are talking about topology, not guest assigned bus numbers.

I suspect that if bus numbers change, it has the same effect as topology
change on the guest. Need to test though, I'm not sure.

 So suddenly you are on my side? I don't even get what are you arguing
 about at this point.

By this time, I forgot, too :).

 --
   Gleb.



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-22 Thread Gleb Natapov
On Mon, Nov 22, 2010 at 06:41:28PM +0200, Michael S. Tsirkin wrote:
 On Mon, Nov 22, 2010 at 04:58:11PM +0200, Gleb Natapov wrote:
  On Mon, Nov 22, 2010 at 04:56:16PM +0200, Michael S. Tsirkin wrote:
   On Mon, Nov 22, 2010 at 04:52:32PM +0200, Gleb Natapov wrote:
On Mon, Nov 22, 2010 at 04:50:29PM +0200, Michael S. Tsirkin wrote:
 On Mon, Nov 22, 2010 at 03:04:51PM +0200, Gleb Natapov wrote:
  On Mon, Nov 22, 2010 at 10:16:57AM +0200, Michael S. Tsirkin wrote:
   On Mon, Nov 22, 2010 at 09:37:07AM +0200, Gleb Natapov wrote:
   It's probably required to make them stable anyway.
   
  Why?
 
 To avoid bus renumbering on reboot after you add a pci-to-pci 
 bridge.
 
Why should qemu care?
   
   Stable bus numbering is a feature *users* care about, because
   some Guest OSes get confused when a card gets moved to another
   bus.
   
  So if user cares about it it should not change HW configuration of 
  QEMU.
  I guess those OSes knows how to handle hot-pluggable equipment 
  otherwise
  they will get confused on real HW too. Why QEMU should care to 
  preserve
  something in a face of configuration change?
  
  --
  Gleb.
 
 We've been there, weren't we? See
 http://fedoraproject.org/wiki/Features/KVM_Stable_PCI_Addresses#KVM_Stable_PCI_Addresses
 
This is about stable HW configuration.
   
   Exactly. We have the same need for nested bridges and devices behind
   them.
   
  And now you are talking about topology, not guest assigned bus numbers.
 
 I suspect that if bus numbers change, it has the same effect as topology
 change on the guest. Need to test though, I'm not sure.
 
Hard to believe. Unplugging a card with an internal pci-pci bridge may
change bus numbering.

  So suddenly you are on my side? I don't even get what are you arguing
  about at this point.
 
 By this time, I forgot, too :).
 
:)

--
Gleb.



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Gleb Natapov
On Sat, Nov 20, 2010 at 10:17:09PM +0200, Michael S. Tsirkin wrote:
 On Fri, Nov 19, 2010 at 10:38:42PM +0200, Gleb Natapov wrote:
  On Fri, Nov 19, 2010 at 06:02:58PM +0100, Markus Armbruster wrote:
   Michael S. Tsirkin m...@redhat.com writes:
   
On Tue, Nov 09, 2010 at 11:41:43AM +0900, Isaku Yamahata wrote:
On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin wrote:
 Replace bus number with slot numbers of parent bridges up to the 
 root.
 This works for root bridge in a compatible way because bus number 
 there
 is hard-coded to 0.
 IMO nested bridges are broken anyway, no way to be compatible there.
 
 
 Gleb, Markus, I think the following should be sufficient for PCI.  
 What
 do you think?  Also - do we need to update QMP/monitor to teach them 
 to
 work with these paths?
 
 This is on top of Alex's patch, completely untested.
 
 
 pci: fix device path for devices behind nested bridges
 
 We were using bus number in the device path, which is clearly
 broken as this number is guest-assigned for all devices
 except the root.
 
 Fix by using hierarchical list of slots, walking the path
 from root down to device, instead. Add :00 as bus number
 so that if there are no nested bridges, this is compatible
 with what we have now.

This format, Domain:00:Slot:Slot:Slot.Function, doesn't work
because pci-to-pci bridge is pci function.
So the format should be
Domain:00:Slot.Function:Slot.Function:Slot.Function

thanks,
   
Hmm, interesting. If we do this we aren't backwards compatible
though, so maybe we could try using openfirmware paths, just as well.
   
   Whatever we do, we need to make it work for all (qdevified) devices and
   buses.
   
   It should also be possible to use canonical addressing with device_add 
   friends.  I.e. permit naming a device by (a unique abbreviation of) its
   canonical address in addition to naming it by its user-defined ID.  For
   instance, something like
   
  device_del /pci/@1,1
   
  FWIW openbios allows this kind of abbreviation.
  
   in addition to
   
  device_del ID
   
   Open Firmware is a useful source of inspiration there, but should it
   come into conflict with usability, we should let usability win.
  
  --
  Gleb.
 
 
 I think that the domain (PCI segment group), bus, slot, function way to
 address pci devices is still the most familiar and the easiest to map to
Most familiar to whom? It looks like you identify yourself with most of
qemu users, but if most qemu users are like you then qemu has not enough
users :) Most users that consider themselves to be advanced may know
what eth1 or /dev/sdb means. This doesn't mean we should provide
device_del eth1 or device_add /dev/sdb command though. 

More important is that domain (encoded as a number, the way you used it)
and bus number have no meaning from inside qemu. So while I've said many
times I don't care too much about the exact CLI syntax, it should at least
make sense. It can use an id to specify a PCI bus in the CLI like this:
device_del pci.0:1.1. Or it can even use a device id too, like this:
device_del pci.0:ide.0. Or it can use HW topology like in an OF device
path. But doing ad-hoc device enumeration inside qemu and then using it
for the CLI is not it.

 functionality in the guests.  Qemu is buggy at the moment in that it
 uses the bus addresses assigned by the guest and not the ones in ACPI,
 but that can be fixed.
It looks like you've confused ACPI _SEG for something it isn't. The ACPI spec
says that a PCI segment group is purely a software concept managed by system
firmware. In fact one segment may include multiple PCI host bridges. _SEG
is not what OSPM uses to tie a HW resource to an ACPI resource. It uses _CRS
(Current Resource Settings) for that, just like OF. No surprise there.

 
 That should be enough for e.g. device_del. We do have the need to
 describe the topology when we interface with firmware, e.g. to describe
 the ACPI tables themselves to qemu (this is what Gleb's patches deal
 with), but that's probably the only case.
 
Describing HW topology is the only way to unambiguously describe a device to
something or someone outside qemu and have persistent device naming
between different HW configurations.

--
Gleb.



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Michael S. Tsirkin
On Sun, Nov 21, 2010 at 10:32:11AM +0200, Gleb Natapov wrote:
 On Sat, Nov 20, 2010 at 10:17:09PM +0200, Michael S. Tsirkin wrote:
  On Fri, Nov 19, 2010 at 10:38:42PM +0200, Gleb Natapov wrote:
   On Fri, Nov 19, 2010 at 06:02:58PM +0100, Markus Armbruster wrote:
Michael S. Tsirkin m...@redhat.com writes:

 On Tue, Nov 09, 2010 at 11:41:43AM +0900, Isaku Yamahata wrote:
 On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin wrote:
  Replace bus number with slot numbers of parent bridges up to the 
  root.
  This works for root bridge in a compatible way because bus number 
  there
  is hard-coded to 0.
  IMO nested bridges are broken anyway, no way to be compatible 
  there.
  
  
  Gleb, Markus, I think the following should be sufficient for PCI.  
  What
  do you think?  Also - do we need to update QMP/monitor to teach 
  them to
  work with these paths?
  
  This is on top of Alex's patch, completely untested.
  
  
  pci: fix device path for devices behind nested bridges
  
  We were using bus number in the device path, which is clearly
  broken as this number is guest-assigned for all devices
  except the root.
  
  Fix by using hierarchical list of slots, walking the path
  from root down to device, instead. Add :00 as bus number
  so that if there are no nested bridges, this is compatible
  with what we have now.
 
 This format, Domain:00:Slot:Slot:Slot.Function, doesn't work
 because pci-to-pci bridge is pci function.
 So the format should be
 Domain:00:Slot.Function:Slot.Function:Slot.Function
 
 thanks,

 Hmm, interesting. If we do this we aren't backwards compatible
 though, so maybe we could try using openfirmware paths, just as well.

Whatever we do, we need to make it work for all (qdevified) devices and
buses.

It should also be possible to use canonical addressing with device_add 
friends.  I.e. permit naming a device by (a unique abbreviation of) its
canonical address in addition to naming it by its user-defined ID.  For
instance, something like

   device_del /pci/@1,1

   FWIW openbios allows this kind of abbreviation.
   
in addition to

   device_del ID

Open Firmware is a useful source of inspiration there, but should it
come into conflict with usability, we should let usability win.
   
   --
 Gleb.
  
  
  I think that the domain (PCI segment group), bus, slot, function way to
  address pci devices is still the most familiar and the easiest to map to
 Most familiar to whom?

The guests.
For CLI, we need an easy way to map a device in guest to the
device in qemu and back.

 It looks like you identify yourself with most of
 qemu users, but if most qemu users are like you then qemu has not enough
 users :) Most users that consider themselves to be advanced may know
 what eth1 or /dev/sdb means. This doesn't mean we should provide
 device_del eth1 or device_add /dev/sdb command though. 
 
 More important is that domain (encoded as number like you used to)
 and bus number has no meaning from inside qemu.
 So while I said many
 times I don't care about exact CLI syntax to much it should make sense
 at least. It can use id to specify PCI bus in CLI like this:
 device_del pci.0:1.1. Or it can even use device id too like this:
 device_del pci.0:ide.0. Or it can use HW topology like in FO device
 path. But doing ah-hoc device enumeration inside qemu and then using it
 for CLI is not it.
 
  functionality in the guests.  Qemu is buggy in the moment in that is
  uses the bus addresses assigned by guest and not the ones in ACPI,
  but that can be fixed.
 It looks like you confused ACPI _SEG for something it isn't.

Maybe I did. This is what linux does:

struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root
*root)
{
struct acpi_device *device = root->device;
int domain = root->segment;
int busnum = root->secondary.start;

And I think this is consistent with the spec.

 ACPI spec
 says that PCI segment group is purely software concept managed by system
 firmware. In fact one segment may include multiple PCI host bridges.

It can't I think:
Multiple Host Bridges

A platform may have multiple PCI Express or PCI-X host bridges. The base
address for the MMCONFIG space for these host bridges may need to be
allocated at different locations. In such cases, using MCFG table and
_CBA method as defined in this section means that each of these host
bridges must be in its own PCI Segment Group.


 _SEG
 is not what OSPM uses to tie HW resource to ACPI resource. It used _CRS
 (Current Resource Settings) for that just like OF. No surprise there.

OSPM uses both I think.

All I see linux do with CRS is get the bus number range.
And the spec says, 

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Gleb Natapov
On Sun, Nov 21, 2010 at 11:50:18AM +0200, Michael S. Tsirkin wrote:
 On Sun, Nov 21, 2010 at 10:32:11AM +0200, Gleb Natapov wrote:
  On Sat, Nov 20, 2010 at 10:17:09PM +0200, Michael S. Tsirkin wrote:
   On Fri, Nov 19, 2010 at 10:38:42PM +0200, Gleb Natapov wrote:
On Fri, Nov 19, 2010 at 06:02:58PM +0100, Markus Armbruster wrote:
 Michael S. Tsirkin m...@redhat.com writes:
 
  On Tue, Nov 09, 2010 at 11:41:43AM +0900, Isaku Yamahata wrote:
  On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin wrote:
   Replace bus number with slot numbers of parent bridges up to the 
   root.
   This works for root bridge in a compatible way because bus 
   number there
   is hard-coded to 0.
   IMO nested bridges are broken anyway, no way to be compatible 
   there.
   
   
   Gleb, Markus, I think the following should be sufficient for 
   PCI.  What
   do you think?  Also - do we need to update QMP/monitor to teach 
   them to
   work with these paths?
   
   This is on top of Alex's patch, completely untested.
   
   
   pci: fix device path for devices behind nested bridges
   
   We were using bus number in the device path, which is clearly
   broken as this number is guest-assigned for all devices
   except the root.
   
   Fix by using hierarchical list of slots, walking the path
   from root down to device, instead. Add :00 as bus number
   so that if there are no nested bridges, this is compatible
   with what we have now.
  
  This format, Domain:00:Slot:Slot:Slot.Function, doesn't work
  because pci-to-pci bridge is pci function.
  So the format should be
  Domain:00:Slot.Function:Slot.Function:Slot.Function
  
  thanks,
 
  Hmm, interesting. If we do this we aren't backwards compatible
  though, so maybe we could try using openfirmware paths, just as 
  well.
 
 Whatever we do, we need to make it work for all (qdevified) devices 
 and
 buses.
 
 It should also be possible to use canonical addressing with 
 device_add 
 friends.  I.e. permit naming a device by (a unique abbreviation of) 
 its
 canonical address in addition to naming it by its user-defined ID.  
 For
 instance, something like
 
device_del /pci/@1,1
 
FWIW openbios allows this kind of abbreviation.

 in addition to
 
device_del ID
 
 Open Firmware is a useful source of inspiration there, but should it
 come into conflict with usability, we should let usability win.

--
Gleb.
   
   
   I think that the domain (PCI segment group), bus, slot, function way to
   address pci devices is still the most familiar and the easiest to map to
  Most familiar to whom?
 
 The guests.
Which one? There are many guests. Your favorite?

 For CLI, we need an easy way to map a device in guest to the
 device in qemu and back.
Then use eth0, /dev/sdb, or even C:. Your way is no less broken, since what
you are saying is "let's use the name that the guest assigned to a device."

 
  It looks like you identify yourself with most of
  qemu users, but if most qemu users are like you then qemu has not enough
  users :) Most users that consider themselves to be advanced may know
  what eth1 or /dev/sdb means. This doesn't mean we should provide
  device_del eth1 or device_add /dev/sdb command though. 
  
  More important is that domain (encoded as number like you used to)
  and bus number has no meaning from inside qemu.
  So while I said many
  times I don't care about exact CLI syntax to much it should make sense
  at least. It can use id to specify PCI bus in CLI like this:
  device_del pci.0:1.1. Or it can even use device id too like this:
  device_del pci.0:ide.0. Or it can use HW topology like in FO device
  path. But doing ah-hoc device enumeration inside qemu and then using it
  for CLI is not it.
  
   functionality in the guests.  Qemu is buggy in the moment in that is
   uses the bus addresses assigned by guest and not the ones in ACPI,
   but that can be fixed.
  It looks like you confused ACPI _SEG for something it isn't.
 
 Maybe I did. This is what linux does:
 
 struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root
 *root)
 {
  struct acpi_device *device = root->device;
  int domain = root->segment;
  int busnum = root->secondary.start;
 
 And I think this is consistent with the spec.
 
It means that one domain may include several host bridges. At that level
a domain is defined as something that has a unique name for each device
inside it; thus no two buses in one segment/domain can have the same bus
number. This is what the PCI spec tells you.

And this further shows that using the domain as defined by the guest is a
very bad idea.

  ACPI spec
  says that PCI segment group is purely software concept managed by system
  firmware. In 

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Michael S. Tsirkin
On Sun, Nov 21, 2010 at 12:19:03PM +0200, Gleb Natapov wrote:
 On Sun, Nov 21, 2010 at 11:50:18AM +0200, Michael S. Tsirkin wrote:
  On Sun, Nov 21, 2010 at 10:32:11AM +0200, Gleb Natapov wrote:
   On Sat, Nov 20, 2010 at 10:17:09PM +0200, Michael S. Tsirkin wrote:
On Fri, Nov 19, 2010 at 10:38:42PM +0200, Gleb Natapov wrote:
 On Fri, Nov 19, 2010 at 06:02:58PM +0100, Markus Armbruster wrote:
  Michael S. Tsirkin m...@redhat.com writes:
  
   On Tue, Nov 09, 2010 at 11:41:43AM +0900, Isaku Yamahata wrote:
   On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin 
   wrote:
Replace bus number with slot numbers of parent bridges up to 
the root.
This works for root bridge in a compatible way because bus 
number there
is hard-coded to 0.
IMO nested bridges are broken anyway, no way to be compatible 
there.


Gleb, Markus, I think the following should be sufficient for 
PCI.  What
do you think?  Also - do we need to update QMP/monitor to 
teach them to
work with these paths?

This is on top of Alex's patch, completely untested.


pci: fix device path for devices behind nested bridges

We were using bus number in the device path, which is clearly
broken as this number is guest-assigned for all devices
except the root.

Fix by using hierarchical list of slots, walking the path
from root down to device, instead. Add :00 as bus number
so that if there are no nested bridges, this is compatible
with what we have now.
   
   This format, Domain:00:Slot:Slot:Slot.Function, doesn't work
   because pci-to-pci bridge is pci function.
   So the format should be
   Domain:00:Slot.Function:Slot.Function:Slot.Function
   
   thanks,
  
   Hmm, interesting. If we do this we aren't backwards compatible
   though, so maybe we could try using openfirmware paths, just as 
   well.
  
  Whatever we do, we need to make it work for all (qdevified) devices 
  and
  buses.
  
  It should also be possible to use canonical addressing with 
  device_add 
  friends.  I.e. permit naming a device by (a unique abbreviation of) 
  its
  canonical address in addition to naming it by its user-defined ID.  
  For
  instance, something like
  
 device_del /pci/@1,1
  
 FWIW openbios allows this kind of abbreviation.
 
  in addition to
  
 device_del ID
  
  Open Firmware is a useful source of inspiration there, but should it
  come into conflict with usability, we should let usability win.
 
 --
   Gleb.


I think that the domain (PCI segment group), bus, slot, function way to
address pci devices is still the most familiar and the easiest to map to
   Most familiar to whom?
  
  The guests.
 Which one? There are many guests. Your favorite?
 
  For CLI, we need an easy way to map a device in guest to the
  device in qemu and back.
 Then use eth0, /dev/sdb, or even C:. Your way is not less broken since what
 you are saying is lets use name that guest assigned to a device. 

No I am saying let's use the name that our ACPI tables assigned.

  
   It looks like you identify yourself with most of
   qemu users, but if most qemu users are like you then qemu has not enough
   users :) Most users that consider themselves to be advanced may know
   what eth1 or /dev/sdb means. This doesn't mean we should provide
   device_del eth1 or device_add /dev/sdb command though. 
   
   More important is that domain (encoded as number like you used to)
   and bus number has no meaning from inside qemu.
   So while I said many
   times I don't care about exact CLI syntax to much it should make sense
   at least. It can use id to specify PCI bus in CLI like this:
   device_del pci.0:1.1. Or it can even use device id too like this:
   device_del pci.0:ide.0. Or it can use HW topology like in FO device
   path. But doing ah-hoc device enumeration inside qemu and then using it
   for CLI is not it.
   
functionality in the guests.  Qemu is buggy in the moment in that is
uses the bus addresses assigned by guest and not the ones in ACPI,
but that can be fixed.
   It looks like you confused ACPI _SEG for something it isn't.
  
  Maybe I did. This is what linux does:
  
  struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root
  *root)
  {
   struct acpi_device *device = root->device;
   int domain = root->segment;
   int busnum = root->secondary.start;
  
  And I think this is consistent with the spec.
  
 It means that one domain may include several host bridges.
 At that level
 domain is defined as something that have unique name for each device
 inside it thus no two buses in one 

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Gleb Natapov
On Sun, Nov 21, 2010 at 01:53:26PM +0200, Michael S. Tsirkin wrote:
   The guests.
  Which one? There are many guests. Your favorite?
  
   For CLI, we need an easy way to map a device in guest to the
   device in qemu and back.
  Then use eth0, /dev/sdb, or even C:. Your way is not less broken since what
  you are saying is lets use name that guest assigned to a device. 
 
 No I am saying let's use the name that our ACPI tables assigned.
 
ACPI does not assign any name. In the best case ACPI tables describe resources
used by a device. And not all guests qemu supports have support for ACPI. Qemu
even supports machine types that do not support ACPI.
 
   
   
It looks like you identify yourself with most of
qemu users, but if most qemu users are like you then qemu has not enough
users :) Most users that consider themselves to be advanced may know
what eth1 or /dev/sdb means. This doesn't mean we should provide
device_del eth1 or device_add /dev/sdb command though. 

More important is that domain (encoded as number like you used to)
and bus number has no meaning from inside qemu.
So while I said many
times I don't care about exact CLI syntax to much it should make sense
at least. It can use id to specify PCI bus in CLI like this:
device_del pci.0:1.1. Or it can even use device id too like this:
device_del pci.0:ide.0. Or it can use HW topology like in FO device
path. But doing ah-hoc device enumeration inside qemu and then using it
for CLI is not it.

 functionality in the guests.  Qemu is buggy in the moment in that is
 uses the bus addresses assigned by guest and not the ones in ACPI,
 but that can be fixed.
It looks like you confused ACPI _SEG for something it isn't.
   
   Maybe I did. This is what linux does:
   
   struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root
   *root)
   {
    struct acpi_device *device = root->device;
    int domain = root->segment;
    int busnum = root->secondary.start;
   
   And I think this is consistent with the spec.
   
  It means that one domain may include several host bridges.
  At that level
  domain is defined as something that have unique name for each device
  inside it thus no two buses in one segment/domain can have same bus
  number. This is what PCI spec tells you. 
 
 And that really is enough for CLI because all we need is locate the
 specific slot in a unique way.
 
At the qemu level we do not have bus numbers. They are assigned by a guest.
So inside a guest domain:bus:slot.func points you to a device, but
qemu itself does not enumerate buses.

  And this further shows that using domain as defined by guest is very
  bad idea. 
 
 As defined by ACPI, really.
 
ACPI is a part of the guest software and may not even be present in the
guest. How is it relevant?

The ACPI spec
says that a PCI segment group is a purely software concept managed by system
firmware. In fact one segment may include multiple PCI host bridges.
   
   It can't I think:
  Read _BBN definition:
   The _BBN object is located under a PCI host bridge and must be unique for
   every host bridge within a segment since it is the PCI bus number.
  
  Clearly above speaks about multiple host bridge within a segment.
 
 Yes, it looks like the firmware spec allows that.
It even has an explicit example that shows it.

 
 Multiple Host Bridges

 A platform may have multiple PCI Express or PCI-X host bridges. The base
 address for the MMCONFIG space for these host bridges may need to be
 allocated at different locations. In such cases, using MCFG table and _CBA
 method as defined in this section means that each of these host bridges
 must be in its own PCI Segment Group.
   
  This is not from ACPI spec,
 
 PCI Firmware Specification 3.0
 
  but without going too deep into it, the above paragraph talks about a
  particular case in which each host bridge must be in its own PCI Segment
  Group, which is definite proof that in other cases multiple host bridges
  can be in one segment group.
 
 I stand corrected. I think you are right. But note that if they are,
 they must have distinct bus numbers assigned by ACPI.
ACPI does not assign any numbers. The BIOS enumerates buses and assigns
numbers. ACPI, in the base case, describes to the OSPM what the BIOS did. Qemu
sits one layer below all this and does not enumerate PCI buses. Even if we made
it do so, there is no way to guarantee that the guest will enumerate them
in the same order, since there is more than one way to do enumeration. I
have repeated this to you numerous times already.
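To illustrate the order-dependence point, here is a minimal sketch (purely hypothetical C, not from any BIOS or qemu source) of the depth-first walk firmware typically uses to assign secondary/subordinate bus numbers; the resulting numbers depend entirely on traversal order, so different firmware can legitimately produce different numberings for the same hardware:

```c
/* Illustrative only: a possible depth-first bus-number assignment.
 * The struct and function names are made up for this sketch. */
struct bridge {
    struct bridge *children;   /* bridges found behind this one */
    struct bridge *next;       /* sibling bridges on the same bus */
    int secondary;             /* bus number assigned to the downstream bus */
    int subordinate;           /* highest bus number behind this bridge */
};

static int assign_bus_numbers(struct bridge *b, int next_bus)
{
    b->secondary = next_bus++;
    for (struct bridge *c = b->children; c; c = c->next)
        next_bus = assign_bus_numbers(c, next_bus);  /* order-dependent */
    b->subordinate = next_bus - 1;
    return next_bus;
}
```

Reordering the sibling list alone changes every number behind it, which is why a number chosen this way cannot serve as a stable name.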

 
   
_SEG
is not what the OSPM uses to tie a HW resource to an ACPI resource. It uses _CRS
(Current Resource Settings) for that, just like OF. No surprise there.
   
   OSPM uses both I think.
   
   All I see linux do with CRS is get the bus number range.
  So lets assume that HW has two PCI host bridges and ACPI has:
  Device(PCI0) {
  Name 

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Michael S. Tsirkin
On Sun, Nov 21, 2010 at 02:50:14PM +0200, Gleb Natapov wrote:
 On Sun, Nov 21, 2010 at 01:53:26PM +0200, Michael S. Tsirkin wrote:
The guests.
   Which one? There are many guests. Your favorite?
   
For CLI, we need an easy way to map a device in guest to the
device in qemu and back.
   Then use eth0, /dev/sdb, or even C:. Your way is not less broken since 
   what
   you are saying is lets use name that guest assigned to a device. 
  
  No I am saying let's use the name that our ACPI tables assigned.
  
 ACPI does not assign any name. In a best case ACPI tables describe resources
 used by a device.

Not only that: bus number and segment aren't resources as such.
They describe addressing.

 And not all guests qemu supports has support for ACPI. Qemu
 even support machines types that do not support ACPI.

So? Different machines - different names.



 It looks like you identify yourself with most of
 qemu users, but if most qemu users are like you then qemu has not 
 enough
 users :) Most users that consider themselves to be advanced may know
 what eth1 or /dev/sdb means. This doesn't mean we should provide
 device_del eth1 or device_add /dev/sdb command though. 
 
 More important is that domain (encoded as number like you used to)
 and bus number has no meaning from inside qemu.
 So while I said many
 times I don't care about exact CLI syntax to much it should make sense
 at least. It can use id to specify PCI bus in CLI like this:
 device_del pci.0:1.1. Or it can even use device id too like this:
 device_del pci.0:ide.0. Or it can use HW topology like in FO device
 path. But doing ah-hoc device enumeration inside qemu and then using 
 it
 for CLI is not it.
 
  functionality in the guests.  Qemu is buggy in the moment in that is
  uses the bus addresses assigned by guest and not the ones in ACPI,
  but that can be fixed.
 It looks like you confused ACPI _SEG for something it isn't.

Maybe I did. This is what linux does:

struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root
*root)
{
struct acpi_device *device = root->device;
int domain = root->segment;
int busnum = root->secondary.start;

And I think this is consistent with the spec.

   It means that one domain may include several host bridges.
   At that level
   domain is defined as something that have unique name for each device
   inside it thus no two buses in one segment/domain can have same bus
   number. This is what PCI spec tells you. 
  
  And that really is enough for CLI because all we need is locate the
  specific slot in a unique way.
  
 At qemu level we do not have bus numbers. They are assigned by a guest.
 So inside a guest domain:bus:slot.func points you to a device, but in
 qemu does not enumerate buses.
 
   And this further shows that using domain as defined by guest is very
   bad idea. 
  
  As defined by ACPI, really.
  
 ACPI is a part of a guest software that may not event present in the
 guest. How is it relevant?

It's relevant because this is what guests use. To access the root
device with cf8/cfc you need to know the bus number assigned to it
by firmware. How that was assigned is of interest to BIOS/ACPI but not
really interesting to the user or, I suspect, guest OS.
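For reference, this is how the CONFIG_ADDRESS dword written to I/O port 0xCF8 encodes that firmware-assigned bus number (a minimal sketch; pci_cfg_addr is an illustrative name, not a qemu or Linux function):

```c
#include <stdint.h>

/* Legacy type-1 PCI config access: bit 31 enables the cycle, bits 23:16
 * carry the bus number -- which is why a guest must know the number the
 * firmware assigned before it can reach the device at all. */
static uint32_t pci_cfg_addr(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t reg)
{
    return 0x80000000u             /* enable bit */
         | ((uint32_t)bus << 16)   /* firmware/guest-assigned bus number */
         | ((uint32_t)(dev & 0x1f) << 11)
         | ((uint32_t)(fn  & 0x07) << 8)
         | (reg & 0xfc);           /* dword-aligned register offset */
}
```

The value is written to 0xCF8 and the register data is then read or written through 0xCFC.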

 ACPI spec
 says that PCI segment group is purely software concept managed by 
 system
 firmware. In fact one segment may include multiple PCI host bridges.

It can't I think:
   Read _BBN definition:
The _BBN object is located under a PCI host bridge and must be unique for
every host bridge within a segment since it is the PCI bus number.
   
   Clearly above speaks about multiple host bridge within a segment.
  
  Yes, it looks like the firmware spec allows that.
 It even have explicit example that shows it.
 
  
Multiple Host Bridges

A platform may have multiple PCI Express or PCI-X host bridges. 
The base
address for the
MMCONFIG space for these host bridges may need to be allocated 
at
different locations. In such
cases, using MCFG table and _CBA method as defined in this 
section means
that each of these host
bridges must be in its own PCI Segment Group.

   This is not from ACPI spec,
  
  PCI Firmware Specification 3.0
  
   but without going to deep into it above
   paragraph talks about some particular case when each host bridge must
   be in its own PCI Segment Group with is a definite prove that in other
   cases multiple host bridges can be in on segment group.
  
  I stand corrected. I think you are right. But note that if they are,
  they must have distinct bus numbers assigned by ACPI.
 ACPI does not assign any numbers.

For all root pci devices the firmware must supply a _BBN number. This is the
bus number, isn't it? 

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Gleb Natapov
On Sun, Nov 21, 2010 at 04:48:44PM +0200, Michael S. Tsirkin wrote:
 On Sun, Nov 21, 2010 at 02:50:14PM +0200, Gleb Natapov wrote:
  On Sun, Nov 21, 2010 at 01:53:26PM +0200, Michael S. Tsirkin wrote:
 The guests.
Which one? There are many guests. Your favorite?

 For CLI, we need an easy way to map a device in guest to the
 device in qemu and back.
Then use eth0, /dev/sdb, or even C:. Your way is not less broken since 
what
you are saying is lets use name that guest assigned to a device. 
   
   No I am saying let's use the name that our ACPI tables assigned.
   
  ACPI does not assign any name. In a best case ACPI tables describe resources
  used by a device.
 
 Not only that. bus number and segment aren't resources as such.
 They describe addressing.
 
  And not all guests qemu supports has support for ACPI. Qemu
  even support machines types that do not support ACPI.
 
 So? Different machines - different names.
 
You want to have a different CLI for each type of machine qemu
supports?

 
 
  It looks like you identify yourself with most of
  qemu users, but if most qemu users are like you then qemu has not 
  enough
  users :) Most users that consider themselves to be advanced may 
  know
  what eth1 or /dev/sdb means. This doesn't mean we should provide
  device_del eth1 or device_add /dev/sdb command though. 
  
  More important is that domain (encoded as number like you used to)
  and bus number has no meaning from inside qemu.
  So while I said many
  times I don't care about exact CLI syntax to much it should make 
  sense
  at least. It can use id to specify PCI bus in CLI like this:
  device_del pci.0:1.1. Or it can even use device id too like this:
  device_del pci.0:ide.0. Or it can use HW topology like in FO device
  path. But doing ah-hoc device enumeration inside qemu and then 
  using it
  for CLI is not it.
  
   functionality in the guests.  Qemu is buggy in the moment in that 
   is
   uses the bus addresses assigned by guest and not the ones in ACPI,
   but that can be fixed.
  It looks like you confused ACPI _SEG for something it isn't.
 
 Maybe I did. This is what linux does:
 
 struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root
 *root)
 {
 struct acpi_device *device = root->device;
 int domain = root->segment;
 int busnum = root->secondary.start;
 
 And I think this is consistent with the spec.
 
It means that one domain may include several host bridges.
At that level
domain is defined as something that have unique name for each device
inside it thus no two buses in one segment/domain can have same bus
number. This is what PCI spec tells you. 
   
   And that really is enough for CLI because all we need is locate the
   specific slot in a unique way.
   
  At qemu level we do not have bus numbers. They are assigned by a guest.
  So inside a guest domain:bus:slot.func points you to a device, but in
  qemu does not enumerate buses.
  
And this further shows that using domain as defined by guest is very
bad idea. 
   
   As defined by ACPI, really.
   
  ACPI is a part of a guest software that may not event present in the
  guest. How is it relevant?
 
 It's relevant because this is what guests use. To access the root
 device with cf8/cfc you need to know the bus number assigned to it
 by firmware. How that was assigned is of interest to BIOS/ACPI but not
 really interesting to the user or, I suspect, guest OS.
 
Of course this is incorrect. The OS can re-enumerate PCI if it wishes; Linux
has a command-line option just for that. And saying that ACPI is relevant because
it is what guest software uses, in reply to a sentence stating that not all
guests even use ACPI, is, well, strange.

And ACPI describes only the HW present at boot time. What if you
hot-plugged a root pci bridge? How does the non-existent PCI naming help you?

  ACPI spec
  says that PCI segment group is purely software concept managed by 
  system
  firmware. In fact one segment may include multiple PCI host bridges.
 
 It can't I think:
Read _BBN definition:
 The _BBN object is located under a PCI host bridge and must be unique 
for
 every host bridge within a segment since it is the PCI bus number.

Clearly above speaks about multiple host bridge within a segment.
   
   Yes, it looks like the firmware spec allows that.
  It even have explicit example that shows it.
  
   
   Multiple Host Bridges
 
   A platform may have multiple PCI Express or PCI-X host bridges. 
 The base
   address for the
   MMCONFIG space for these host bridges may need to be allocated 
 at
   different locations. In such
   cases, using MCFG table and _CBA method as defined in this 
 section means
  

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Michael S. Tsirkin
On Sun, Nov 21, 2010 at 06:01:11PM +0200, Gleb Natapov wrote:
 On Sun, Nov 21, 2010 at 04:48:44PM +0200, Michael S. Tsirkin wrote:
  On Sun, Nov 21, 2010 at 02:50:14PM +0200, Gleb Natapov wrote:
   On Sun, Nov 21, 2010 at 01:53:26PM +0200, Michael S. Tsirkin wrote:
  The guests.
 Which one? There are many guests. Your favorite?
 
  For CLI, we need an easy way to map a device in guest to the
  device in qemu and back.
 Then use eth0, /dev/sdb, or even C:. Your way is not less broken 
 since what
 you are saying is lets use name that guest assigned to a device. 

No I am saying let's use the name that our ACPI tables assigned.

   ACPI does not assign any name. In a best case ACPI tables describe 
   resources
   used by a device.
  
  Not only that. bus number and segment aren't resources as such.
  They describe addressing.
  
   And not all guests qemu supports has support for ACPI. Qemu
   even support machines types that do not support ACPI.
  
  So? Different machines - different names.
  
 You want to have different cli for different type of machines qemu
 supports?

Different device names.

  
  
   It looks like you identify yourself with most of
   qemu users, but if most qemu users are like you then qemu has not 
   enough
   users :) Most users that consider themselves to be advanced may 
   know
   what eth1 or /dev/sdb means. This doesn't mean we should provide
   device_del eth1 or device_add /dev/sdb command though. 
   
   More important is that domain (encoded as number like you used 
   to)
   and bus number has no meaning from inside qemu.
   So while I said many
   times I don't care about exact CLI syntax to much it should make 
   sense
   at least. It can use id to specify PCI bus in CLI like this:
   device_del pci.0:1.1. Or it can even use device id too like this:
   device_del pci.0:ide.0. Or it can use HW topology like in FO 
   device
   path. But doing ah-hoc device enumeration inside qemu and then 
   using it
   for CLI is not it.
   
functionality in the guests.  Qemu is buggy in the moment in 
that is
uses the bus addresses assigned by guest and not the ones in 
ACPI,
but that can be fixed.
   It looks like you confused ACPI _SEG for something it isn't.
  
  Maybe I did. This is what linux does:
  
  struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root
  *root)
  {
  struct acpi_device *device = root->device;
  int domain = root->segment;
  int busnum = root->secondary.start;
  
  And I think this is consistent with the spec.
  
 It means that one domain may include several host bridges.
 At that level
 domain is defined as something that have unique name for each device
 inside it thus no two buses in one segment/domain can have same bus
 number. This is what PCI spec tells you. 

And that really is enough for CLI because all we need is locate the
specific slot in a unique way.

   At qemu level we do not have bus numbers. They are assigned by a guest.
   So inside a guest domain:bus:slot.func points you to a device, but in
   qemu does not enumerate buses.
   
 And this further shows that using domain as defined by guest is very
 bad idea. 

As defined by ACPI, really.

   ACPI is a part of a guest software that may not event present in the
   guest. How is it relevant?
  
  It's relevant because this is what guests use. To access the root
  device with cf8/cfc you need to know the bus number assigned to it
  by firmware. How that was assigned is of interest to BIOS/ACPI but not
  really interesting to the user or, I suspect, guest OS.
  
 Of course this is incorrect. OS can re-enumerate PCI if it wishes. Linux
 have cmd just for that.

I haven't looked, but I suspect linux will simply assume cf8/cfc and
start doing it from there. If that doesn't get you the root
device you wanted, tough.

 And saying that ACPI is relevant because this is
 what guest software use in a reply to sentence that states that not all
 guest even use ACPI is, well, strange.
 
 And ACPI describes only HW that present at boot time. What if you
 hot-plugged root pci bridge? How non existent PCI naming helps you?

That's described by ACPI as well.

   ACPI spec
   says that PCI segment group is purely software concept managed by 
   system
   firmware. In fact one segment may include multiple PCI host 
   bridges.
  
  It can't I think:
 Read _BBN definition:
  The _BBN object is located under a PCI host bridge and must be 
 unique for
  every host bridge within a segment since it is the PCI bus number.
 
 Clearly above speaks about multiple host bridge within a segment.

Yes, it looks like the firmware spec allows that.

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Gleb Natapov
On Sun, Nov 21, 2010 at 06:38:31PM +0200, Michael S. Tsirkin wrote:
 On Sun, Nov 21, 2010 at 06:01:11PM +0200, Gleb Natapov wrote:
  On Sun, Nov 21, 2010 at 04:48:44PM +0200, Michael S. Tsirkin wrote:
   On Sun, Nov 21, 2010 at 02:50:14PM +0200, Gleb Natapov wrote:
On Sun, Nov 21, 2010 at 01:53:26PM +0200, Michael S. Tsirkin wrote:
   The guests.
  Which one? There are many guests. Your favorite?
  
   For CLI, we need an easy way to map a device in guest to the
   device in qemu and back.
  Then use eth0, /dev/sdb, or even C:. Your way is not less broken 
  since what
  you are saying is lets use name that guest assigned to a device. 
 
 No I am saying let's use the name that our ACPI tables assigned.
 
ACPI does not assign any name. In a best case ACPI tables describe 
resources
used by a device.
   
   Not only that. bus number and segment aren't resources as such.
   They describe addressing.
   
And not all guests qemu supports has support for ACPI. Qemu
even support machines types that do not support ACPI.
   
   So? Different machines - different names.
   
  You want to have different cli for different type of machines qemu
  supports?
 
 Different device names.
 
You mean that on qemu-sparc you will use a different device naming
scheme for deleting a PCI device?!

   
   
It looks like you identify yourself with most of
qemu users, but if most qemu users are like you then qemu has 
not enough
users :) Most users that consider themselves to be advanced 
may know
what eth1 or /dev/sdb means. This doesn't mean we should provide
device_del eth1 or device_add /dev/sdb command though. 

More important is that domain (encoded as number like you 
used to)
and bus number has no meaning from inside qemu.
So while I said many
times I don't care about exact CLI syntax to much it should 
make sense
at least. It can use id to specify PCI bus in CLI like this:
device_del pci.0:1.1. Or it can even use device id too like 
this:
device_del pci.0:ide.0. Or it can use HW topology like in FO 
device
path. But doing ah-hoc device enumeration inside qemu and then 
using it
for CLI is not it.

 functionality in the guests.  Qemu is buggy in the moment in 
 that is
 uses the bus addresses assigned by guest and not the ones in 
 ACPI,
 but that can be fixed.
It looks like you confused ACPI _SEG for something it isn't.
   
   Maybe I did. This is what linux does:
   
   struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root
   *root)
   {
   struct acpi_device *device = root->device;
   int domain = root->segment;
   int busnum = root->secondary.start;
   
   And I think this is consistent with the spec.
   
  It means that one domain may include several host bridges.
  At that level
  domain is defined as something that have unique name for each device
  inside it thus no two buses in one segment/domain can have same bus
  number. This is what PCI spec tells you. 
 
 And that really is enough for CLI because all we need is locate the
 specific slot in a unique way.
 
At qemu level we do not have bus numbers. They are assigned by a guest.
So inside a guest domain:bus:slot.func points you to a device, but in
qemu does not enumerate buses.

  And this further shows that using domain as defined by guest is 
  very
  bad idea. 
 
 As defined by ACPI, really.
 
ACPI is a part of a guest software that may not event present in the
guest. How is it relevant?
   
   It's relevant because this is what guests use. To access the root
   device with cf8/cfc you need to know the bus number assigned to it
   by firmware. How that was assigned is of interest to BIOS/ACPI but not
   really interesting to the user or, I suspect, guest OS.
   
  Of course this is incorrect. OS can re-enumerate PCI if it wishes. Linux
  have cmd just for that.
 
 I haven't looked but I suspect linux will simply assume cf8/cfc and
 and start doing it from there. If that doesn't get you the root
 device you wanted, tough.
 
Linux runs on many platforms, and on most of them there is no such thing
as IO space. Also, I believe Linux on x86 supported multi-root without
mmconfig in the past. It may be enough to describe in ACPI the IO resource
the second pci host bridge uses for Linux to use it.

  And saying that ACPI is relevant because this is
  what guest software use in a reply to sentence that states that not all
  guest even use ACPI is, well, strange.
  
  And ACPI describes only HW that present at boot time. What if you
  hot-plugged root pci bridge? How non existent PCI naming helps you?
 
 that's described by 

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Michael S. Tsirkin
On Sun, Nov 21, 2010 at 06:01:11PM +0200, Gleb Natapov wrote:
  This FW is given to guest by qemu. It only assigns bus numbers
  because qemu told it to do so.
 Seabios is just a guest qemu ships. There are other FW for qemu.

We don't support them though, do we?

 Bochs
 bios, openfirmware, efi. All of them where developed outside of qemu
 project and all of them are usable without qemu. You can't consider them
 be part of qemu any more then Linux/Windows with virtio drivers.

You can also burn linuxbios onto your motherboard. If you do you
void your warranty.


  

  And the spec says, e.g.:
  
the memory mapped configuration base
  address (always corresponds to bus number 0) for the PCI 
  Segment Group
  of the host bridge is provided by _CBA and the bus range 
  covered by the
  base address is indicated by the corresponding bus range 
  specified in
  _CRS.
  
 Don't see how it is relevant. And _CBA is defined only for PCI 
 Express. Lets
 solve the problem for PCI first and then move to PCI Express. Jumping 
 from one
 to another destruct us from main discussion.

I think this is what confuses us.  As long as you are using cf8/cfc there's no
concept of a domain really.
Thus:
/p...@i0cf8

is probably enough for BIOS boot, because we'll need to make root bus numbers
unique for legacy guests/option ROMs.  But this is not a hardware requirement
and might become easier to ignore with EFI.

   You do not need MMCONFIG to have multiple PCI domains. You can have one
   configured via standard cf8/cfc and another one on ef8/efc and one more
   at mmio fce0 and you can address all of them:
   /p...@i0cf8
   /p...@i0ef8
   /p...@fce0

Isn't the mmio one relocatable?

   
   And each one of those PCI domains can have 256 subbridges.
  
  Will common guests such as windows or linux be able to use them? This
 With proper drivers yes. There is HW with more then one PCI bus and I
 think qemu emulates some of it (PPC MAC for instance). 

MAC is probably just poking in a couple of predefined places.  That's
not enumeration.

  seems to be outside the scope of the PCI Firmware specification, which
  says that bus numbers must be unique.
 They must be unique per PCI segment group.

We've come full circle, haven't we? I am saying we should let users
specify the PCI Segment group+bus, as opposed to the io port, which they
don't use.

  
  

That should be enough for e.g. device_del. We do have the need 
to
describe the topology when we interface with firmware, e.g. to 
describe
the ACPI tables themselves to qemu (this is what Gleb's patches 
deal
with), but that's probably the only case.

   Describing HW topology is the only way to unambiguously describe 
   device to
   something or someone outside qemu and have persistent device 
   naming
   between different HW configuration.
  
  Not really, since ACPI is a binary blob programmed by qemu.
  
 ACPI is part of the guest, not qemu.

Yes it runs in the guest but it's generated by qemu. On real hardware,
it's supplied by the motherboard.

   It is not generated by qemu. Parts of it depend on HW and other part 
   depend
   on how BIOS configure HW. _BBN for instance is clearly defined to return
   address assigned bu the BIOS.
  
  BIOS is supplied on the motherboard and in our case by qemu as well.
 You can replace MB bios by coreboot+seabios on some of them.
 Manufacturer don't want you to do it and make it hard to do, but
 otherwise this is just software, not some magic dust.

They support common hardware but not all features will automatically work.

  There's no standard way for BIOS to assign bus number to the pci root,
  so it does it in device-specific way. Why should a management tool
  or a CLI user care about these? As far as they are concerned
  we could use some PV scheme to find root devices and assign bus
  numbers, and it would be exactly the same.
  
 Go write KVM userspace that does that. AFAIK there is project out there
 that tries to do that. No luck so far. Your world view is very x86/Linux
 centric. You need to broaden it a little bit. Next time you propose
 something ask yourself will it work with qemu-sparc, qemu-ppc, qemu-amd.

Your view is very qemu centric. Ask yourself whether what you propose
will work with libvirt. Better yet, ask libvirt developers.

 Just saying not really doesn't
 prove much. I still haven't seen any proposition from you that 
 actually
 solve the problem. No, lets use guest naming is not it. There is no
 such thing as The Guest. 
 
 --
   Gleb.

I am sorry if I didn't make this clear.  I think we should use the 
domain:bus
pair to name the root device. As these are unique 

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Gleb Natapov
On Sun, Nov 21, 2010 at 08:22:03PM +0200, Michael S. Tsirkin wrote:
 On Sun, Nov 21, 2010 at 06:01:11PM +0200, Gleb Natapov wrote:
   This FW is given to guest by qemu. It only assigns bus numbers
   because qemu told it to do so.
  Seabios is just a guest qemu ships. There are other FW for qemu.
 
 We don't support them though, do we?
 
Who is "we"? As the qemu community we support any guest code.

  Bochs
  bios, openfirmware, efi. All of them where developed outside of qemu
  project and all of them are usable without qemu. You can't consider them
  be part of qemu any more then Linux/Windows with virtio drivers.
 
 You can also burn linuxbios onto your motherboard. If you do you
 void your warranty.
And? What is your point?

 
 
   
 
   And the spec says, e.g.:
   
   the memory mapped configuration base
 address (always corresponds to bus number 0) for the PCI 
   Segment Group
 of the host bridge is provided by _CBA and the bus range 
   covered by the
 base address is indicated by the corresponding bus range 
   specified in
 _CRS.
   
  Don't see how it is relevant. And _CBA is defined only for PCI 
  Express. Lets
  solve the problem for PCI first and then move to PCI Express. 
  Jumping from one
  to another destruct us from main discussion.
 
 I think this is what confuses us.  As long as you are using cf8/cfc 
 there's no
 concept of a domain really.
 Thus:
   /p...@i0cf8
 
 is probably enough for BIOS boot because we'll need to make root bus 
 numbers
 unique for legacy guests/option ROMs.  But this is not a hardware 
 requirement
 and might become easier to ignore eith EFI.
 
You do not need MMCONFIG to have multiple PCI domains. You can have one
configured via standard cf8/cfc and another one on ef8/efc and one more
at mmio fce0 and you can address all of them:
/p...@i0cf8
/p...@i0ef8
/p...@fce0
 
 Isn't the mmio one relocatable?
 
No. If it is, there is a way to specify so.



And each one of those PCI domains can have 256 subbridges.
   
   Will common guests such as windows or linux be able to use them? This
  With proper drivers yes. There is HW with more then one PCI bus and I
  think qemu emulates some of it (PPC MAC for instance). 
 
 MAC is probably just poking in a couple of predefined places.  That's
 not enumeration.
 
The system bus is not enumerable on a PC either. That is why you need ACPI to
describe resources, or you just poke into predefined places (0cf8-0cff in
the case of PCI).

   seems to be outside the scope of the PCI Firmware specification, which
   says that bus numbers must be unique.
  They must be unique per PCI segment group.
 
 We've come full circle, didn't we? i am saying we should let users
 specify PCI Segment group+bus as opposed to the io port, which they
 don't use.
 
When qemu creates the device model there is no bus number assigned to
pci bridges yet, and thus bus numbers are meaningless for qemu.  A PCI Segment
group has no HW meaning at all (you seem to ignore this completely). So
let's say you run qemu with -S and want to delete or add a device. Where do
you get the bus number and PCI Segment group number to specify it?
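The point that bus numbers live only in guest-written config space can be sketched like this (simplified hypothetical structures, not qemu's actual PCIDevice; only the PCI_SECONDARY_BUS offset 0x19 comes from the PCI-to-PCI bridge spec):

```c
#include <stdint.h>

#define PCI_SECONDARY_BUS 0x19   /* standard offset in a type-1 bridge header */

/* Illustrative sketch: qemu-side state holds only topology (parent pointer
 * and devfn).  Any bus number exists solely as a byte in the parent bridge's
 * config space, written there by the guest; before the guest runs (e.g. with
 * -S) that byte is still 0 for every bridge. */
struct pci_dev {
    struct pci_dev *parent_bridge;  /* NULL for a device on the root bus */
    uint8_t devfn;                  /* slot/function: the only stable address */
    uint8_t config[256];            /* config space, programmed by the guest */
};

/* The only "bus number" that exists is whatever the guest stored. */
static uint8_t guest_bus_number(const struct pci_dev *d)
{
    return d->parent_bridge
         ? d->parent_bridge->config[PCI_SECONDARY_BUS]
         : 0;
}
```

Asking the user for a bus number at -S time therefore means asking for a value nothing has assigned yet.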

   
   
 
 That should be enough for e.g. device_del. We do have the 
 need to
 describe the topology when we interface with firmware, e.g. 
 to describe
 the ACPI tables themselves to qemu (this is what Gleb's 
 patches deal
 with), but that's probably the only case.
 
Describing HW topology is the only way to unambiguously 
describe device to
something or someone outside qemu and have persistent device 
naming
between different HW configuration.
   
   Not really, since ACPI is a binary blob programmed by qemu.
   
  ACPI is part of the guest, not qemu.
 
 Yes it runs in the guest but it's generated by qemu. On real hardware,
 it's supplied by the motherboard.
 
It is not generated by qemu. Parts of it depend on HW and other part 
depend
on how BIOS configure HW. _BBN for instance is clearly defined to return
address assigned bu the BIOS.
   
   BIOS is supplied on the motherboard and in our case by qemu as well.
  You can replace MB bios by coreboot+seabios on some of them.
  Manufacturer don't want you to do it and make it hard to do, but
  otherwise this is just software, not some magic dust.
 
 They support common hardware but not all features will automatically work.
So what? The same can be said about Linux on a lot of HW.

 
   There's no standard way for BIOS to assign bus number to the pci root,
   so it does it in device-specific way. Why should a management tool
   or a CLI user care about these? As far as they are concerned
   we could use some PV scheme to find root devices and assign bus
   numbers, and it 

Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Michael S. Tsirkin
On Sun, Nov 21, 2010 at 09:29:59PM +0200, Gleb Natapov wrote:
  We can control the bus numbers that the BIOS will assign to pci root.
 And if you need to insert device that sits behind pci-to-pci bridge? Do
 you want to control that bus address too? And bus number do not unambiguously
 describe pci root since two roots can have same bus number just fine.

They have to have different segment numbers.

  It's probably required to make them stable anyway.
  
 Why?

To avoid bus renumbering on reboot after you add a pci-to-pci bridge.

-- 
MST



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-21 Thread Gleb Natapov
On Sun, Nov 21, 2010 at 10:39:31PM +0200, Michael S. Tsirkin wrote:
 On Sun, Nov 21, 2010 at 09:29:59PM +0200, Gleb Natapov wrote:
   We can control the bus numbers that the BIOS will assign to pci root.
  And if you need to insert a device that sits behind a pci-to-pci bridge? Do
  you want to control that bus address too? And bus numbers do not
  unambiguously
  describe a pci root, since two roots can have the same bus number just fine.
 
 They have to have different segment numbers.
 
AKA PCI domains AKA one more number assigned by a guest.

   It's probably required to make them stable anyway.
   
  Why?
 
 To avoid bus renumbering on reboot after you add a pci-to-pci bridge.
 
Why should qemu care?

--
Gleb.



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-20 Thread Michael S. Tsirkin
On Fri, Nov 19, 2010 at 10:38:42PM +0200, Gleb Natapov wrote:
 On Fri, Nov 19, 2010 at 06:02:58PM +0100, Markus Armbruster wrote:
  Michael S. Tsirkin m...@redhat.com writes:
  
   On Tue, Nov 09, 2010 at 11:41:43AM +0900, Isaku Yamahata wrote:
   On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin wrote:
Replace bus number with slot numbers of parent bridges up to the root.
This works for root bridge in a compatible way because bus number there
is hard-coded to 0.
IMO nested bridges are broken anyway, no way to be compatible there.


Gleb, Markus, I think the following should be sufficient for PCI.  What
do you think?  Also - do we need to update QMP/monitor to teach them to
work with these paths?

This is on top of Alex's patch, completely untested.


pci: fix device path for devices behind nested bridges

We were using bus number in the device path, which is clearly
broken as this number is guest-assigned for all devices
except the root.

Fix by using hierarchical list of slots, walking the path
from root down to device, instead. Add :00 as bus number
so that if there are no nested bridges, this is compatible
with what we have now.
   
   This format, Domain:00:Slot:Slot:Slot.Function, doesn't work
   because a pci-to-pci bridge is a pci function.
   So the format should be
   Domain:00:Slot.Function:Slot.Function:Slot.Function
   
   thanks,
  
   Hmm, interesting. If we do this we aren't backwards compatible
   though, so maybe we could try using openfirmware paths, just as well.
  
  Whatever we do, we need to make it work for all (qdevified) devices and
  buses.
  
  It should also be possible to use canonical addressing with device_add and
  friends.  I.e. permit naming a device by (a unique abbreviation of) its
  canonical address in addition to naming it by its user-defined ID.  For
  instance, something like
  
 device_del /pci/@1,1
  
 FWIW openbios allows this kind of abbreviation.
 
  in addition to
  
 device_del ID
  
  Open Firmware is a useful source of inspiration there, but should it
  come into conflict with usability, we should let usability win.
 
 --
   Gleb.


I think that the domain (PCI segment group), bus, slot, function way to
address pci devices is still the most familiar and the easiest to map to
functionality in the guests.  Qemu is buggy at the moment in that it
uses the bus addresses assigned by the guest and not the ones in ACPI,
but that can be fixed.

That should be enough for e.g. device_del. We do have the need to
describe the topology when we interface with firmware, e.g. to describe
the ACPI tables themselves to qemu (this is what Gleb's patches deal
with), but that's probably the only case.

-- 
MST



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-19 Thread Markus Armbruster
Michael S. Tsirkin m...@redhat.com writes:

 On Tue, Nov 09, 2010 at 11:41:43AM +0900, Isaku Yamahata wrote:
 On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin wrote:
  Replace bus number with slot numbers of parent bridges up to the root.
  This works for root bridge in a compatible way because bus number there
  is hard-coded to 0.
  IMO nested bridges are broken anyway, no way to be compatible there.
  
  
  Gleb, Markus, I think the following should be sufficient for PCI.  What
  do you think?  Also - do we need to update QMP/monitor to teach them to
  work with these paths?
  
  This is on top of Alex's patch, completely untested.
  
  
  pci: fix device path for devices behind nested bridges
  
  We were using bus number in the device path, which is clearly
  broken as this number is guest-assigned for all devices
  except the root.
  
  Fix by using hierarchical list of slots, walking the path
  from root down to device, instead. Add :00 as bus number
  so that if there are no nested bridges, this is compatible
  with what we have now.
 
 This format, Domain:00:Slot:Slot:Slot.Function, doesn't work
 because a pci-to-pci bridge is a pci function.
 So the format should be
 Domain:00:Slot.Function:Slot.Function:Slot.Function
 
 thanks,

 Hmm, interesting. If we do this we aren't backwards compatible
 though, so maybe we could try using openfirmware paths, just as well.

Whatever we do, we need to make it work for all (qdevified) devices and
buses.

It should also be possible to use canonical addressing with device_add and
friends.  I.e. permit naming a device by (a unique abbreviation of) its
canonical address in addition to naming it by its user-defined ID.  For
instance, something like

   device_del /pci/@1,1

in addition to

   device_del ID

Open Firmware is a useful source of inspiration there, but should it
come into conflict with usability, we should let usability win.



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-19 Thread Gleb Natapov
On Fri, Nov 19, 2010 at 06:02:58PM +0100, Markus Armbruster wrote:
 Michael S. Tsirkin m...@redhat.com writes:
 
  On Tue, Nov 09, 2010 at 11:41:43AM +0900, Isaku Yamahata wrote:
  On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin wrote:
   Replace bus number with slot numbers of parent bridges up to the root.
   This works for root bridge in a compatible way because bus number there
   is hard-coded to 0.
   IMO nested bridges are broken anyway, no way to be compatible there.
   
   
   Gleb, Markus, I think the following should be sufficient for PCI.  What
   do you think?  Also - do we need to update QMP/monitor to teach them to
   work with these paths?
   
   This is on top of Alex's patch, completely untested.
   
   
   pci: fix device path for devices behind nested bridges
   
   We were using bus number in the device path, which is clearly
   broken as this number is guest-assigned for all devices
   except the root.
   
   Fix by using hierarchical list of slots, walking the path
   from root down to device, instead. Add :00 as bus number
   so that if there are no nested bridges, this is compatible
   with what we have now.
  
  This format, Domain:00:Slot:Slot:Slot.Function, doesn't work
  because a pci-to-pci bridge is a pci function.
  So the format should be
  Domain:00:Slot.Function:Slot.Function:Slot.Function
  
  thanks,
 
  Hmm, interesting. If we do this we aren't backwards compatible
  though, so maybe we could try using openfirmware paths, just as well.
 
 Whatever we do, we need to make it work for all (qdevified) devices and
 buses.
 
 It should also be possible to use canonical addressing with device_add and
 friends.  I.e. permit naming a device by (a unique abbreviation of) its
 canonical address in addition to naming it by its user-defined ID.  For
 instance, something like
 
device_del /pci/@1,1
 
FWIW openbios allows this kind of abbreviation.

 in addition to
 
device_del ID
 
 Open Firmware is a useful source of inspiration there, but should it
 come into conflict with usability, we should let usability win.

--
Gleb.



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-09 Thread Michael S. Tsirkin
On Tue, Nov 09, 2010 at 11:41:43AM +0900, Isaku Yamahata wrote:
 On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin wrote:
  Replace bus number with slot numbers of parent bridges up to the root.
  This works for root bridge in a compatible way because bus number there
  is hard-coded to 0.
  IMO nested bridges are broken anyway, no way to be compatible there.
  
  
  Gleb, Markus, I think the following should be sufficient for PCI.  What
  do you think?  Also - do we need to update QMP/monitor to teach them to
  work with these paths?
  
  This is on top of Alex's patch, completely untested.
  
  
  pci: fix device path for devices behind nested bridges
  
  We were using bus number in the device path, which is clearly
  broken as this number is guest-assigned for all devices
  except the root.
  
  Fix by using hierarchical list of slots, walking the path
  from root down to device, instead. Add :00 as bus number
  so that if there are no nested bridges, this is compatible
  with what we have now.
 
 This format, Domain:00:Slot:Slot:Slot.Function, doesn't work
 because a pci-to-pci bridge is a pci function.
 So the format should be
 Domain:00:Slot.Function:Slot.Function:Slot.Function
 
 thanks,

Hmm, interesting. If we do this we aren't backwards compatible
though, so maybe we could try using openfirmware paths, just as well.

  
  Signed-off-by: Michael S. Tsirkin m...@redhat.com
  
  diff --git a/hw/pci.c b/hw/pci.c
  index 7d12473..fa98d94 100644
  --- a/hw/pci.c
  +++ b/hw/pci.c
  @@ -1826,13 +1826,45 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
   
   static char *pcibus_get_dev_path(DeviceState *dev)
   {
  -PCIDevice *d = (PCIDevice *)dev;
  -char path[16];
  -
  -snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
  - pci_find_domain(d->bus), pci_bus_num(d->bus),
  - PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
  -
  -return strdup(path);
  +PCIDevice *d = container_of(dev, PCIDevice, qdev);
  +PCIDevice *t;
  +int slot_depth;
  +/* Path format: Domain:00:Slot:Slot:Slot.Function.
  + * 00 is added here to make this format compatible with
  + * domain:Bus:Slot.Func for systems without nested PCI bridges.
  + * Slot list specifies the slot numbers for all devices on the
  + * path from root to the specific device. */
  +int domain_len = strlen(":00");
  +int func_len = strlen(".F");
  +int slot_len = strlen(":SS");
  +int path_len;
  +char *path, *p;
  +
  +/* Calculate # of slots on path between device and root. */;
  +slot_depth = 0;
  +for (t = d; t; t = t->bus->parent_dev)
  +++slot_depth;
  +
  +path_len = domain_len + bus_len + slot_len * slot_depth + func_len;
  +
  +/* Allocate memory, fill in the terminating null byte. */
  +path = malloc(path_len + 1 /* For '\0' */);
  +path[path_len] = '\0';
  +
  +/* First field is the domain. */
  +snprintf(path, domain_len, "%04x", pci_find_domain(d->bus));
  +
  +/* Leave space for slot numbers and fill in function number. */
  +p = path + domain_len + slot_len * slot_depth;
  +snprintf(p, func_len, ".%02x", PCI_FUNC(d->devfn));
  +
  +/* Fill in slot numbers. We walk up from device to root, so need to print
  + * them in the reverse order, last to first. */
  +for (t = d; t; t = t->bus->parent_dev) {
  +p -= slot_len;
  +snprintf(p, slot_len, ":%x", PCI_SLOT(t->devfn));
  +}
  +
  +return path;
   }
   
  
 
 -- 
 yamahata



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-08 Thread Isaku Yamahata
On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin wrote:
 Replace bus number with slot numbers of parent bridges up to the root.
 This works for root bridge in a compatible way because bus number there
 is hard-coded to 0.
 IMO nested bridges are broken anyway, no way to be compatible there.
 
 
 Gleb, Markus, I think the following should be sufficient for PCI.  What
 do you think?  Also - do we need to update QMP/monitor to teach them to
 work with these paths?
 
 This is on top of Alex's patch, completely untested.
 
 
 pci: fix device path for devices behind nested bridges
 
 We were using bus number in the device path, which is clearly
 broken as this number is guest-assigned for all devices
 except the root.
 
 Fix by using hierarchical list of slots, walking the path
 from root down to device, instead. Add :00 as bus number
 so that if there are no nested bridges, this is compatible
 with what we have now.

This format, Domain:00:Slot:Slot:Slot.Function, doesn't work
because a pci-to-pci bridge is a pci function.
So the format should be
Domain:00:Slot.Function:Slot.Function:Slot.Function

thanks,

 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 
 diff --git a/hw/pci.c b/hw/pci.c
 index 7d12473..fa98d94 100644
 --- a/hw/pci.c
 +++ b/hw/pci.c
 @@ -1826,13 +1826,45 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
  
  static char *pcibus_get_dev_path(DeviceState *dev)
  {
 -PCIDevice *d = (PCIDevice *)dev;
 -char path[16];
 -
 -snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
 - pci_find_domain(d->bus), pci_bus_num(d->bus),
 - PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
 -
 -return strdup(path);
 +PCIDevice *d = container_of(dev, PCIDevice, qdev);
 +PCIDevice *t;
 +int slot_depth;
 +/* Path format: Domain:00:Slot:Slot:Slot.Function.
 + * 00 is added here to make this format compatible with
 + * domain:Bus:Slot.Func for systems without nested PCI bridges.
 + * Slot list specifies the slot numbers for all devices on the
 + * path from root to the specific device. */
 +int domain_len = strlen(":00");
 +int func_len = strlen(".F");
 +int slot_len = strlen(":SS");
 +int path_len;
 +char *path, *p;
 +
 +/* Calculate # of slots on path between device and root. */;
 +slot_depth = 0;
 +for (t = d; t; t = t->bus->parent_dev)
 +++slot_depth;
 +
 +path_len = domain_len + bus_len + slot_len * slot_depth + func_len;
 +
 +/* Allocate memory, fill in the terminating null byte. */
 +path = malloc(path_len + 1 /* For '\0' */);
 +path[path_len] = '\0';
 +
 +/* First field is the domain. */
 +snprintf(path, domain_len, "%04x", pci_find_domain(d->bus));
 +
 +/* Leave space for slot numbers and fill in function number. */
 +p = path + domain_len + slot_len * slot_depth;
 +snprintf(p, func_len, ".%02x", PCI_FUNC(d->devfn));
 +
 +/* Fill in slot numbers. We walk up from device to root, so need to print
 + * them in the reverse order, last to first. */
 +for (t = d; t; t = t->bus->parent_dev) {
 +p -= slot_len;
 +snprintf(p, slot_len, ":%x", PCI_SLOT(t->devfn));
 +}
 +
 +return path;
  }
  
 

-- 
yamahata