Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-26 Thread Sridhar Pitchai
  
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree.  Please read:
>  
> for how to do this properly.

My bad. I am sending the patch to the mainline kernel with a Cc: tag for stable
and hoping the patch will be automatically merged to the stable kernels.

Thanks,
Sridhar Pitchai

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-23 Thread Greg KH
On Fri, Mar 23, 2018 at 04:41:02PM +, Sridhar Pitchai wrote:
> Please read the link I sent you in relation to email formatting.
> 
> Then add your description above in a way that anyone not 100% familiar
> with hyperv can understand it - that's what the commit log is for.
> 
> You are sending this patch to stable kernels, patch above has been in
> the kernel from v4.14. The patch you are fixing since v4.11, you ought
> to be careful since you do not want to have broken kernel versions owing
> to stable patches mismatches, that's why I asked and I will ask again,
> are you sure you won't trigger a regression by sending this fix to
> stable ?
> 
> I assume the bond driver mechanism is now done and dusted.
> 
> That is correct. I have sent a v4 version of the patch. I am sending this
> patch for stable kernel. We have tested and I am sure this should not trigger
> regression by sending this fix to stable.




This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.


___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-23 Thread Sridhar Pitchai
Please read the link I sent you in relation to email formatting.

Then add your description above in a way that anyone not 100% familiar
with hyperv can understand it - that's what the commit log is for.

You are sending this patch to stable kernels, patch above has been in
the kernel from v4.14. The patch you are fixing since v4.11, you ought
to be careful since you do not want to have broken kernel versions owing
to stable patches mismatches, that's why I asked and I will ask again,
are you sure you won't trigger a regression by sending this fix to
stable ?

I assume the bond driver mechanism is now done and dusted.

That is correct. I have sent a v4 version of the patch. I am sending this
patch for stable kernel. We have tested and I am sure this should not trigger
regression by sending this fix to stable.

Thanks
Sridhar Pitchai 
 

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-23 Thread Sridhar Pitchai
> Previously, when using the bond driver, unique and persistent VF NIC name
> was required, so we used serial number as PCI domain which is included as
> part of the VF NIC name.  Transparent SRIOV mode puts VF NIC based on MAC
> match as a slave of synthetic NIC, so VF NIC’s name is no longer 
important.

Hi Sridhar,

A few hints about submitting patches more efficiently:

1) You never have to ask "Are we OK with the explanation?  If so, I'll
send a patch with updated changelog."  That forces an extra
round-trip.  Simply post a new version with your proposed update.  If
Lorenzo has more questions, he'll say so and you can do another
version.

2) When Lorenzo is asking for clarification, he's not really asking
for the clarification in an email response, because the email thread
will soon be forgotten and lost in the archives.  What we really want
is for the permanent git changelog to make sense to someone in the
future.  The easiest way is to post a new patch version with a
revised changelog that answers the questions.

3) Please capitalize and punctuate consistently with previous history,
e.g., "PCI: hv: Fix domain ID corruption" for your title, "SR-IOV"
(not "SRIOV") and "bus" (not "BUS") in changelog.  Both "para-virtual"
and "paravirtual" are used in the kernel, but "paravirtual" is much
more common.  Run "git log" and "git log --oneline" on your file and
follow the same style.

4) When you reference a previous commit, please use this style:
0c195567a8f6 ("netvsc: transparent VF management"), i.e., 12-char SHA1
followed by title.  You seem to have removed some spaces from the
commit you mention in the "Fixes" tag.

And a few content questions/observations of my own:

1) "Fix domain ID corruption" isn't a very good title because it
suggests you're fixing a memory corruption or similar defect.  But in
fact, I think you're removing something that used to be a feature
(added by 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI
domain")) but is now no longer needed and in fact now causes a
problem.

2) Your changelog does mention 4a9b0933bdfc, which is good, but
there must be some other  that makes it safe to remove
4a9b0933bdfc, i.e.,  removes the need for using the device
serial number as the PCI domain.   *must* be mentioned in
the changelog.  Otherwise, people may backport this patch to a kernel
that doesn't include , and things will break.

3) I don't understand what you mean by "transparent SR-IOV mode".  Is
that something different than regular SR-IOV?  If so, what exactly is
the difference?  I don't think the PCIe specs mention a "transparent
mode", so is it a Hyper-V thing?  It seems important, but I don't see
any pci-hyperv.c commits that mention it.

Here's a stab at the sort of changelog I would be looking for.
Obviously I don't understand much about Hyper-V and pci-hyperv.c, so
please correct the things I got wrong:

  When Linux runs as a guest in a Hyper-V VM, pci-hyperv.c
  paravirtualizes access to PCI devices assigned to the guest.  For
  each of those devices, hv_pci_probe() creates a virtual PCI bus in
  its own unique PCI domain.

  4a9b0933bdfc ("PCI: hv: Use device serial number as PCI domain")
  overrode that unique PCI domain to be the Hyper-V device serial
  number to make device names more convenient .

  One problem with 4a9b0933bdfc is that the Hyper-V device serial
  number is not necessarily unique, so we may end up with two buses
  with the same domain and bus number, and adding the second bus
  fails.

  We no longer need to override the PCI domain numbers because  removed the need for that.

  Revert 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI
  domain") so we can reliably support multiple devices being assigned
  to a guest.

  This revert should only be backported to kernels that contain
  .

Bjorn

Thanks for your comments Bjorn. I took some time to go over the all the
comments and sending a new version of the patch with all the comments 
incorporated.

Thanks
Sridhar

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-21 Thread Bjorn Helgaas
[I composed most of this before seeing Lorenzo's response, so sorry
about the duplication.  Maybe seeing things stated a different way
will help :)]

On Tue, Mar 20, 2018 at 11:00:36PM +, Sridhar Pitchai wrote:
> Hi Lorenzo,
>Transparent SRIOV is exposing the NIC directly to the kernel via
> para-virtual device, unlike creating a netdev and associating it with the bond
> driver. Further descriptions here,
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=0c195567a8f6e82ea5535cd9f1d54a1626dd233e
> 
> Previously, when using the bond driver, unique and persistent VF NIC name
> was required, so we used serial number as PCI domain which is included as
> part of the VF NIC name.  Transparent SRIOV mode puts VF NIC based on MAC
> match as a slave of synthetic NIC, so VF NIC’s name is no longer important.

Hi Sridhar,

A few hints about submitting patches more efficiently:

1) You never have to ask "Are we OK with the explanation?  If so, I'll
send a patch with updated changelog."  That forces an extra
round-trip.  Simply post a new version with your proposed update.  If
Lorenzo has more questions, he'll say so and you can do another
version.

2) When Lorenzo is asking for clarification, he's not really asking
for the clarification in an email response, because the email thread
will soon be forgotten and lost in the archives.  What we really want
is for the permanent git changelog to make sense to someone in the
future.  The easiest way is to post a new patch version with a
revised changelog that answers the questions.

3) Please capitalize and punctuate consistently with previous history,
e.g., "PCI: hv: Fix domain ID corruption" for your title, "SR-IOV"
(not "SRIOV") and "bus" (not "BUS") in changelog.  Both "para-virtual"
and "paravirtual" are used in the kernel, but "paravirtual" is much
more common.  Run "git log" and "git log --oneline" on your file and
follow the same style.

4) When you reference a previous commit, please use this style:
0c195567a8f6 ("netvsc: transparent VF management"), i.e., 12-char SHA1
followed by title.  You seem to have removed some spaces from the
commit you mention in the "Fixes" tag.

And a few content questions/observations of my own:

1) "Fix domain ID corruption" isn't a very good title because it
suggests you're fixing a memory corruption or similar defect.  But in
fact, I think you're removing something that used to be a feature
(added by 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI
domain")) but is now no longer needed and in fact now causes a
problem.

2) Your changelog does mention 4a9b0933bdfc, which is good, but
there must be some other  that makes it safe to remove
4a9b0933bdfc, i.e.,  removes the need for using the device
serial number as the PCI domain.   *must* be mentioned in
the changelog.  Otherwise, people may backport this patch to a kernel
that doesn't include , and things will break.

3) I don't understand what you mean by "transparent SR-IOV mode".  Is
that something different than regular SR-IOV?  If so, what exactly is
the difference?  I don't think the PCIe specs mention a "transparent
mode", so is it a Hyper-V thing?  It seems important, but I don't see
any pci-hyperv.c commits that mention it.

Here's a stab at the sort of changelog I would be looking for.
Obviously I don't understand much about Hyper-V and pci-hyperv.c, so
please correct the things I got wrong:

  When Linux runs as a guest in a Hyper-V VM, pci-hyperv.c
  paravirtualizes access to PCI devices assigned to the guest.  For
  each of those devices, hv_pci_probe() creates a virtual PCI bus in
  its own unique PCI domain.

  4a9b0933bdfc ("PCI: hv: Use device serial number as PCI domain")
  overrode that unique PCI domain to be the Hyper-V device serial
  number to make device names more convenient .

  One problem with 4a9b0933bdfc is that the Hyper-V device serial
  number is not necessarily unique, so we may end up with two buses
  with the same domain and bus number, and adding the second bus
  fails.

  We no longer need to override the PCI domain numbers because  removed the need for that.

  Revert 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI
  domain") so we can reliably support multiple devices being assigned
  to a guest.

  This revert should only be backported to kernels that contain
  .

Bjorn
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-21 Thread Lorenzo Pieralisi
On Tue, Mar 20, 2018 at 11:00:36PM +, Sridhar Pitchai wrote:
> Hi Lorenzo,
>Transparent SRIOV is exposing the NIC directly to the kernel via
>para-virtual device, unlike creating a netdev and associating it
>with the bond driver. Further descriptions here,
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=0c195567a8f6e82ea5535cd9f1d54a1626dd233e
> 
> Previously, when using the bond driver, unique and persistent VF NIC
> name was required, so we used serial number as PCI domain which is
> included as part of the VF NIC name.  Transparent SRIOV mode puts VF
> NIC based on MAC match as a slave of synthetic NIC, so VF NIC’s name
> is no longer important.

Please read the link I sent you in relation to email formatting.

Then add your description above in a way that anyone not 100% familiar
with hyperv can understand it - that's what the commit log is for.

You are sending this patch to stable kernels, patch above has been in
the kernel from v4.14. The patch you are fixing since v4.11, you ought
to be careful since you do not want to have broken kernel versions owing
to stable patches mismatches, that's why I asked and I will ask again,
are you sure you won't trigger a regression by sending this fix to
stable ?

I assume the bond driver mechanism is now done and dusted.

Thanks,
Lorenzo

> Thanks,
> Sridhar
> 
> On 3/20/18, 11:32 AM, "Lorenzo Pieralisi"  wrote:
> 
> On Tue, Mar 20, 2018 at 05:56:15PM +, Sridhar Pitchai wrote:
> > Hi Lorenzo,
> 
> > Are we good with the explanation? Can I send the patch with the
> > updated commit comments?
> 
> Almost.
> 
> [...]
> 
> > Since we have the transparent SRIOV mode now, the short VF device 
> name
> > is no longer needed.
> 
> Can you correlate transparent SRIOV mode to the point you are making
> below ? Please explain what transparent SRIOV mode allows you to remove
> and why. The rest of the explanation seems OK.
> 
> Please follow this email format:
> 
> 
> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fvger.kernel.org%2Flkml%2F%23s3-9=04%7C01%7CSridhar.Pitchai%40microsoft.com%7Cc5cdcb7951f64318e52708d58e90e6f2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636571675366181738%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwifQ%3D%3D%7C-1=yBdqc4NQZsO7O9vfgJsr5olU8GfLNjF5e9EAaCb7vq4%3D=0
> 
> Thanks,
> Lorenzo
> 
> > I still do not understand what this means and how it is related to 
> the
> > patch below, it may be clear to you, it is not to me, at all.
> > 
> > Sridhar >> the patch below, was introduced to make the device name 
> small, by taking only
> > 16bits of the serial number. Since we are not going to have the 
> serial number
> > updated to the BUS id, this has to be removed.
> > 
> > Fixes: 4a9b0933bdfc("PCI:hv:Use device serial number as PCI domain")
> > 
> > Fixes: 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI 
> domain")
> > Sridhr >> yes
> > 
> > I asked you an explicit question. Commit above was added for a 
> reason
> > I assume. This patch implies that kernel has been broken since v4.11
> > which is almost a year ago and nobody every noticed ? Or there are
> > systems where commit above is _necessary_ and this patch would break
> > them ?
> > 
> > I want a detailed explanation that highlights *why* it is safe to 
> apply
> > this patch and send it to stable kernels, commit log above won't do.
> > 
> > Sridhar>> HyperV provides a unique domain ID for PCI BUS. But it is 
> modified by the child
> > device when it is added. This cannot produce a unique domain ID all 
> the time.
> > Here in the bug, we see the collision between the serial number and 
> already
> > existing PCI bus. The cleaner way is never touch the domain ID 
> provided by
> > hyperV during the PCI bus creation. As long as hyperV make sure it 
> provides a
> > unique domain ID for the PCI for a VM it will not break, and HyperV 
> will
> > guarantees that the domain for the PCI bus for a given VM will be 
> always unique.
> > The original patch was also intending to have a unique domain ID 
> for the PCI
> > bus, by taking the serial number of the device, but it is not 
> sufficient, when
> > the device serial number is number which is the domain ID of the 
> existing PCI
> > bus.  With the current kernel we can repro this issue by adding a 
> device with a
> > serial number matching the existing PCI bus domain id. (in this 
> case that
> > happens to be zero).
> > 
> > 
> > Thanks,
> > Lorenzo
> > 
> > Cc: sta...@vger.kernel.org
> 

Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-20 Thread Sridhar Pitchai
Hi Lorenzo,
   Transparent SRIOV is exposing the NIC directly to the kernel via
para-virtual device, unlike creating a netdev and associating it with the bond
driver. Further descriptions here,

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=0c195567a8f6e82ea5535cd9f1d54a1626dd233e

Previously, when using the bond driver, unique and persistent VF NIC name
was required, so we used serial number as PCI domain which is included as
part of the VF NIC name.  Transparent SRIOV mode puts VF NIC based on MAC
match as a slave of synthetic NIC, so VF NIC’s name is no longer important.

Thanks,
Sridhar

On 3/20/18, 11:32 AM, "Lorenzo Pieralisi"  wrote:

On Tue, Mar 20, 2018 at 05:56:15PM +, Sridhar Pitchai wrote:
> Hi Lorenzo,

> Are we good with the explanation? Can I send the patch with the
> updated commit comments?

Almost.

[...]

> Since we have the transparent SRIOV mode now, the short VF device name
> is no longer needed.

Can you correlate transparent SRIOV mode to the point you are making
below ? Please explain what transparent SRIOV mode allows you to remove
and why. The rest of the explanation seems OK.

Please follow this email format:


https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fvger.kernel.org%2Flkml%2F%23s3-9=04%7C01%7CSridhar.Pitchai%40microsoft.com%7Cc5cdcb7951f64318e52708d58e90e6f2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636571675366181738%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwifQ%3D%3D%7C-1=yBdqc4NQZsO7O9vfgJsr5olU8GfLNjF5e9EAaCb7vq4%3D=0

Thanks,
Lorenzo

> I still do not understand what this means and how it is related to the
> patch below, it may be clear to you, it is not to me, at all.
> 
> Sridhar >> the patch below, was introduced to make the device name 
small, by taking only
> 16bits of the serial number. Since we are not going to have the 
serial number
> updated to the BUS id, this has to be removed.
> 
> Fixes: 4a9b0933bdfc("PCI:hv:Use device serial number as PCI domain")
> 
> Fixes: 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI 
domain")
> Sridhr >> yes
> 
> I asked you an explicit question. Commit above was added for a reason
> I assume. This patch implies that kernel has been broken since v4.11
> which is almost a year ago and nobody every noticed ? Or there are
> systems where commit above is _necessary_ and this patch would break
> them ?
> 
> I want a detailed explanation that highlights *why* it is safe to 
apply
> this patch and send it to stable kernels, commit log above won't do.
> 
> Sridhar>> HyperV provides a unique domain ID for PCI BUS. But it is 
modified by the child
> device when it is added. This cannot produce a unique domain ID all 
the time.
> Here in the bug, we see the collision between the serial number and 
already
> existing PCI bus. The cleaner way is never touch the domain ID 
provided by
> hyperV during the PCI bus creation. As long as hyperV make sure it 
provides a
> unique domain ID for the PCI for a VM it will not break, and HyperV 
will
> guarantees that the domain for the PCI bus for a given VM will be 
always unique.
> The original patch was also intending to have a unique domain ID for 
the PCI
> bus, by taking the serial number of the device, but it is not 
sufficient, when
> the device serial number is number which is the domain ID of the 
existing PCI
> bus.  With the current kernel we can repro this issue by adding a 
device with a
> serial number matching the existing PCI bus domain id. (in this case 
that
> happens to be zero).
> 
> 
> Thanks,
> Lorenzo
> 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Sridhar Pitchai 
> ---
> Changes in v3:
> * fix the commit comment. [KY Srinivasan, Michael Kelley]
> ---
>   drivers/pci/host/pci-hyperv.c | 11 ---
>   1 file changed, 11 deletions(-)
> diff --git a/drivers/pci/host/pci-hyperv.c 
b/drivers/pci/host/pci-hyperv.c
> index 2faf38e..ac67e56 100644
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -1518,17 +1518,6 @@ static struct hv_pci_dev 
*new_pcichild_device(struct hv_pcibus_device *hbus,
>   get_pcichild(hpdev, hv_pcidev_ref_childlist);
>   spin_lock_irqsave(>device_list_lock, flags);
>   
> - /*
> -  * When a device is being added to the bus, we set the PCI 
domain
> -  * number to be the device serial number, which 

Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-20 Thread Lorenzo Pieralisi
On Tue, Mar 20, 2018 at 05:56:15PM +, Sridhar Pitchai wrote:
> Hi Lorenzo,

> Are we good with the explanation? Can I send the patch with the
> updated commit comments?

Almost.

[...]

> Since we have the transparent SRIOV mode now, the short VF device name
> is no longer needed.

Can you correlate transparent SRIOV mode to the point you are making
below ? Please explain what transparent SRIOV mode allows you to remove
and why. The rest of the explanation seems OK.

Please follow this email format:

http://vger.kernel.org/lkml/#s3-9

Thanks,
Lorenzo

> I still do not understand what this means and how it is related to the
> patch below, it may be clear to you, it is not to me, at all.
> 
> Sridhar >> the patch below, was introduced to make the device name small, 
> by taking only
> 16bits of the serial number. Since we are not going to have the serial 
> number
> updated to the BUS id, this has to be removed.
> 
> Fixes: 4a9b0933bdfc("PCI:hv:Use device serial number as PCI domain")
> 
> Fixes: 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI domain")
> Sridhr >> yes
> 
> I asked you an explicit question. Commit above was added for a reason
> I assume. This patch implies that kernel has been broken since v4.11
> which is almost a year ago and nobody every noticed ? Or there are
> systems where commit above is _necessary_ and this patch would break
> them ?
> 
> I want a detailed explanation that highlights *why* it is safe to apply
> this patch and send it to stable kernels, commit log above won't do.
> 
> Sridhar>> HyperV provides a unique domain ID for PCI BUS. But it is 
> modified by the child
> device when it is added. This cannot produce a unique domain ID all the 
> time.
> Here in the bug, we see the collision between the serial number and 
> already
> existing PCI bus. The cleaner way is never touch the domain ID provided by
> hyperV during the PCI bus creation. As long as hyperV make sure it 
> provides a
> unique domain ID for the PCI for a VM it will not break, and HyperV will
> guarantees that the domain for the PCI bus for a given VM will be always 
> unique.
> The original patch was also intending to have a unique domain ID for the 
> PCI
> bus, by taking the serial number of the device, but it is not sufficient, 
> when
> the device serial number is number which is the domain ID of the existing 
> PCI
> bus.  With the current kernel we can repro this issue by adding a device 
> with a
> serial number matching the existing PCI bus domain id. (in this case that
> happens to be zero).
> 
> 
> Thanks,
> Lorenzo
> 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Sridhar Pitchai 
> ---
> Changes in v3:
> * fix the commit comment. [KY Srinivasan, Michael Kelley]
> ---
>   drivers/pci/host/pci-hyperv.c | 11 ---
>   1 file changed, 11 deletions(-)
> diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
> index 2faf38e..ac67e56 100644
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -1518,17 +1518,6 @@ static struct hv_pci_dev 
> *new_pcichild_device(struct hv_pcibus_device *hbus,
>   get_pcichild(hpdev, hv_pcidev_ref_childlist);
>   spin_lock_irqsave(>device_list_lock, flags);
>   
> - /*
> -  * When a device is being added to the bus, we set the PCI domain
> -  * number to be the device serial number, which is non-zero and
> -  * unique on the same VM.  The serial numbers start with 1, and
> -  * increase by 1 for each device.  So device names including this
> -  * can have shorter names than based on the bus instance UUID.
> -  * Only the first device serial number is used for domain, so the
> -  * domain number will not change after the first device is added.
> -  */
> - if (list_empty(>children))
> - hbus->sysdata.domain = desc->ser;
>   list_add_tail(>list_entry, >children);
>   spin_unlock_irqrestore(>device_list_lock, flags);
>   return hpdev;
> --
> 2.7.4 
> 
> 
> 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-20 Thread Sridhar Pitchai
Hi Lorenzo,
Are we good with the explanation? Can I send the patch with the updated commit 
comments?

Thanks
Sridhar

On 3/15/18, 11:24 AM, "Sridhar Pitchai" <sridhar.pitc...@microsoft.com> wrote:

Apologies for not aligning my reply, I just realized after looking into
the mail archive in LKML.  I have aligned the replay now.


On 3/15/18, 10:56 AM, "Sridhar Pitchai" <sridhar.pitc...@microsoft.com> 
wrote:

Hi Lorenzo,
Answering the question inline.
Kindly let me know if it clarifies. I will send out another patch after we 
agree on the clarification.

Thanks
Sridhar Pitchai

-Original Message-
From: Lorenzo Pieralisi <lorenzo.pieral...@arm.com>
Sent: Thursday, March 15, 2018 5:05 AM
To: Sridhar Pitchai <sridhar.pitc...@microsoft.com>
Cc: Bjorn Helgaas <bhelg...@google.com>; Jake Oshins <ja...@microsoft.com>; 
Haiyang Zhang <haiya...@microsoft.com>; Stephen Hemminger 
<sthem...@microsoft.com>; Dexuan Cui <de...@microsoft.com>; KY Srinivasan 
<k...@microsoft.com>; Michael Kelley (EOSG) <michael.h.kel...@microsoft.com>; 
de...@linuxdriverproject.org; linux-...@vger.kernel.org; 
linux-ker...@vger.kernel.org
Subject: Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

On Thu, Mar 15, 2018 at 12:03:07AM +, Sridhar Pitchai wrote:
Whenever PCI bus is added, HyperV guarantees the BUS id is unique. Even

"Whenever a PCI bus is added"
Sridhar>> yes

with that when a first device is added to the bus, it overrides bus domain
ID with the device serial number. Sometime this can result in BUS ID not

Define "Sometime".

Sridhar>> HyperV when it creates a PCI bus it guarantees it provide a 
unique ID for it.
But, that unique BUS ID is replaced with device serial number. 0 is a valid
device serial number, and if there exists a PCI bus with domain ID 0 (Gen 1
version of hyperV VM have this for para virtual devices), this will result 
in
PCI bus id not being unique.


being unique. In this case, when PCI_BUS and a device to bus is added, the
first device overwrites the bus domain ID to the device serial number,
which is 0. Since there exsist a PCI bus with domain ID 0 already the PCI

s/exsist/exist

Sridhar>> yes

bus addition fails. This patch make sure when a device is added to a bus,
it never updated the bus domain ID.

s/updated/updates
Sridhar >> yes

Since we have the transparent SRIOV mode now, the short VF device name
is no longer needed.

I still do not understand what this means and how it is related to the
patch below, it may be clear to you, it is not to me, at all.

Sridhar >> the patch below, was introduced to make the device name small, 
by taking only
16bits of the serial number. Since we are not going to have the serial 
number
updated to the BUS id, this has to be removed.

Fixes: 4a9b0933bdfc("PCI:hv:Use device serial number as PCI domain")

Fixes: 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI domain")
Sridhr >> yes

I asked you an explicit question. Commit above was added for a reason
I assume. This patch implies that kernel has been broken since v4.11
which is almost a year ago and nobody every noticed ? Or there are
systems where commit above is _necessary_ and this patch would break
them ?

I want a detailed explanation that highlights *why* it is safe to apply
this patch and send it to stable kernels, commit log above won't do.

Sridhar>> HyperV provides a unique domain ID for PCI BUS. But it is 
modified by the child
device when it is added. This cannot produce a unique domain ID all the 
time.
Here in the bug, we see the collision between the serial number and already
existing PCI bus. The cleaner way is never touch the domain ID provided by
hyperV during the PCI bus creation. As long as hyperV make sure it provides 
a
unique domain ID for the PCI for a VM it will not break, and HyperV will
guarantees that the domain for the PCI bus for a given VM will be always 
unique.
The original patch was also intending to have a unique domain ID for the PCI
bus, by taking the serial number of the device, but it is not sufficient, 
when
the device serial number is number which is the domain ID of the existing 
PCI
bus.  With the current kernel we can repro this issue by adding a device 
with a
serial number matching the existing PCI bus domain id. (in this case that
happens to be zero).


Thanks,
Lorenzo

Cc: sta...@vger.kernel.org
Signed-off-by: Sridhar Pitchai <srpit...@microsoft.com>
---
Changes in v3:
* fix the commit

Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-15 Thread Sridhar Pitchai
Apologies for not aligning my reply, I just realized after looking into
the mail archive in LKML.  I have aligned the replay now.


On 3/15/18, 10:56 AM, "Sridhar Pitchai" <sridhar.pitc...@microsoft.com> wrote:

Hi Lorenzo,
Answering the question inline.
Kindly let me know if it clarifies. I will send out another patch after we 
agree on the clarification.

Thanks
Sridhar Pitchai

-Original Message-
From: Lorenzo Pieralisi <lorenzo.pieral...@arm.com>
Sent: Thursday, March 15, 2018 5:05 AM
To: Sridhar Pitchai <sridhar.pitc...@microsoft.com>
Cc: Bjorn Helgaas <bhelg...@google.com>; Jake Oshins <ja...@microsoft.com>; 
Haiyang Zhang <haiya...@microsoft.com>; Stephen Hemminger 
<sthem...@microsoft.com>; Dexuan Cui <de...@microsoft.com>; KY Srinivasan 
<k...@microsoft.com>; Michael Kelley (EOSG) <michael.h.kel...@microsoft.com>; 
de...@linuxdriverproject.org; linux-...@vger.kernel.org; 
linux-ker...@vger.kernel.org
Subject: Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

On Thu, Mar 15, 2018 at 12:03:07AM +, Sridhar Pitchai wrote:
Whenever PCI bus is added, HyperV guarantees the BUS id is unique. Even

"Whenever a PCI bus is added"
Sridhar>> yes

with that when a first device is added to the bus, it overrides bus domain
ID with the device serial number. Sometime this can result in BUS ID not

Define "Sometime".

Sridhar>> HyperV when it creates a PCI bus it guarantees it provide a unique ID 
for it.
But, that unique BUS ID is replaced with device serial number. 0 is a valid
device serial number, and if there exists a PCI bus with domain ID 0 (Gen 1
version of hyperV VM have this for para virtual devices), this will result in
PCI bus id not being unique.


being unique. In this case, when PCI_BUS and a device to bus is added, the
first device overwrites the bus domain ID to the device serial number,
which is 0. Since there exsist a PCI bus with domain ID 0 already the PCI

s/exsist/exist

Sridhar>> yes

bus addition fails. This patch make sure when a device is added to a bus,
it never updated the bus domain ID.

s/updated/updates
Sridhar >> yes

Since we have the transparent SRIOV mode now, the short VF device name
is no longer needed.

I still do not understand what this means and how it is related to the
patch below, it may be clear to you, it is not to me, at all.

Sridhar >> the patch below, was introduced to make the device name small, by 
taking only
16bits of the serial number. Since we are not going to have the serial number
updated to the BUS id, this has to be removed.

Fixes: 4a9b0933bdfc("PCI:hv:Use device serial number as PCI domain")

Fixes: 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI domain")
Sridhr >> yes

I asked you an explicit question. Commit above was added for a reason
I assume. This patch implies that kernel has been broken since v4.11
which is almost a year ago and nobody every noticed ? Or there are
systems where commit above is _necessary_ and this patch would break
them ?

I want a detailed explanation that highlights *why* it is safe to apply
this patch and send it to stable kernels, commit log above won't do.

Sridhar>> HyperV provides a unique domain ID for PCI BUS. But it is modified by 
the child
device when it is added. This cannot produce a unique domain ID all the time.
Here in the bug, we see the collision between the serial number and already
existing PCI bus. The cleaner way is never touch the domain ID provided by
hyperV during the PCI bus creation. As long as hyperV make sure it provides a
unique domain ID for the PCI for a VM it will not break, and HyperV will
guarantees that the domain for the PCI bus for a given VM will be always unique.
The original patch was also intending to have a unique domain ID for the PCI
bus, by taking the serial number of the device, but it is not sufficient, when
the device serial number is number which is the domain ID of the existing PCI
bus.  With the current kernel we can repro this issue by adding a device with a
serial number matching the existing PCI bus domain id. (in this case that
happens to be zero).


Thanks,
Lorenzo

Cc: sta...@vger.kernel.org
Signed-off-by: Sridhar Pitchai <srpit...@microsoft.com>
---
Changes in v3:
* fix the commit comment. [KY Srinivasan, Michael Kelley]
---
  drivers/pci/host/pci-hyperv.c | 11 ---
  1 file changed, 11 deletions(-)
diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 2faf38e..ac67e56 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -1518,17 +1518,6 @@ static struct hv_pci_dev *new_pcichild_device(struct 
hv_pcibus_device *hbus,
get_pcichild(hpdev, hv_pcidev_ref_childlist);
spin_lock_irqsave(>device_list_lock, flags);
  
-   /*
-* When a device is being added to the bus, we set the PCI domain
-* numbe

RE: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-15 Thread Sridhar Pitchai
Hi Lorenzo,
Answering the question inline.
Kindly let me know if it clarifies. I will send out another patch after we 
agree on the clarification.

Thanks
Sridhar Pitchai

-Original Message-
From: Lorenzo Pieralisi <lorenzo.pieral...@arm.com> 
Sent: Thursday, March 15, 2018 5:05 AM
To: Sridhar Pitchai <sridhar.pitc...@microsoft.com>
Cc: Bjorn Helgaas <bhelg...@google.com>; Jake Oshins <ja...@microsoft.com>; 
Haiyang Zhang <haiya...@microsoft.com>; Stephen Hemminger 
<sthem...@microsoft.com>; Dexuan Cui <de...@microsoft.com>; KY Srinivasan 
<k...@microsoft.com>; Michael Kelley (EOSG) <michael.h.kel...@microsoft.com>; 
de...@linuxdriverproject.org; linux-...@vger.kernel.org; 
linux-ker...@vger.kernel.org
Subject: Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

On Thu, Mar 15, 2018 at 12:03:07AM +, Sridhar Pitchai wrote:
> Whenever PCI bus is added, HyperV guarantees the BUS id is unique. Even

"Whenever a PCI bus is added"
Sridhar>> yes

> with that when a first device is added to the bus, it overrides bus domain
> ID with the device serial number. Sometime this can result in BUS ID not

Define "Sometime".

Sridhar>> HyperV when it creates a PCI bus it guarantees it provide a unique ID 
for it. But, that unique BUS ID is replaced with device serial number. 0 is a 
valid device serial number, and if there exists a PCI bus with domain ID 0 (Gen 
1 version of hyperV VM have this for para virtual devices), this will result in 
PCI bus id not being unique. 

> being unique. In this case, when PCI_BUS and a device to bus is added, the
> first device overwrites the bus domain ID to the device serial number,
> which is 0. Since there exsist a PCI bus with domain ID 0 already the PCI

s/exsist/exist

Sridhar>> yes

> bus addition fails. This patch make sure when a device is added to a bus,
> it never updated the bus domain ID. 

s/updated/updates
Sridhar >> yes

> Since we have the transparent SRIOV mode now, the short VF device name
> is no longer needed.

I still do not understand what this means and how it is related to the
patch below, it may be clear to you, it is not to me, at all.

Sridhar >> the patch below, was introduced to make the device name small, by 
taking only 16bits of the serial number. Since we are not going to have the 
serial number updated to the BUS id, this has to be removed.

> Fixes: 4a9b0933bdfc("PCI:hv:Use device serial number as PCI domain")

Fixes: 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI domain")
Sridhr >> yes

I asked you an explicit question. Commit above was added for a reason
I assume. This patch implies that kernel has been broken since v4.11
which is almost a year ago and nobody every noticed ? Or there are
systems where commit above is _necessary_ and this patch would break
them ?

I want a detailed explanation that highlights *why* it is safe to apply
this patch and send it to stable kernels, commit log above won't do.

Sridhar>> HyperV provides a unique domain ID for PCI BUS. But it is modified by 
the child device when it is added. This cannot produce a unique domain ID all 
the time. Here in the bug, we see the collision between the serial number and 
already existing PCI bus. The cleaner way is never touch the domain ID provided 
by hyperV during the PCI bus creation. As long as hyperV make sure it provides 
a unique domain ID for the PCI for a VM it will not break, and HyperV will 
guarantees that the domain for the PCI bus for a given VM will be always unique.
The original patch was also intending to have a unique domain ID for the PCI 
bus, by taking the serial number of the device, but it is not sufficient, when 
the device serial number is number which is the domain ID of the existing PCI 
bus.
With the current kernel we can repro this issue by adding a device with a 
serial number matching the existing PCI bus domain id. (in this case that 
happens to be zero).

Thanks,
Lorenzo

> Cc: sta...@vger.kernel.org
> Signed-off-by: Sridhar Pitchai <srpit...@microsoft.com>
> ---
> 
> Changes in v3:
> * fix the commit comment. [KY Srinivasan, Michael Kelley]
> ---
>  drivers/pci/host/pci-hyperv.c | 11 ---
>  1 file changed, 11 deletions(-)
> 
> diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
> index 2faf38e..ac67e56 100644
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -1518,17 +1518,6 @@ static struct hv_pci_dev *new_pcichild_device(struct 
> hv_pcibus_device *hbus,
>   get_pcichild(hpdev, hv_pcidev_ref_childlist);
>   spin_lock_irqsave(>device_list_lock, flags);
>  
> - /*
> -  * When a device is being added to the bus, we set the PCI domain
> -  * number to be the device serial number, which is non-zero and
> 

Re: [PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-15 Thread Lorenzo Pieralisi
On Thu, Mar 15, 2018 at 12:03:07AM +, Sridhar Pitchai wrote:
> Whenever PCI bus is added, HyperV guarantees the BUS id is unique. Even

"Whenever a PCI bus is added"

> with that when a first device is added to the bus, it overrides bus domain
> ID with the device serial number. Sometime this can result in BUS ID not

Define "Sometime".

> being unique. In this case, when PCI_BUS and a device to bus is added, the
> first device overwrites the bus domain ID to the device serial number,
> which is 0. Since there exsist a PCI bus with domain ID 0 already the PCI

s/exsist/exist

> bus addition fails. This patch make sure when a device is added to a bus,
> it never updated the bus domain ID. 

s/updated/updates

> Since we have the transparent SRIOV mode now, the short VF device name
> is no longer needed.

I still do not understand what this means and how it is related to the
patch below, it may be clear to you, it is not to me, at all.

> Fixes: 4a9b0933bdfc("PCI:hv:Use device serial number as PCI domain")

Fixes: 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI domain")

I asked you an explicit question. Commit above was added for a reason
I assume. This patch implies that kernel has been broken since v4.11
which is almost a year ago and nobody every noticed ? Or there are
systems where commit above is _necessary_ and this patch would break
them ?

I want a detailed explanation that highlights *why* it is safe to apply
this patch and send it to stable kernels, commit log above won't do.

Thanks,
Lorenzo

> Cc: sta...@vger.kernel.org
> Signed-off-by: Sridhar Pitchai 
> ---
> 
> Changes in v3:
> * fix the commit comment. [KY Srinivasan, Michael Kelley]
> ---
>  drivers/pci/host/pci-hyperv.c | 11 ---
>  1 file changed, 11 deletions(-)
> 
> diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
> index 2faf38e..ac67e56 100644
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -1518,17 +1518,6 @@ static struct hv_pci_dev *new_pcichild_device(struct 
> hv_pcibus_device *hbus,
>   get_pcichild(hpdev, hv_pcidev_ref_childlist);
>   spin_lock_irqsave(>device_list_lock, flags);
>  
> - /*
> -  * When a device is being added to the bus, we set the PCI domain
> -  * number to be the device serial number, which is non-zero and
> -  * unique on the same VM.  The serial numbers start with 1, and
> -  * increase by 1 for each device.  So device names including this
> -  * can have shorter names than based on the bus instance UUID.
> -  * Only the first device serial number is used for domain, so the
> -  * domain number will not change after the first device is added.
> -  */
> - if (list_empty(>children))
> - hbus->sysdata.domain = desc->ser;
>   list_add_tail(>list_entry, >children);
>   spin_unlock_irqrestore(>device_list_lock, flags);
>   return hpdev;
> -- 
> 2.7.4 
> 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v3]PCI: hv: fix PCI-BUS domainID corruption

2018-03-14 Thread Sridhar Pitchai
Whenever PCI bus is added, HyperV guarantees the BUS id is unique. Even
with that when a first device is added to the bus, it overrides bus domain
ID with the device serial number. Sometime this can result in BUS ID not
being unique. In this case, when PCI_BUS and a device to bus is added, the
first device overwrites the bus domain ID to the device serial number,
which is 0. Since there exsist a PCI bus with domain ID 0 already the PCI
bus addition fails. This patch make sure when a device is added to a bus,
it never updated the bus domain ID. Since we have the transparent SRIOV
mode now, the short VF device name is no longer needed.

Fixes: 4a9b0933bdfc("PCI:hv:Use device serial number as PCI domain")
Cc: sta...@vger.kernel.org
Signed-off-by: Sridhar Pitchai 
---

Changes in v3:
* fix the commit comment. [KY Srinivasan, Michael Kelley]
---
 drivers/pci/host/pci-hyperv.c | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 2faf38e..ac67e56 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -1518,17 +1518,6 @@ static struct hv_pci_dev *new_pcichild_device(struct 
hv_pcibus_device *hbus,
get_pcichild(hpdev, hv_pcidev_ref_childlist);
spin_lock_irqsave(>device_list_lock, flags);
 
-   /*
-* When a device is being added to the bus, we set the PCI domain
-* number to be the device serial number, which is non-zero and
-* unique on the same VM.  The serial numbers start with 1, and
-* increase by 1 for each device.  So device names including this
-* can have shorter names than based on the bus instance UUID.
-* Only the first device serial number is used for domain, so the
-* domain number will not change after the first device is added.
-*/
-   if (list_empty(>children))
-   hbus->sysdata.domain = desc->ser;
list_add_tail(>list_entry, >children);
spin_unlock_irqrestore(>device_list_lock, flags);
return hpdev;
-- 
2.7.4 

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel