Re: [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Fri, Aug 16, 2019 at 12:43:07AM +, Jason Gunthorpe wrote: > On Thu, Aug 15, 2019 at 04:51:33PM -0400, Jerome Glisse wrote: > > > struct page. In this case any way we can update the > > nouveau_dmem_page() to check that page page->pgmap == the > > expected pgmap. > > I was also wondering if that is a problem.. just blindly doing a > container_of on the page->pgmap does seem like it assumes that only > this driver is using DEVICE_PRIVATE. > > It seems like something missing in hmm_range_fault, it should be told > what DEVICE_PRIVATE is acceptable to trigger HMM_PFN_DEVICE_PRIVATE > and fault all others? The whole device private handling in hmm and migrate_vma seems pretty broken as far as I can tell, and I have some WIP patches. Basically we should not touch (or possibly eventually call migrate to ram eventually in the future) device private pages not owned by the caller, where I try to defined the caller by the dev_pagemap_ops instance.
Re: [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 04:33:06PM -0400, Jerome Glisse wrote: > So nor HMM nor driver should dereference the struct page (i do not > think any iommu driver would either), Both current hmm drivers convert the hmm pfn back to a page and eventually call dma_map_page on it. As do the ODP patches from you.
Re: [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 5:41 PM Jason Gunthorpe wrote: > > On Thu, Aug 15, 2019 at 01:47:12PM -0700, Dan Williams wrote: > > On Thu, Aug 15, 2019 at 1:41 PM Jason Gunthorpe wrote: > > > > > > On Thu, Aug 15, 2019 at 04:33:06PM -0400, Jerome Glisse wrote: > > > > > > > So nor HMM nor driver should dereference the struct page (i do not > > > > think any iommu driver would either), > > > > > > Er, they do technically deref the struct page: > > > > > > nouveau_dmem_convert_pfn(struct nouveau_drm *drm, > > > struct hmm_range *range) > > > struct page *page; > > > page = hmm_pfn_to_page(range, range->pfns[i]); > > > if (!nouveau_dmem_page(drm, page)) { > > > > > > > > > nouveau_dmem_page(struct nouveau_drm *drm, struct page *page) > > > { > > > return is_device_private_page(page) && drm->dmem == > > > page_to_dmem(page) > > > > > > > > > Which does touch 'page->pgmap' > > > > > > Is this OK without having a get_dev_pagemap() ? > > > > > > Noting that the collision-retry scheme doesn't protect anything here > > > as we can have a concurrent invalidation while doing the above deref. > > > > As long take_driver_page_table_lock() in Jerome's flow can replace > > percpu_ref_tryget_live() on the pagemap reference. It seems > > nouveau_dmem_convert_pfn() happens after: > > > > mutex_lock(>mutex); > > if (!nouveau_range_done()) { > > > > ...so I would expect that to be functionally equivalent to validating > > the reference count. > > Yes, OK, that makes sense, I was mostly surprised by the statement the > driver doesn't touch the struct page.. > > I suppose "doesn't touch the struct page out of the driver lock" is > the case. > > However, this means we cannot do any processing of ZONE_DEVICE pages > outside the driver lock, so eg, doing any DMA map that might rely on > MEMORY_DEVICE_PCI_P2PDMA has to be done in the driver lock, which is > a bit unfortunate. Wouldn't P2PDMA use page pins? Not needing to hold a lock over ZONE_DEVICE page operations was one of the motivations for plumbing get_dev_pagemap() with a percpu-ref.
Re: [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 04:51:33PM -0400, Jerome Glisse wrote: > struct page. In this case any way we can update the > nouveau_dmem_page() to check that page page->pgmap == the > expected pgmap. I was also wondering if that is a problem.. just blindly doing a container_of on the page->pgmap does seem like it assumes that only this driver is using DEVICE_PRIVATE. It seems like something missing in hmm_range_fault, it should be told what DEVICE_PRIVATE is acceptable to trigger HMM_PFN_DEVICE_PRIVATE and fault all others? Jason
Re: [Nouveau] [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 01:47:12PM -0700, Dan Williams wrote: > On Thu, Aug 15, 2019 at 1:41 PM Jason Gunthorpe wrote: > > > > On Thu, Aug 15, 2019 at 04:33:06PM -0400, Jerome Glisse wrote: > > > > > So nor HMM nor driver should dereference the struct page (i do not > > > think any iommu driver would either), > > > > Er, they do technically deref the struct page: > > > > nouveau_dmem_convert_pfn(struct nouveau_drm *drm, > > struct hmm_range *range) > > struct page *page; > > page = hmm_pfn_to_page(range, range->pfns[i]); > > if (!nouveau_dmem_page(drm, page)) { > > > > > > nouveau_dmem_page(struct nouveau_drm *drm, struct page *page) > > { > > return is_device_private_page(page) && drm->dmem == > > page_to_dmem(page) > > > > > > Which does touch 'page->pgmap' > > > > Is this OK without having a get_dev_pagemap() ? > > > > Noting that the collision-retry scheme doesn't protect anything here > > as we can have a concurrent invalidation while doing the above deref. > > As long take_driver_page_table_lock() in Jerome's flow can replace > percpu_ref_tryget_live() on the pagemap reference. It seems > nouveau_dmem_convert_pfn() happens after: > > mutex_lock(>mutex); > if (!nouveau_range_done()) { > > ...so I would expect that to be functionally equivalent to validating > the reference count. Yes, OK, that makes sense, I was mostly surprised by the statement the driver doesn't touch the struct page.. I suppose "doesn't touch the struct page out of the driver lock" is the case. However, this means we cannot do any processing of ZONE_DEVICE pages outside the driver lock, so eg, doing any DMA map that might rely on MEMORY_DEVICE_PCI_P2PDMA has to be done in the driver lock, which is a bit unfortunate. Jason ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [PATCH] PCI: Use pci_reset_bus() in quirk_reset_lenovo_thinkpad_50_nvgpu()
On Thu, Aug 01, 2019 at 06:01:17PM -0400, Lyude Paul wrote: > Since quirk_nvidia_hda() was added there's now two nvidia device > functions on any laptops with nvidia GPUs: the HDA controller, and the > GPU itself. Unfortunately this has the sideaffect of breaking > quirk_reset_lenovo_thinkpad_50_nvgpu() since pci_reset_function() was > using pci_parent_bus_reset() to reset the GPU's respective PCI bus, and > pci_parent_bus_reset() does not work on busses which have more then a > single device function present. > > So, fix this by simply calling pci_reset_bus() instead which properly > resets the GPU bus and all device functions under it, including both the > GPU and the HDA controller. > > Fixes: b516ea586d71 ("PCI: Enable NVIDIA HDA controllers") > Cc: Lukas Wunner > Cc: Daniel Drake > Cc: Bjorn Helgaas > Cc: Aaron Plattner > Cc: Peter Wu > Cc: Ilia Mirkin > Cc: Karol Herbst > Cc: Maik Freudenberg > Cc: linux-...@vger.kernel.org > Signed-off-by: Lyude Paul We merged b516ea586d71 for v5.3, so I applied this with Ben's ack to for-linus for v5.3, thanks! > --- > drivers/pci/quirks.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index 208aacf39329..44c4ae1abd00 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -5256,7 +5256,7 @@ static void > quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev) >*/ > if (ioread32(map + 0x2240c) & 0x2) { > pci_info(pdev, FW_BUG "GPU left initialized by EFI, > resetting\n"); > - ret = pci_reset_function(pdev); > + ret = pci_reset_bus(pdev); > if (ret < 0) > pci_err(pdev, "Failed to reset GPU: %d\n", ret); > } > -- > 2.21.0 >
Re: [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 08:41:33PM +, Jason Gunthorpe wrote: > On Thu, Aug 15, 2019 at 04:33:06PM -0400, Jerome Glisse wrote: > > > So nor HMM nor driver should dereference the struct page (i do not > > think any iommu driver would either), > > Er, they do technically deref the struct page: > > nouveau_dmem_convert_pfn(struct nouveau_drm *drm, >struct hmm_range *range) > struct page *page; > page = hmm_pfn_to_page(range, range->pfns[i]); > if (!nouveau_dmem_page(drm, page)) { > > > nouveau_dmem_page(struct nouveau_drm *drm, struct page *page) > { > return is_device_private_page(page) && drm->dmem == page_to_dmem(page) > > > Which does touch 'page->pgmap' > > Is this OK without having a get_dev_pagemap() ? > > Noting that the collision-retry scheme doesn't protect anything here > as we can have a concurrent invalidation while doing the above deref. Uh ? How so ? We are not reading the same code i think. My read is that function is call when holding the device lock which exclude any racing mmu notifier from making forward progress and it is also protected by the range so at the time this happens it is safe to dereference the struct page. In this case any way we can update the nouveau_dmem_page() to check that page page->pgmap == the expected pgmap. Cheers, Jérôme
Re: [Nouveau] [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 1:41 PM Jason Gunthorpe wrote: > > On Thu, Aug 15, 2019 at 04:33:06PM -0400, Jerome Glisse wrote: > > > So nor HMM nor driver should dereference the struct page (i do not > > think any iommu driver would either), > > Er, they do technically deref the struct page: > > nouveau_dmem_convert_pfn(struct nouveau_drm *drm, > struct hmm_range *range) > struct page *page; > page = hmm_pfn_to_page(range, range->pfns[i]); > if (!nouveau_dmem_page(drm, page)) { > > > nouveau_dmem_page(struct nouveau_drm *drm, struct page *page) > { > return is_device_private_page(page) && drm->dmem == page_to_dmem(page) > > > Which does touch 'page->pgmap' > > Is this OK without having a get_dev_pagemap() ? > > Noting that the collision-retry scheme doesn't protect anything here > as we can have a concurrent invalidation while doing the above deref. As long take_driver_page_table_lock() in Jerome's flow can replace percpu_ref_tryget_live() on the pagemap reference. It seems nouveau_dmem_convert_pfn() happens after: mutex_lock(>mutex); if (!nouveau_range_done()) { ...so I would expect that to be functionally equivalent to validating the reference count. ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 04:33:06PM -0400, Jerome Glisse wrote: > So nor HMM nor driver should dereference the struct page (i do not > think any iommu driver would either), Er, they do technically deref the struct page: nouveau_dmem_convert_pfn(struct nouveau_drm *drm, struct hmm_range *range) struct page *page; page = hmm_pfn_to_page(range, range->pfns[i]); if (!nouveau_dmem_page(drm, page)) { nouveau_dmem_page(struct nouveau_drm *drm, struct page *page) { return is_device_private_page(page) && drm->dmem == page_to_dmem(page) Which does touch 'page->pgmap' Is this OK without having a get_dev_pagemap() ? Noting that the collision-retry scheme doesn't protect anything here as we can have a concurrent invalidation while doing the above deref. Jason
Re: [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 01:12:22PM -0700, Dan Williams wrote: > On Thu, Aug 15, 2019 at 12:44 PM Jerome Glisse wrote: > > > > On Thu, Aug 15, 2019 at 12:36:58PM -0700, Dan Williams wrote: > > > On Thu, Aug 15, 2019 at 11:07 AM Jerome Glisse wrote: > > > > > > > > On Wed, Aug 14, 2019 at 07:48:28AM -0700, Dan Williams wrote: > > > > > On Wed, Aug 14, 2019 at 6:28 AM Jason Gunthorpe > > > > > wrote: > > > > > > > > > > > > On Wed, Aug 14, 2019 at 09:38:54AM +0200, Christoph Hellwig wrote: > > > > > > > On Tue, Aug 13, 2019 at 06:36:33PM -0700, Dan Williams wrote: > > > > > > > > Section alignment constraints somewhat save us here. The only > > > > > > > > example > > > > > > > > I can think of a PMD not containing a uniform pgmap association > > > > > > > > for > > > > > > > > each pte is the case when the pgmap overlaps normal dram, i.e. > > > > > > > > shares > > > > > > > > the same 'struct memory_section' for a given span. Otherwise, > > > > > > > > distinct > > > > > > > > pgmaps arrange to manage their own exclusive sections (and now > > > > > > > > subsections as of v5.3). Otherwise the implementation could not > > > > > > > > guarantee different mapping lifetimes. > > > > > > > > > > > > > > > > That said, this seems to want a better mechanism to determine > > > > > > > > "pfn is > > > > > > > > ZONE_DEVICE". > > > > > > > > > > > > > > So I guess this patch is fine for now, and once you provide a > > > > > > > better > > > > > > > mechanism we can switch over to it? > > > > > > > > > > > > What about the version I sent to just get rid of all the strange > > > > > > put_dev_pagemaps while scanning? Odds are good we will work with > > > > > > only > > > > > > a single pagemap, so it makes some sense to cache it once we find > > > > > > it? > > > > > > > > > > Yes, if the scan is over a single pmd then caching it makes sense. > > > > > > > > Quite frankly an easier an better solution is to remove the pagemap > > > > lookup as HMM user abide by mmu notifier it means we will not make > > > > use or dereference the struct page so that we are safe from any > > > > racing hotunplug of dax memory (as long as device driver using hmm > > > > do not have a bug). > > > > > > Yes, as long as the driver remove is synchronized against HMM > > > operations via another mechanism then there is no need to take pagemap > > > references. Can you briefly describe what that other mechanism is? > > > > So if you hotunplug some dax memory i assume that this can only > > happens once all the pages are unmapped (as it must have the > > zero refcount, well 1 because of the bias) and any unmap will > > trigger a mmu notifier callback. User of hmm mirror abiding by > > the API will never make use of information they get through the > > fault or snapshot function until checking for racing notifier > > under lock. > > Hmm that first assumption is not guaranteed by the dev_pagemap core. > The dev_pagemap end of life model is "disable, invalidate, drain" so > it's possible to call devm_munmap_pages() while pages are still mapped > it just won't complete the teardown of the pagemap until the last > reference is dropped. New references are blocked during this teardown. > > However, if the driver is validating the liveness of the mapping in > the mmu-notifier path and blocking new references it sounds like it > should be ok. Might there be GPU driver unit tests that cover this > racing teardown case? So nor HMM nor driver should dereference the struct page (i do not think any iommu driver would either), they only care about the pfn. So even if we race with a teardown as soon as we get the mmu notifier callback to invalidate the mmapping we will do so. The pattern is: mydriver_populate_vaddr_range(start, end) { hmm_range_register(range, start, end) again: ret = hmm_range_fault(start, end) if (ret < 0) return ret; take_driver_page_table_lock(); if (range.valid) { populate_device_page_table(); release_driver_page_table_lock(); } else { release_driver_page_table_lock(); goto again; } } The mmu notifier callback do use the same page table lock and we also have the range tracking going on. So either we populate device page table before racing with teardown in which case the device page table entry are clear through the mmu notifier call back. Or if we race, but then we can see the racing mmu notifier calls and retry again which will trigger a regular page fault which will return an error i assume. So in the end we have the exact same behavior as if a CPU was trying to access that virtual address. This is the whole point of HMM, to behave exactly as if it was a CPU access. Fails in the same way, race in the same way. So if DAX teardown are safe versus racing CPU access to some vma that have that memory map, it will be the same for HMM users. GPU driver test suite are not good at testing this.
Re: [Nouveau] [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 12:44 PM Jerome Glisse wrote: > > On Thu, Aug 15, 2019 at 12:36:58PM -0700, Dan Williams wrote: > > On Thu, Aug 15, 2019 at 11:07 AM Jerome Glisse wrote: > > > > > > On Wed, Aug 14, 2019 at 07:48:28AM -0700, Dan Williams wrote: > > > > On Wed, Aug 14, 2019 at 6:28 AM Jason Gunthorpe > > > > wrote: > > > > > > > > > > On Wed, Aug 14, 2019 at 09:38:54AM +0200, Christoph Hellwig wrote: > > > > > > On Tue, Aug 13, 2019 at 06:36:33PM -0700, Dan Williams wrote: > > > > > > > Section alignment constraints somewhat save us here. The only > > > > > > > example > > > > > > > I can think of a PMD not containing a uniform pgmap association > > > > > > > for > > > > > > > each pte is the case when the pgmap overlaps normal dram, i.e. > > > > > > > shares > > > > > > > the same 'struct memory_section' for a given span. Otherwise, > > > > > > > distinct > > > > > > > pgmaps arrange to manage their own exclusive sections (and now > > > > > > > subsections as of v5.3). Otherwise the implementation could not > > > > > > > guarantee different mapping lifetimes. > > > > > > > > > > > > > > That said, this seems to want a better mechanism to determine > > > > > > > "pfn is > > > > > > > ZONE_DEVICE". > > > > > > > > > > > > So I guess this patch is fine for now, and once you provide a better > > > > > > mechanism we can switch over to it? > > > > > > > > > > What about the version I sent to just get rid of all the strange > > > > > put_dev_pagemaps while scanning? Odds are good we will work with only > > > > > a single pagemap, so it makes some sense to cache it once we find it? > > > > > > > > Yes, if the scan is over a single pmd then caching it makes sense. > > > > > > Quite frankly an easier an better solution is to remove the pagemap > > > lookup as HMM user abide by mmu notifier it means we will not make > > > use or dereference the struct page so that we are safe from any > > > racing hotunplug of dax memory (as long as device driver using hmm > > > do not have a bug). > > > > Yes, as long as the driver remove is synchronized against HMM > > operations via another mechanism then there is no need to take pagemap > > references. Can you briefly describe what that other mechanism is? > > So if you hotunplug some dax memory i assume that this can only > happens once all the pages are unmapped (as it must have the > zero refcount, well 1 because of the bias) and any unmap will > trigger a mmu notifier callback. User of hmm mirror abiding by > the API will never make use of information they get through the > fault or snapshot function until checking for racing notifier > under lock. Hmm that first assumption is not guaranteed by the dev_pagemap core. The dev_pagemap end of life model is "disable, invalidate, drain" so it's possible to call devm_munmap_pages() while pages are still mapped it just won't complete the teardown of the pagemap until the last reference is dropped. New references are blocked during this teardown. However, if the driver is validating the liveness of the mapping in the mmu-notifier path and blocking new references it sounds like it should be ok. Might there be GPU driver unit tests that cover this racing teardown case? ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 12:36:58PM -0700, Dan Williams wrote: > On Thu, Aug 15, 2019 at 11:07 AM Jerome Glisse wrote: > > > > On Wed, Aug 14, 2019 at 07:48:28AM -0700, Dan Williams wrote: > > > On Wed, Aug 14, 2019 at 6:28 AM Jason Gunthorpe wrote: > > > > > > > > On Wed, Aug 14, 2019 at 09:38:54AM +0200, Christoph Hellwig wrote: > > > > > On Tue, Aug 13, 2019 at 06:36:33PM -0700, Dan Williams wrote: > > > > > > Section alignment constraints somewhat save us here. The only > > > > > > example > > > > > > I can think of a PMD not containing a uniform pgmap association for > > > > > > each pte is the case when the pgmap overlaps normal dram, i.e. > > > > > > shares > > > > > > the same 'struct memory_section' for a given span. Otherwise, > > > > > > distinct > > > > > > pgmaps arrange to manage their own exclusive sections (and now > > > > > > subsections as of v5.3). Otherwise the implementation could not > > > > > > guarantee different mapping lifetimes. > > > > > > > > > > > > That said, this seems to want a better mechanism to determine "pfn > > > > > > is > > > > > > ZONE_DEVICE". > > > > > > > > > > So I guess this patch is fine for now, and once you provide a better > > > > > mechanism we can switch over to it? > > > > > > > > What about the version I sent to just get rid of all the strange > > > > put_dev_pagemaps while scanning? Odds are good we will work with only > > > > a single pagemap, so it makes some sense to cache it once we find it? > > > > > > Yes, if the scan is over a single pmd then caching it makes sense. > > > > Quite frankly an easier an better solution is to remove the pagemap > > lookup as HMM user abide by mmu notifier it means we will not make > > use or dereference the struct page so that we are safe from any > > racing hotunplug of dax memory (as long as device driver using hmm > > do not have a bug). > > Yes, as long as the driver remove is synchronized against HMM > operations via another mechanism then there is no need to take pagemap > references. Can you briefly describe what that other mechanism is? So if you hotunplug some dax memory i assume that this can only happens once all the pages are unmapped (as it must have the zero refcount, well 1 because of the bias) and any unmap will trigger a mmu notifier callback. User of hmm mirror abiding by the API will never make use of information they get through the fault or snapshot function until checking for racing notifier under lock. Cheers, Jérôme
Re: [Nouveau] [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 11:07 AM Jerome Glisse wrote: > > On Wed, Aug 14, 2019 at 07:48:28AM -0700, Dan Williams wrote: > > On Wed, Aug 14, 2019 at 6:28 AM Jason Gunthorpe wrote: > > > > > > On Wed, Aug 14, 2019 at 09:38:54AM +0200, Christoph Hellwig wrote: > > > > On Tue, Aug 13, 2019 at 06:36:33PM -0700, Dan Williams wrote: > > > > > Section alignment constraints somewhat save us here. The only example > > > > > I can think of a PMD not containing a uniform pgmap association for > > > > > each pte is the case when the pgmap overlaps normal dram, i.e. shares > > > > > the same 'struct memory_section' for a given span. Otherwise, distinct > > > > > pgmaps arrange to manage their own exclusive sections (and now > > > > > subsections as of v5.3). Otherwise the implementation could not > > > > > guarantee different mapping lifetimes. > > > > > > > > > > That said, this seems to want a better mechanism to determine "pfn is > > > > > ZONE_DEVICE". > > > > > > > > So I guess this patch is fine for now, and once you provide a better > > > > mechanism we can switch over to it? > > > > > > What about the version I sent to just get rid of all the strange > > > put_dev_pagemaps while scanning? Odds are good we will work with only > > > a single pagemap, so it makes some sense to cache it once we find it? > > > > Yes, if the scan is over a single pmd then caching it makes sense. > > Quite frankly an easier an better solution is to remove the pagemap > lookup as HMM user abide by mmu notifier it means we will not make > use or dereference the struct page so that we are safe from any > racing hotunplug of dax memory (as long as device driver using hmm > do not have a bug). Yes, as long as the driver remove is synchronized against HMM operations via another mechanism then there is no need to take pagemap references. Can you briefly describe what that other mechanism is? ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Thu, Aug 15, 2019 at 02:03:25PM -0400, Jerome Glisse wrote: > On Wed, Aug 14, 2019 at 07:48:28AM -0700, Dan Williams wrote: > > On Wed, Aug 14, 2019 at 6:28 AM Jason Gunthorpe wrote: > > > > > > On Wed, Aug 14, 2019 at 09:38:54AM +0200, Christoph Hellwig wrote: > > > > On Tue, Aug 13, 2019 at 06:36:33PM -0700, Dan Williams wrote: > > > > > Section alignment constraints somewhat save us here. The only example > > > > > I can think of a PMD not containing a uniform pgmap association for > > > > > each pte is the case when the pgmap overlaps normal dram, i.e. shares > > > > > the same 'struct memory_section' for a given span. Otherwise, distinct > > > > > pgmaps arrange to manage their own exclusive sections (and now > > > > > subsections as of v5.3). Otherwise the implementation could not > > > > > guarantee different mapping lifetimes. > > > > > > > > > > That said, this seems to want a better mechanism to determine "pfn is > > > > > ZONE_DEVICE". > > > > > > > > So I guess this patch is fine for now, and once you provide a better > > > > mechanism we can switch over to it? > > > > > > What about the version I sent to just get rid of all the strange > > > put_dev_pagemaps while scanning? Odds are good we will work with only > > > a single pagemap, so it makes some sense to cache it once we find it? > > > > Yes, if the scan is over a single pmd then caching it makes sense. > > Quite frankly an easier an better solution is to remove the pagemap > lookup as HMM user abide by mmu notifier it means we will not make > use or dereference the struct page so that we are safe from any > racing hotunplug of dax memory (as long as device driver using hmm > do not have a bug). Yes, I also would prefer to drop the confusing checks entirely - Christoph can you resend this patch? Thanks, Jason ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 12:19 PM wrote: > > > -Original Message- > > From: Takashi Iwai > > Sent: Thursday, August 15, 2019 9:57 AM > > To: Alex Deucher > > Cc: Karol Herbst; Limonciello, Mario; nouveau; Rafael J . Wysocki; LKML; > > dri-devel; > > Linux ACPI Mailing List; Alex Hung; Ben Skeggs; David Airlie > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string > > to > > enable dGPU direct output" > > > > > > [EXTERNAL EMAIL] > > > > On Thu, 15 Aug 2019 16:37:05 +0200, > > Alex Deucher wrote: > > > > > > On Thu, Aug 15, 2019 at 10:25 AM Karol Herbst wrote: > > > > > > > > On Thu, Aug 15, 2019 at 4:20 PM wrote: > > > > > > > > > > > > There are definitely going to be regressions on machines in the > > > > > > > field > > with the > > > > > > > in tree drivers by reverting this. I think we should have an > > > > > > > answer for all > > of > > > > > > those > > > > > > > before this revert is accepted. > > > > > > > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with > > > > > > > partners > > to > > > > > > collect > > > > > > > some information on the impact of reverting this. > > > > > > > > > > > > > > When this is used on a system with Intel+AMD the ASL configures > > > > > > > AMD > > GPU to > > > > > > use > > > > > > > "Hybrid Graphics" when on Windows and "Power Express" and > > "Switchable > > > > > > Graphics" > > > > > > > when on Linux. > > > > > > > > > > > > and what's exactly the difference between those? And what's the > > > > > > actual > > > > > > issue here? > > > > > > > > > > DP/HDMI is not detected unless plugged in at bootup. It's due to > > > > > missing > > HPD > > > > > events. > > > > > > > > > > > > > afaik Lyude was working on fixing all that, at least for some drivers. > > > > If there is something wrong, we still should fix the drivers, not > > > > adding ACPI workarounds. > > > > > > > > Alex: do you know if there are remaining issues regarding that with > > > > amdgpu? > > > > > > There was an issue with hpd events not making it to the audio side > > > when things were powered down that was fixed with this patch set: > > > https://patchwork.freedesktop.org/patch/316793/ > > > Those patches depended on a bunch of alsa changes as well which may > > > have not been available in the distro used for a particular OEM > > > program. > > > > FYI, the corresponding commit for ALSA part is destined for 5.4 > > kernel: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/commit/?id=ade > > 49db337a9d44ac5835cfce1ee873549011b27 > > > > BTW, Nouveau should suffer from the same problem. The patch to add > > the audio component support is found at: > > https://patchwork.freedesktop.org/patch/319131/ > > > > > > It sounds like 5.3rcX won't be a useful check then. > > So am I correct to understand that everything related to the AMD failures > described in this thread should be in linux-next at this point? > Assuming you mean the missing audio hotplug events, then yes. Alex ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [RFC] drm: Bump encoder limit from 32 to 64
On Thu, Aug 15, 2019 at 8:03 PM Lyude Paul wrote: > > Assuming that GPUs would never have even close to 32 separate video > encoders is quite honestly a pretty reasonable assumption. Unfortunately > we do not live in a reasonable world, as it looks like it is actually > possible to find devices that will create more drm_encoder objects then > this. Case in point: the ThinkPad P71's discrete GPU, which exposes 1 > eDP port and 5 DP ports. On the P71, nouveau attempts to create one > encoder for the eDP port, and two encoders for each DP++/USB-C port > along with 4 MST encoders for each DP port. This comes out to 35 > different encoders. Unfortunately, this can't really be optimized to > make less encoders either. > > So, what if we bumped the limit to 64? Unfortunately this has one very > awkward drawback: we already expose 32-bit bitmasks for encoders to > userspace in drm_encoder->possible_clones. Yikes. Luckily for us > however, the year is 2019 and modern hardware that supports cloning is > basically non-existent. > > So, let's try to compromise here: allow encoders with indexes <32 to > have non-zero values in drm_encoder->possible_clones, and don't allow > encoders with higher indexes to set drm_encoder->possible_clones to a > non-zero value. This allows us to avoid breaking UAPI, while still being > able to bump up the encoder limit. > > This also fixes driver probing for nouveau on the ThinkPad P71. > > Signed-off-by: Lyude Paul > Cc: nouveau@lists.freedesktop.org We should never have exposed encoders, that was such a mistake ... Does nouveau really still use encoders as datastructures internally for modeset? Otherwise could we perhaps just expose a set of fake encoders (one per connector) and give up on all these tricks? Maybe even as a drm level generic thing, i.e. at drm_register_time we check whether there's any encoders with ->possible_clones set (the only real information in an encoder, and fairly useless on modern hw like you point out). And if that's the case we hide all the "real" encoders from userspace and put up a pile of fake ones. Which do nothing else than keep userspace happy. Thoughts? > --- > drivers/gpu/drm/drm_atomic.c | 2 +- > drivers/gpu/drm/drm_encoder.c | 10 -- > include/drm/drm_crtc.h| 2 +- > include/drm/drm_encoder.h | 20 +++- > 4 files changed, 25 insertions(+), 9 deletions(-) > > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c > index 419381abbdd1..27ce988ef0cc 100644 > --- a/drivers/gpu/drm/drm_atomic.c > +++ b/drivers/gpu/drm/drm_atomic.c > @@ -392,7 +392,7 @@ static void drm_atomic_crtc_print_state(struct > drm_printer *p, > drm_printf(p, "\tcolor_mgmt_changed=%d\n", state->color_mgmt_changed); > drm_printf(p, "\tplane_mask=%x\n", state->plane_mask); > drm_printf(p, "\tconnector_mask=%x\n", state->connector_mask); > - drm_printf(p, "\tencoder_mask=%x\n", state->encoder_mask); > + drm_printf(p, "\tencoder_mask=%llx\n", state->encoder_mask); > drm_printf(p, "\tmode: " DRM_MODE_FMT "\n", > DRM_MODE_ARG(>mode)); > > if (crtc->funcs->atomic_print_state) > diff --git a/drivers/gpu/drm/drm_encoder.c b/drivers/gpu/drm/drm_encoder.c > index 7fb47b7b8b44..e4b8f675aa81 100644 > --- a/drivers/gpu/drm/drm_encoder.c > +++ b/drivers/gpu/drm/drm_encoder.c > @@ -112,8 +112,14 @@ int drm_encoder_init(struct drm_device *dev, > { > int ret; > > - /* encoder index is used with 32bit bitmasks */ > - if (WARN_ON(dev->mode_config.num_encoder >= 32)) > + /* > +* Since possible_clones has been exposed to userspace as a 32bit > +* bitmask, we don't allow creating encoders with an index >=32 which > +* are capable of cloning. > +*/ > + if (WARN_ON(dev->mode_config.num_encoder >= 64) || > + WARN_ON(dev->mode_config.num_encoder >= 32 && > + encoder->possible_clones)) This is too early, drivers are free to change ->possible_clones later on. We've have to recheck that at drm_dev_register() time I think. -Daniel > return -EINVAL; > > ret = drm_mode_object_add(dev, >base, > DRM_MODE_OBJECT_ENCODER); > diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h > index 7d14c11bdc0a..fd0b2438c3d5 100644 > --- a/include/drm/drm_crtc.h > +++ b/include/drm/drm_crtc.h > @@ -210,7 +210,7 @@ struct drm_crtc_state { > * @encoder_mask: Bitmask of drm_encoder_mask(encoder) of encoders > * attached to this CRTC. > */ > - u32 encoder_mask; > + u64 encoder_mask; > > /** > * @adjusted_mode: > diff --git a/include/drm/drm_encoder.h b/include/drm/drm_encoder.h > index 70cfca03d812..3f9cb65694e1 100644 > --- a/include/drm/drm_encoder.h > +++ b/include/drm/drm_encoder.h > @@ -159,7 +159,15 @@ struct drm_encoder { > * encoders can be used in a cloned configuration, they both should > have > * each
Re: [PATCH v7 1/9] drm_dp_cec: add connector info support.
Reviewed-by: Lyude Paul On Wed, 2019-08-14 at 12:44 +0200, Dariusz Marcinkiewicz wrote: > Pass the connector info to the CEC adapter. This makes it possible > to associate the CEC adapter with the corresponding drm connector. > > Signed-off-by: Dariusz Marcinkiewicz > Signed-off-by: Hans Verkuil > Tested-by: Hans Verkuil > --- > .../display/amdgpu_dm/amdgpu_dm_mst_types.c | 2 +- > drivers/gpu/drm/drm_dp_cec.c | 25 --- > drivers/gpu/drm/i915/display/intel_dp.c | 4 +-- > drivers/gpu/drm/nouveau/nouveau_connector.c | 3 +-- > include/drm/drm_dp_helper.h | 17 ++--- > 5 files changed, 27 insertions(+), 24 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c > index 16218a202b591..5ec14efd4d8cb 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c > @@ -416,7 +416,7 @@ void amdgpu_dm_initialize_dp_connector(struct > amdgpu_display_manager *dm, > > drm_dp_aux_register(>dm_dp_aux.aux); > drm_dp_cec_register_connector(>dm_dp_aux.aux, > - aconnector->base.name, dm->adev->dev); > + >base); > aconnector->mst_mgr.cbs = _mst_cbs; > drm_dp_mst_topology_mgr_init( > >mst_mgr, > diff --git a/drivers/gpu/drm/drm_dp_cec.c b/drivers/gpu/drm/drm_dp_cec.c > index b15cee85b702b..b457c16c3a8bb 100644 > --- a/drivers/gpu/drm/drm_dp_cec.c > +++ b/drivers/gpu/drm/drm_dp_cec.c > @@ -8,7 +8,9 @@ > #include > #include > #include > +#include > #include > +#include > #include > > /* > @@ -295,7 +297,10 @@ static void drm_dp_cec_unregister_work(struct work_struct > *work) > */ > void drm_dp_cec_set_edid(struct drm_dp_aux *aux, const struct edid *edid) > { > - u32 cec_caps = CEC_CAP_DEFAULTS | CEC_CAP_NEEDS_HPD; > + struct drm_connector *connector = aux->cec.connector; > + u32 cec_caps = CEC_CAP_DEFAULTS | CEC_CAP_NEEDS_HPD | > +CEC_CAP_CONNECTOR_INFO; > + struct cec_connector_info conn_info; > unsigned int num_las = 1; > u8 cap; > > @@ -344,13 +349,17 @@ void drm_dp_cec_set_edid(struct drm_dp_aux *aux, const > struct edid *edid) > > /* Create a new adapter */ > aux->cec.adap = cec_allocate_adapter(_dp_cec_adap_ops, > - aux, aux->cec.name, cec_caps, > + aux, connector->name, cec_caps, >num_las); > if (IS_ERR(aux->cec.adap)) { > aux->cec.adap = NULL; > goto unlock; > } > - if (cec_register_adapter(aux->cec.adap, aux->cec.parent)) { > + > + cec_fill_conn_info_from_drm(_info, connector); > + cec_s_conn_info(aux->cec.adap, _info); > + > + if (cec_register_adapter(aux->cec.adap, connector->dev->dev)) { > cec_delete_adapter(aux->cec.adap); > aux->cec.adap = NULL; > } else { > @@ -406,22 +415,20 @@ EXPORT_SYMBOL(drm_dp_cec_unset_edid); > /** > * drm_dp_cec_register_connector() - register a new connector > * @aux: DisplayPort AUX channel > - * @name: name of the CEC device > - * @parent: parent device > + * @connector: drm connector > * > * A new connector was registered with associated CEC adapter name and > * CEC adapter parent device. After registering the name and parent > * drm_dp_cec_set_edid() is called to check if the connector supports > * CEC and to register a CEC adapter if that is the case. > */ > -void drm_dp_cec_register_connector(struct drm_dp_aux *aux, const char *name, > -struct device *parent) > +void drm_dp_cec_register_connector(struct drm_dp_aux *aux, > +struct drm_connector *connector) > { > WARN_ON(aux->cec.adap); > if (WARN_ON(!aux->transfer)) > return; > - aux->cec.name = name; > - aux->cec.parent = parent; > + aux->cec.connector = connector; > INIT_DELAYED_WORK(>cec.unregister_work, > drm_dp_cec_unregister_work); > } > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c > b/drivers/gpu/drm/i915/display/intel_dp.c > index 1092499115760..de2486fe7bf2d 100644 > --- a/drivers/gpu/drm/i915/display/intel_dp.c > +++ b/drivers/gpu/drm/i915/display/intel_dp.c > @@ -5497,7 +5497,6 @@ static int > intel_dp_connector_register(struct drm_connector *connector) > { > struct intel_dp *intel_dp = intel_attached_dp(connector); > - struct drm_device *dev = connector->dev; > int ret; > > ret = intel_connector_register(connector); > @@ -5512,8 +5511,7 @@ intel_dp_connector_register(struct drm_connector > *connector) > intel_dp->aux.dev = connector->kdev; > ret =
[Nouveau] [RFC] drm: Bump encoder limit from 32 to 64
Assuming that GPUs would never have even close to 32 separate video encoders is quite honestly a pretty reasonable assumption. Unfortunately we do not live in a reasonable world, as it looks like it is actually possible to find devices that will create more drm_encoder objects then this. Case in point: the ThinkPad P71's discrete GPU, which exposes 1 eDP port and 5 DP ports. On the P71, nouveau attempts to create one encoder for the eDP port, and two encoders for each DP++/USB-C port along with 4 MST encoders for each DP port. This comes out to 35 different encoders. Unfortunately, this can't really be optimized to make less encoders either. So, what if we bumped the limit to 64? Unfortunately this has one very awkward drawback: we already expose 32-bit bitmasks for encoders to userspace in drm_encoder->possible_clones. Yikes. Luckily for us however, the year is 2019 and modern hardware that supports cloning is basically non-existent. So, let's try to compromise here: allow encoders with indexes <32 to have non-zero values in drm_encoder->possible_clones, and don't allow encoders with higher indexes to set drm_encoder->possible_clones to a non-zero value. This allows us to avoid breaking UAPI, while still being able to bump up the encoder limit. This also fixes driver probing for nouveau on the ThinkPad P71. Signed-off-by: Lyude Paul Cc: nouveau@lists.freedesktop.org --- drivers/gpu/drm/drm_atomic.c | 2 +- drivers/gpu/drm/drm_encoder.c | 10 -- include/drm/drm_crtc.h| 2 +- include/drm/drm_encoder.h | 20 +++- 4 files changed, 25 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index 419381abbdd1..27ce988ef0cc 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -392,7 +392,7 @@ static void drm_atomic_crtc_print_state(struct drm_printer *p, drm_printf(p, "\tcolor_mgmt_changed=%d\n", state->color_mgmt_changed); drm_printf(p, "\tplane_mask=%x\n", state->plane_mask); drm_printf(p, "\tconnector_mask=%x\n", state->connector_mask); - drm_printf(p, "\tencoder_mask=%x\n", state->encoder_mask); + drm_printf(p, "\tencoder_mask=%llx\n", state->encoder_mask); drm_printf(p, "\tmode: " DRM_MODE_FMT "\n", DRM_MODE_ARG(>mode)); if (crtc->funcs->atomic_print_state) diff --git a/drivers/gpu/drm/drm_encoder.c b/drivers/gpu/drm/drm_encoder.c index 7fb47b7b8b44..e4b8f675aa81 100644 --- a/drivers/gpu/drm/drm_encoder.c +++ b/drivers/gpu/drm/drm_encoder.c @@ -112,8 +112,14 @@ int drm_encoder_init(struct drm_device *dev, { int ret; - /* encoder index is used with 32bit bitmasks */ - if (WARN_ON(dev->mode_config.num_encoder >= 32)) + /* +* Since possible_clones has been exposed to userspace as a 32bit +* bitmask, we don't allow creating encoders with an index >=32 which +* are capable of cloning. +*/ + if (WARN_ON(dev->mode_config.num_encoder >= 64) || + WARN_ON(dev->mode_config.num_encoder >= 32 && + encoder->possible_clones)) return -EINVAL; ret = drm_mode_object_add(dev, >base, DRM_MODE_OBJECT_ENCODER); diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h index 7d14c11bdc0a..fd0b2438c3d5 100644 --- a/include/drm/drm_crtc.h +++ b/include/drm/drm_crtc.h @@ -210,7 +210,7 @@ struct drm_crtc_state { * @encoder_mask: Bitmask of drm_encoder_mask(encoder) of encoders * attached to this CRTC. */ - u32 encoder_mask; + u64 encoder_mask; /** * @adjusted_mode: diff --git a/include/drm/drm_encoder.h b/include/drm/drm_encoder.h index 70cfca03d812..3f9cb65694e1 100644 --- a/include/drm/drm_encoder.h +++ b/include/drm/drm_encoder.h @@ -159,7 +159,15 @@ struct drm_encoder { * encoders can be used in a cloned configuration, they both should have * each another bits set. * -* In reality almost every driver gets this wrong. +* In reality almost every driver gets this wrong, and most modern +* display hardware does not have support for cloning. As well, while we +* expose this mask to userspace as 32bits long, we do sure purely to +* avoid breaking pre-existing UAPI since the limitation on the number +* of encoders has been increased from 32 bits to 64 bits. In order to +* maintain functionality for drivers which do actually support cloning, +* we only allow cloning with encoders that have an index <32. Encoders +* with indexes higher than 32 are not allowed to specify a non-zero +* value here. * * Note that since encoder objects can't be hotplugged the assigned indices * are stable and hence known before registering all objects. @@ -198,13 +206,15 @@ static inline unsigned int drm_encoder_index(const struct drm_encoder *encoder) } /** - *
Re: [Nouveau] [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
On Wed, Aug 14, 2019 at 07:48:28AM -0700, Dan Williams wrote: > On Wed, Aug 14, 2019 at 6:28 AM Jason Gunthorpe wrote: > > > > On Wed, Aug 14, 2019 at 09:38:54AM +0200, Christoph Hellwig wrote: > > > On Tue, Aug 13, 2019 at 06:36:33PM -0700, Dan Williams wrote: > > > > Section alignment constraints somewhat save us here. The only example > > > > I can think of a PMD not containing a uniform pgmap association for > > > > each pte is the case when the pgmap overlaps normal dram, i.e. shares > > > > the same 'struct memory_section' for a given span. Otherwise, distinct > > > > pgmaps arrange to manage their own exclusive sections (and now > > > > subsections as of v5.3). Otherwise the implementation could not > > > > guarantee different mapping lifetimes. > > > > > > > > That said, this seems to want a better mechanism to determine "pfn is > > > > ZONE_DEVICE". > > > > > > So I guess this patch is fine for now, and once you provide a better > > > mechanism we can switch over to it? > > > > What about the version I sent to just get rid of all the strange > > put_dev_pagemaps while scanning? Odds are good we will work with only > > a single pagemap, so it makes some sense to cache it once we find it? > > Yes, if the scan is over a single pmd then caching it makes sense. Quite frankly an easier an better solution is to remove the pagemap lookup as HMM user abide by mmu notifier it means we will not make use or dereference the struct page so that we are safe from any racing hotunplug of dax memory (as long as device driver using hmm do not have a bug). Cheers, Jérôme ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, 15 Aug 2019 18:19:52 +0200, wrote: > > > -Original Message- > > From: Takashi Iwai > > Sent: Thursday, August 15, 2019 9:57 AM > > To: Alex Deucher > > Cc: Karol Herbst; Limonciello, Mario; nouveau; Rafael J . Wysocki; LKML; > > dri-devel; > > Linux ACPI Mailing List; Alex Hung; Ben Skeggs; David Airlie > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string > > to > > enable dGPU direct output" > > > > > > [EXTERNAL EMAIL] > > > > On Thu, 15 Aug 2019 16:37:05 +0200, > > Alex Deucher wrote: > > > > > > On Thu, Aug 15, 2019 at 10:25 AM Karol Herbst wrote: > > > > > > > > On Thu, Aug 15, 2019 at 4:20 PM wrote: > > > > > > > > > > > > There are definitely going to be regressions on machines in the > > > > > > > field > > with the > > > > > > > in tree drivers by reverting this. I think we should have an > > > > > > > answer for all > > of > > > > > > those > > > > > > > before this revert is accepted. > > > > > > > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with > > > > > > > partners > > to > > > > > > collect > > > > > > > some information on the impact of reverting this. > > > > > > > > > > > > > > When this is used on a system with Intel+AMD the ASL configures > > > > > > > AMD > > GPU to > > > > > > use > > > > > > > "Hybrid Graphics" when on Windows and "Power Express" and > > "Switchable > > > > > > Graphics" > > > > > > > when on Linux. > > > > > > > > > > > > and what's exactly the difference between those? And what's the > > > > > > actual > > > > > > issue here? > > > > > > > > > > DP/HDMI is not detected unless plugged in at bootup. It's due to > > > > > missing > > HPD > > > > > events. > > > > > > > > > > > > > afaik Lyude was working on fixing all that, at least for some drivers. > > > > If there is something wrong, we still should fix the drivers, not > > > > adding ACPI workarounds. > > > > > > > > Alex: do you know if there are remaining issues regarding that with > > > > amdgpu? > > > > > > There was an issue with hpd events not making it to the audio side > > > when things were powered down that was fixed with this patch set: > > > https://patchwork.freedesktop.org/patch/316793/ > > > Those patches depended on a bunch of alsa changes as well which may > > > have not been available in the distro used for a particular OEM > > > program. > > > > FYI, the corresponding commit for ALSA part is destined for 5.4 > > kernel: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/commit/?id=ade > > 49db337a9d44ac5835cfce1ee873549011b27 > > > > BTW, Nouveau should suffer from the same problem. The patch to add > > the audio component support is found at: > > https://patchwork.freedesktop.org/patch/319131/ > > > > > > It sounds like 5.3rcX won't be a useful check then. For HDMI/DP audio, right, the main pieces are still missing in 5.3. But there is no regression in this regard, OTOH. > So am I correct to understand that everything related to the AMD failures > described in this thread should be in linux-next at this point? I'm not sure what you're referring as "AMD failures", so leave this question to AMD people :) Takashi ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH] nouveau/hmm: map pages after migration
On Tue, Aug 13, 2019 at 05:58:52PM -0400, Jerome Glisse wrote: > On Wed, Aug 07, 2019 at 08:02:14AM -0700, Ralph Campbell wrote: > > When memory is migrated to the GPU it is likely to be accessed by GPU > > code soon afterwards. Instead of waiting for a GPU fault, map the > > migrated memory into the GPU page tables with the same access permissions > > as the source CPU page table entries. This preserves copy on write > > semantics. > > > > Signed-off-by: Ralph Campbell > > Cc: Christoph Hellwig > > Cc: Jason Gunthorpe > > Cc: "Jérôme Glisse" > > Cc: Ben Skeggs > > Sorry for delay i am swamp, couple issues: > - nouveau_pfns_map() is never call, it should be call after > the dma copy is done (iirc it is lacking proper fencing > so that would need to be implemented first) > > - the migrate ioctl is disconnected from the svm part and > thus we would need first to implement svm reference counting > and take a reference at begining of migration process and > release it at the end ie struct nouveau_svmm needs refcounting > of some sort. I let Ben decides what he likes best for that. Thinking more about that the svm lifetime is bound to the file descriptor on the device driver file held by the process. So when you call migrate ioctl the svm should not go away because you are in an ioctl against the file descriptor. I need to double check all that in respect of process that open the device file multiple time with different file descriptor (or fork thing and all). > - i rather not have an extra pfns array, i am pretty sure we > can directly feed what we get from the dma array to the svm > code to update the GPU page table > > Observation that can be delayed to latter patches: > - i do not think we want to map directly if the dma engine > is queue up and thus if the fence will take time to signal > > This is why i did not implement this in the first place. > Maybe using a workqueue to schedule a pre-fill of the GPU > page table and wakeup the workqueue with the fence notify > event. > > > > --- > > > > This patch is based on top of Christoph Hellwig's 9 patch series > > https://lore.kernel.org/linux-mm/20190729234611.gc7...@redhat.com/T/#u > > "turn the hmm migrate_vma upside down" but without patch 9 > > "mm: remove the unused MIGRATE_PFN_WRITE" and adds a use for the flag. > > > > > > drivers/gpu/drm/nouveau/nouveau_dmem.c | 45 +- > > drivers/gpu/drm/nouveau/nouveau_svm.c | 86 ++ > > drivers/gpu/drm/nouveau/nouveau_svm.h | 19 ++ > > 3 files changed, 133 insertions(+), 17 deletions(-) > > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c > > b/drivers/gpu/drm/nouveau/nouveau_dmem.c > > index ef9de82b0744..c83e6f118817 100644 > > --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c > > +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c > > @@ -25,11 +25,13 @@ > > #include "nouveau_dma.h" > > #include "nouveau_mem.h" > > #include "nouveau_bo.h" > > +#include "nouveau_svm.h" > > > > #include > > #include > > #include > > #include > > +#include > > > > #include > > #include > > @@ -560,11 +562,12 @@ nouveau_dmem_init(struct nouveau_drm *drm) > > } > > > > static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm, > > - struct vm_area_struct *vma, unsigned long addr, > > - unsigned long src, dma_addr_t *dma_addr) > > + struct vm_area_struct *vma, unsigned long src, > > + dma_addr_t *dma_addr, u64 *pfn) > > { > > struct device *dev = drm->dev->dev; > > struct page *dpage, *spage; > > + unsigned long paddr; > > > > spage = migrate_pfn_to_page(src); > > if (!spage || !(src & MIGRATE_PFN_MIGRATE)) > > @@ -572,17 +575,21 @@ static unsigned long > > nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm, > > > > dpage = nouveau_dmem_page_alloc_locked(drm); > > if (!dpage) > > - return 0; > > + goto out; > > > > *dma_addr = dma_map_page(dev, spage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL); > > if (dma_mapping_error(dev, *dma_addr)) > > goto out_free_page; > > > > + paddr = nouveau_dmem_page_addr(dpage); > > if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_VRAM, > > - nouveau_dmem_page_addr(dpage), NOUVEAU_APER_HOST, > > - *dma_addr)) > > + paddr, NOUVEAU_APER_HOST, *dma_addr)) > > goto out_dma_unmap; > > > > + *pfn = NVIF_VMM_PFNMAP_V0_V | NVIF_VMM_PFNMAP_V0_VRAM | > > + ((paddr >> PAGE_SHIFT) << NVIF_VMM_PFNMAP_V0_ADDR_SHIFT); > > + if (src & MIGRATE_PFN_WRITE) > > + *pfn |= NVIF_VMM_PFNMAP_V0_W; > > return migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; > > > > out_dma_unmap: > > @@ -590,18 +597,19 @@ static unsigned long > > nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm, > > out_free_page: > >
RE: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
> -Original Message- > From: Takashi Iwai > Sent: Thursday, August 15, 2019 9:57 AM > To: Alex Deucher > Cc: Karol Herbst; Limonciello, Mario; nouveau; Rafael J . Wysocki; LKML; > dri-devel; > Linux ACPI Mailing List; Alex Hung; Ben Skeggs; David Airlie > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to > enable dGPU direct output" > > > [EXTERNAL EMAIL] > > On Thu, 15 Aug 2019 16:37:05 +0200, > Alex Deucher wrote: > > > > On Thu, Aug 15, 2019 at 10:25 AM Karol Herbst wrote: > > > > > > On Thu, Aug 15, 2019 at 4:20 PM wrote: > > > > > > > > > > There are definitely going to be regressions on machines in the > > > > > > field > with the > > > > > > in tree drivers by reverting this. I think we should have an > > > > > > answer for all > of > > > > > those > > > > > > before this revert is accepted. > > > > > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with > > > > > > partners > to > > > > > collect > > > > > > some information on the impact of reverting this. > > > > > > > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD > GPU to > > > > > use > > > > > > "Hybrid Graphics" when on Windows and "Power Express" and > "Switchable > > > > > Graphics" > > > > > > when on Linux. > > > > > > > > > > and what's exactly the difference between those? And what's the actual > > > > > issue here? > > > > > > > > DP/HDMI is not detected unless plugged in at bootup. It's due to > > > > missing > HPD > > > > events. > > > > > > > > > > afaik Lyude was working on fixing all that, at least for some drivers. > > > If there is something wrong, we still should fix the drivers, not > > > adding ACPI workarounds. > > > > > > Alex: do you know if there are remaining issues regarding that with > > > amdgpu? > > > > There was an issue with hpd events not making it to the audio side > > when things were powered down that was fixed with this patch set: > > https://patchwork.freedesktop.org/patch/316793/ > > Those patches depended on a bunch of alsa changes as well which may > > have not been available in the distro used for a particular OEM > > program. > > FYI, the corresponding commit for ALSA part is destined for 5.4 > kernel: > > https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/commit/?id=ade > 49db337a9d44ac5835cfce1ee873549011b27 > > BTW, Nouveau should suffer from the same problem. The patch to add > the audio component support is found at: > https://patchwork.freedesktop.org/patch/319131/ > > It sounds like 5.3rcX won't be a useful check then. So am I correct to understand that everything related to the AMD failures described in this thread should be in linux-next at this point?
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 10:37 AM Alex Deucher wrote: > > On Thu, Aug 15, 2019 at 10:25 AM Karol Herbst wrote: > > > > On Thu, Aug 15, 2019 at 4:20 PM wrote: > > > > > > > > There are definitely going to be regressions on machines in the field > > > > > with the > > > > > in tree drivers by reverting this. I think we should have an answer > > > > > for all of > > > > those > > > > > before this revert is accepted. > > > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with partners > > > > > to > > > > collect > > > > > some information on the impact of reverting this. > > > > > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD > > > > > GPU to > > > > use > > > > > "Hybrid Graphics" when on Windows and "Power Express" and "Switchable > > > > Graphics" > > > > > when on Linux. > > > > > > > > and what's exactly the difference between those? And what's the actual > > > > issue here? > > > > > > DP/HDMI is not detected unless plugged in at bootup. It's due to missing > > > HPD > > > events. > > > > > > > afaik Lyude was working on fixing all that, at least for some drivers. > > If there is something wrong, we still should fix the drivers, not > > adding ACPI workarounds. > > > > Alex: do you know if there are remaining issues regarding that with amdgpu? > > There was an issue with hpd events not making it to the audio side > when things were powered down that was fixed with this patch set: > https://patchwork.freedesktop.org/patch/316793/ > Those patches depended on a bunch of alsa changes as well which may > have not been available in the distro used for a particular OEM > program. > > > > > > > > > > > We already have the PRIME offloading in place and if that's not > > > > enough, we should work on extending it, not adding some ACPI based > > > > workarounds, because that's exactly how that looks like. > > > > > > > > Also, was this discussed with anybody involved in the drm subsystem? > > > > > > > > > > > > > > I feel we need a knob and/or DMI detection to affect the changes that > > > > > the ASL > > > > > normally performs. > > > > > > > > Why do we have to do that on a firmware level at all? > > > > > > Folks from AMD Graphics team recommended this approach. From their > > > perspective > > > it's not a workaround. They view this as a different architecture for > > > AMD graphics driver on > > > Windows and AMD graphics w/ amdgpu driver. They have different ASL paths > > > used for > > > each. > > > > @alex: is this true? > > I'm not familiar with this patches in particular, but I know we've > done things with OEM programs to support Linux on platforms where > Linux support is lacking for in new features for the target distros. > E.g., when the first hybrid graphics laptops were coming out, Linux > didn't support it too well or at all depending on the timing, so the > bios exposed power express which was working well at the time if the > OS told ACPI it was Linux. FWIW, windows does something similar. I don't think windows 7 supports hybrid graphics either so if the OS tells ACPI it's windows 7, it gets power express instead of hybrid graphics as well. At least on laptops that support windows 7 in the first place. Alex > > Alex ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, 15 Aug 2019 16:37:05 +0200, Alex Deucher wrote: > > On Thu, Aug 15, 2019 at 10:25 AM Karol Herbst wrote: > > > > On Thu, Aug 15, 2019 at 4:20 PM wrote: > > > > > > > > There are definitely going to be regressions on machines in the field > > > > > with the > > > > > in tree drivers by reverting this. I think we should have an answer > > > > > for all of > > > > those > > > > > before this revert is accepted. > > > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with partners > > > > > to > > > > collect > > > > > some information on the impact of reverting this. > > > > > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD > > > > > GPU to > > > > use > > > > > "Hybrid Graphics" when on Windows and "Power Express" and "Switchable > > > > Graphics" > > > > > when on Linux. > > > > > > > > and what's exactly the difference between those? And what's the actual > > > > issue here? > > > > > > DP/HDMI is not detected unless plugged in at bootup. It's due to missing > > > HPD > > > events. > > > > > > > afaik Lyude was working on fixing all that, at least for some drivers. > > If there is something wrong, we still should fix the drivers, not > > adding ACPI workarounds. > > > > Alex: do you know if there are remaining issues regarding that with amdgpu? > > There was an issue with hpd events not making it to the audio side > when things were powered down that was fixed with this patch set: > https://patchwork.freedesktop.org/patch/316793/ > Those patches depended on a bunch of alsa changes as well which may > have not been available in the distro used for a particular OEM > program. FYI, the corresponding commit for ALSA part is destined for 5.4 kernel: https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/commit/?id=ade49db337a9d44ac5835cfce1ee873549011b27 BTW, Nouveau should suffer from the same problem. The patch to add the audio component support is found at: https://patchwork.freedesktop.org/patch/319131/ thanks, Takashi ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 10:30 AM wrote: > > > On Thu, Aug 15, 2019 at 10:15 AM Karol Herbst wrote: > > > > > > On Thu, Aug 15, 2019 at 4:13 PM Alex Deucher > > wrote: > > > > > > > > On Thu, Aug 15, 2019 at 10:04 AM Karol Herbst > > > > wrote: > > > > > > > > > > On Thu, Aug 15, 2019 at 3:56 PM wrote: > > > > > > > > > > > > > -Original Message- > > > > > > > From: linux-acpi-ow...@vger.kernel.org > ow...@vger.kernel.org> On > > > > > > > Behalf Of Dave Airlie > > > > > > > Sent: Wednesday, August 14, 2019 5:48 PM > > > > > > > To: Karol Herbst > > > > > > > Cc: LKML; Linux ACPI; dri-devel; nouveau; Rafael J . Wysocki; > > > > > > > Alex Hung; > > Ben > > > > > > > Skeggs; Dave Airlie > > > > > > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM > > > > > > > _OSI > > string to > > > > > > > enable dGPU direct output" > > > > > > > > > > > > > > On Thu, 15 Aug 2019 at 07:31, Karol Herbst > > wrote: > > > > > > > > > > > > > > > > This reverts commit 28586a51eea666d5531bcaef2f68e4abbd87242c. > > > > > > > > > > > > > > > > The original commit message didn't even make sense. AMD _does_ > > support it and > > > > > > > > it works with Nouveau as well. > > > > > > > > > > > > > > > > Also what was the issue being solved here? No references to any > > > > > > > > bugs > > and not > > > > > > > > even explaining any issue at all isn't the way we do things. > > > > > > > > > > > > > > > > And even if it means a muxed design, then the fix is to make it > > > > > > > > work > > inside the > > > > > > > > driver, not adding some hacky workaround through ACPI tricks. > > > > > > > > > > > > > > > > And what out of tree drivers do or do not support we don't care > > > > > > > > one > > bit anyway. > > > > > > > > > > > > > > > > > > > > > > I think the reverts should be merged via Rafael's tree as the > > > > > > > original > > > > > > > patches went in via there, and we should get them in asap. > > > > > > > > > > > > > > Acked-by: Dave Airlie > > > > > > > Dave. > > > > > > > > > > > > There are definitely going to be regressions on machines in the > > > > > > field with > > the > > > > > > in tree drivers by reverting this. I think we should have an > > > > > > answer for all > > of those > > > > > > before this revert is accepted. > > > > > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with > > > > > > partners to > > collect > > > > > > some information on the impact of reverting this. > > > > > > > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD > > GPU to use > > > > > > "Hybrid Graphics" when on Windows and "Power Express" and > > "Switchable Graphics" > > > > > > when on Linux. > > > > > > > > > > and what's exactly the difference between those? And what's the actual > > > > > issue here? > > > > > > > > Hybrid Graphics is the new "standard" way of handling these laptops. > > > > It uses the standard _PR3 APCI method to handle dGPU power. Support > > > > for this was added to Linux relatively later compared to when the > > > > laptops were launched. "Power Express" used the other AMD specific > > > > ATPX ACPI method to handle dGPU power. The driver supports both so > > > > either method will work. > > > > > > > > Alex > > > > > > > > > > thanks for clarifying. But that still means that we won't need such > > > workarounds for AMD users, right? amdgpu handles hybrid graphics just > > > fine, right? > > > > Yeah it should, assuming you have a new enough kernel which supports > > HG, which has been several years at this point IIRC. > > > > Alex > > > > Can you define how new of a kernel is a new enough kernel? > > Looking on my side these problems were on new hardware (Precision 7740) and > are checked as recently as start of this summer, w/ kernel 4.15. I don't remember off hand. It was before the _PR3 support in the pci subsystem was upstream. What was the problem you were seeing with this system? Alex > > We can arrange to have it checked again on 5.3rcX w/ the OSI disabled. ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 10:25 AM Karol Herbst wrote: > > On Thu, Aug 15, 2019 at 4:20 PM wrote: > > > > > > There are definitely going to be regressions on machines in the field > > > > with the > > > > in tree drivers by reverting this. I think we should have an answer > > > > for all of > > > those > > > > before this revert is accepted. > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with partners to > > > collect > > > > some information on the impact of reverting this. > > > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD GPU > > > > to > > > use > > > > "Hybrid Graphics" when on Windows and "Power Express" and "Switchable > > > Graphics" > > > > when on Linux. > > > > > > and what's exactly the difference between those? And what's the actual > > > issue here? > > > > DP/HDMI is not detected unless plugged in at bootup. It's due to missing > > HPD > > events. > > > > afaik Lyude was working on fixing all that, at least for some drivers. > If there is something wrong, we still should fix the drivers, not > adding ACPI workarounds. > > Alex: do you know if there are remaining issues regarding that with amdgpu? There was an issue with hpd events not making it to the audio side when things were powered down that was fixed with this patch set: https://patchwork.freedesktop.org/patch/316793/ Those patches depended on a bunch of alsa changes as well which may have not been available in the distro used for a particular OEM program. > > > > > > > We already have the PRIME offloading in place and if that's not > > > enough, we should work on extending it, not adding some ACPI based > > > workarounds, because that's exactly how that looks like. > > > > > > Also, was this discussed with anybody involved in the drm subsystem? > > > > > > > > > > > I feel we need a knob and/or DMI detection to affect the changes that > > > > the ASL > > > > normally performs. > > > > > > Why do we have to do that on a firmware level at all? > > > > Folks from AMD Graphics team recommended this approach. From their > > perspective > > it's not a workaround. They view this as a different architecture for AMD > > graphics driver on > > Windows and AMD graphics w/ amdgpu driver. They have different ASL paths > > used for > > each. > > @alex: is this true? I'm not familiar with this patches in particular, but I know we've done things with OEM programs to support Linux on platforms where Linux support is lacking for in new features for the target distros. E.g., when the first hybrid graphics laptops were coming out, Linux didn't support it too well or at all depending on the timing, so the bios exposed power express which was working well at the time if the OS told ACPI it was Linux. Alex
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 4:34 PM wrote: > > > -Original Message- > > From: Karol Herbst > > Sent: Thursday, August 15, 2019 9:25 AM > > To: Limonciello, Mario > > Cc: Dave Airlie; LKML; Linux ACPI Mailing List; dri-devel; nouveau; Rafael > > J . > > Wysocki; Alex Hung; Ben Skeggs; David Airlie > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string > > to > > enable dGPU direct output" > > > > > > [EXTERNAL EMAIL] > > > > On Thu, Aug 15, 2019 at 4:20 PM wrote: > > > > > > > > There are definitely going to be regressions on machines in the field > > > > > with > > the > > > > > in tree drivers by reverting this. I think we should have an answer > > > > > for all of > > > > those > > > > > before this revert is accepted. > > > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with partners > > > > > to > > > > collect > > > > > some information on the impact of reverting this. > > > > > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD > > GPU to > > > > use > > > > > "Hybrid Graphics" when on Windows and "Power Express" and "Switchable > > > > Graphics" > > > > > when on Linux. > > > > > > > > and what's exactly the difference between those? And what's the actual > > > > issue here? > > > > > > DP/HDMI is not detected unless plugged in at bootup. It's due to missing > > > HPD > > > events. > > > > > > > afaik Lyude was working on fixing all that, at least for some drivers. > > If there is something wrong, we still should fix the drivers, not > > adding ACPI workarounds. > > I don't disagree, but timing is frequently a limitation if you want the > hardware to > work when you put it on the market. > > The whole idea behind the OSI string was that it could be reverted when the > time > was right. From this discussion it very well may be for systems with AMD > GPU, but > it needs to be checked again. > > > > > Alex: do you know if there are remaining issues regarding that with amdgpu? > > > > > > > > > > We already have the PRIME offloading in place and if that's not > > > > enough, we should work on extending it, not adding some ACPI based > > > > workarounds, because that's exactly how that looks like. > > > > > > > > Also, was this discussed with anybody involved in the drm subsystem? > > > > > > > > > > > > > > I feel we need a knob and/or DMI detection to affect the changes that > > > > > the > > ASL > > > > > normally performs. > > > > > > > > Why do we have to do that on a firmware level at all? > > > > > > Folks from AMD Graphics team recommended this approach. > > I should clarify this is from the folks on AMD graphics team that interact to > OEMs > like Dell which are not necessarily the same folks who work on the drivers > directly. > yeah, I understand the pressure, and it might have been fine if we would have this discussion upstream, if it's about pushing things upstream. The commits itself didn't even make the impression that anybody with the drm drivers involved even looked at those patches and this is the worst part of those commits. > > From their perspective > > > it's not a workaround. They view this as a different architecture for AMD > > graphics driver on > > > Windows and AMD graphics w/ amdgpu driver. They have different ASL paths > > used for > > > each. > > > > @alex: is this true?
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 4:30 PM wrote: > > > On Thu, Aug 15, 2019 at 10:15 AM Karol Herbst wrote: > > > > > > On Thu, Aug 15, 2019 at 4:13 PM Alex Deucher > > wrote: > > > > > > > > On Thu, Aug 15, 2019 at 10:04 AM Karol Herbst > > > > wrote: > > > > > > > > > > On Thu, Aug 15, 2019 at 3:56 PM wrote: > > > > > > > > > > > > > -Original Message- > > > > > > > From: linux-acpi-ow...@vger.kernel.org > ow...@vger.kernel.org> On > > > > > > > Behalf Of Dave Airlie > > > > > > > Sent: Wednesday, August 14, 2019 5:48 PM > > > > > > > To: Karol Herbst > > > > > > > Cc: LKML; Linux ACPI; dri-devel; nouveau; Rafael J . Wysocki; > > > > > > > Alex Hung; > > Ben > > > > > > > Skeggs; Dave Airlie > > > > > > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM > > > > > > > _OSI > > string to > > > > > > > enable dGPU direct output" > > > > > > > > > > > > > > On Thu, 15 Aug 2019 at 07:31, Karol Herbst > > wrote: > > > > > > > > > > > > > > > > This reverts commit 28586a51eea666d5531bcaef2f68e4abbd87242c. > > > > > > > > > > > > > > > > The original commit message didn't even make sense. AMD _does_ > > support it and > > > > > > > > it works with Nouveau as well. > > > > > > > > > > > > > > > > Also what was the issue being solved here? No references to any > > > > > > > > bugs > > and not > > > > > > > > even explaining any issue at all isn't the way we do things. > > > > > > > > > > > > > > > > And even if it means a muxed design, then the fix is to make it > > > > > > > > work > > inside the > > > > > > > > driver, not adding some hacky workaround through ACPI tricks. > > > > > > > > > > > > > > > > And what out of tree drivers do or do not support we don't care > > > > > > > > one > > bit anyway. > > > > > > > > > > > > > > > > > > > > > > I think the reverts should be merged via Rafael's tree as the > > > > > > > original > > > > > > > patches went in via there, and we should get them in asap. > > > > > > > > > > > > > > Acked-by: Dave Airlie > > > > > > > Dave. > > > > > > > > > > > > There are definitely going to be regressions on machines in the > > > > > > field with > > the > > > > > > in tree drivers by reverting this. I think we should have an > > > > > > answer for all > > of those > > > > > > before this revert is accepted. > > > > > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with > > > > > > partners to > > collect > > > > > > some information on the impact of reverting this. > > > > > > > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD > > GPU to use > > > > > > "Hybrid Graphics" when on Windows and "Power Express" and > > "Switchable Graphics" > > > > > > when on Linux. > > > > > > > > > > and what's exactly the difference between those? And what's the actual > > > > > issue here? > > > > > > > > Hybrid Graphics is the new "standard" way of handling these laptops. > > > > It uses the standard _PR3 APCI method to handle dGPU power. Support > > > > for this was added to Linux relatively later compared to when the > > > > laptops were launched. "Power Express" used the other AMD specific > > > > ATPX ACPI method to handle dGPU power. The driver supports both so > > > > either method will work. > > > > > > > > Alex > > > > > > > > > > thanks for clarifying. But that still means that we won't need such > > > workarounds for AMD users, right? amdgpu handles hybrid graphics just > > > fine, right? > > > > Yeah it should, assuming you have a new enough kernel which supports > > HG, which has been several years at this point IIRC. > > > > Alex > > > > Can you define how new of a kernel is a new enough kernel? > > Looking on my side these problems were on new hardware (Precision 7740) and > are checked as recently as start of this summer, w/ kernel 4.15. That's not even a long term one. And it shouldn't get any fixes. I just checked, last update was 16 months ago. > > We can arrange to have it checked again on 5.3rcX w/ the OSI disabled. yeah, please do. If there are any issues, we (as in drm developers) are happy to fix the issues inside the drivers. ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
RE: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
> -Original Message- > From: Karol Herbst > Sent: Thursday, August 15, 2019 9:25 AM > To: Limonciello, Mario > Cc: Dave Airlie; LKML; Linux ACPI Mailing List; dri-devel; nouveau; Rafael J . > Wysocki; Alex Hung; Ben Skeggs; David Airlie > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to > enable dGPU direct output" > > > [EXTERNAL EMAIL] > > On Thu, Aug 15, 2019 at 4:20 PM wrote: > > > > > > There are definitely going to be regressions on machines in the field > > > > with > the > > > > in tree drivers by reverting this. I think we should have an answer > > > > for all of > > > those > > > > before this revert is accepted. > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with partners to > > > collect > > > > some information on the impact of reverting this. > > > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD > GPU to > > > use > > > > "Hybrid Graphics" when on Windows and "Power Express" and "Switchable > > > Graphics" > > > > when on Linux. > > > > > > and what's exactly the difference between those? And what's the actual > > > issue here? > > > > DP/HDMI is not detected unless plugged in at bootup. It's due to missing > > HPD > > events. > > > > afaik Lyude was working on fixing all that, at least for some drivers. > If there is something wrong, we still should fix the drivers, not > adding ACPI workarounds. I don't disagree, but timing is frequently a limitation if you want the hardware to work when you put it on the market. The whole idea behind the OSI string was that it could be reverted when the time was right. From this discussion it very well may be for systems with AMD GPU, but it needs to be checked again. > > Alex: do you know if there are remaining issues regarding that with amdgpu? > > > > > > > We already have the PRIME offloading in place and if that's not > > > enough, we should work on extending it, not adding some ACPI based > > > workarounds, because that's exactly how that looks like. > > > > > > Also, was this discussed with anybody involved in the drm subsystem? > > > > > > > > > > > I feel we need a knob and/or DMI detection to affect the changes that > > > > the > ASL > > > > normally performs. > > > > > > Why do we have to do that on a firmware level at all? > > > > Folks from AMD Graphics team recommended this approach. I should clarify this is from the folks on AMD graphics team that interact to OEMs like Dell which are not necessarily the same folks who work on the drivers directly. > From their perspective > > it's not a workaround. They view this as a different architecture for AMD > graphics driver on > > Windows and AMD graphics w/ amdgpu driver. They have different ASL paths > used for > > each. > > @alex: is this true?
RE: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
> On Thu, Aug 15, 2019 at 10:15 AM Karol Herbst wrote: > > > > On Thu, Aug 15, 2019 at 4:13 PM Alex Deucher > wrote: > > > > > > On Thu, Aug 15, 2019 at 10:04 AM Karol Herbst wrote: > > > > > > > > On Thu, Aug 15, 2019 at 3:56 PM wrote: > > > > > > > > > > > -Original Message- > > > > > > From: linux-acpi-ow...@vger.kernel.org ow...@vger.kernel.org> On > > > > > > Behalf Of Dave Airlie > > > > > > Sent: Wednesday, August 14, 2019 5:48 PM > > > > > > To: Karol Herbst > > > > > > Cc: LKML; Linux ACPI; dri-devel; nouveau; Rafael J . Wysocki; Alex > > > > > > Hung; > Ben > > > > > > Skeggs; Dave Airlie > > > > > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI > string to > > > > > > enable dGPU direct output" > > > > > > > > > > > > On Thu, 15 Aug 2019 at 07:31, Karol Herbst > wrote: > > > > > > > > > > > > > > This reverts commit 28586a51eea666d5531bcaef2f68e4abbd87242c. > > > > > > > > > > > > > > The original commit message didn't even make sense. AMD _does_ > support it and > > > > > > > it works with Nouveau as well. > > > > > > > > > > > > > > Also what was the issue being solved here? No references to any > > > > > > > bugs > and not > > > > > > > even explaining any issue at all isn't the way we do things. > > > > > > > > > > > > > > And even if it means a muxed design, then the fix is to make it > > > > > > > work > inside the > > > > > > > driver, not adding some hacky workaround through ACPI tricks. > > > > > > > > > > > > > > And what out of tree drivers do or do not support we don't care > > > > > > > one > bit anyway. > > > > > > > > > > > > > > > > > > > I think the reverts should be merged via Rafael's tree as the > > > > > > original > > > > > > patches went in via there, and we should get them in asap. > > > > > > > > > > > > Acked-by: Dave Airlie > > > > > > Dave. > > > > > > > > > > There are definitely going to be regressions on machines in the field > > > > > with > the > > > > > in tree drivers by reverting this. I think we should have an answer > > > > > for all > of those > > > > > before this revert is accepted. > > > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with partners > > > > > to > collect > > > > > some information on the impact of reverting this. > > > > > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD > GPU to use > > > > > "Hybrid Graphics" when on Windows and "Power Express" and > "Switchable Graphics" > > > > > when on Linux. > > > > > > > > and what's exactly the difference between those? And what's the actual > > > > issue here? > > > > > > Hybrid Graphics is the new "standard" way of handling these laptops. > > > It uses the standard _PR3 APCI method to handle dGPU power. Support > > > for this was added to Linux relatively later compared to when the > > > laptops were launched. "Power Express" used the other AMD specific > > > ATPX ACPI method to handle dGPU power. The driver supports both so > > > either method will work. > > > > > > Alex > > > > > > > thanks for clarifying. But that still means that we won't need such > > workarounds for AMD users, right? amdgpu handles hybrid graphics just > > fine, right? > > Yeah it should, assuming you have a new enough kernel which supports > HG, which has been several years at this point IIRC. > > Alex > Can you define how new of a kernel is a new enough kernel? Looking on my side these problems were on new hardware (Precision 7740) and are checked as recently as start of this summer, w/ kernel 4.15. We can arrange to have it checked again on 5.3rcX w/ the OSI disabled.
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 4:20 PM wrote: > > > > There are definitely going to be regressions on machines in the field > > > with the > > > in tree drivers by reverting this. I think we should have an answer for > > > all of > > those > > > before this revert is accepted. > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with partners to > > collect > > > some information on the impact of reverting this. > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD GPU to > > use > > > "Hybrid Graphics" when on Windows and "Power Express" and "Switchable > > Graphics" > > > when on Linux. > > > > and what's exactly the difference between those? And what's the actual > > issue here? > > DP/HDMI is not detected unless plugged in at bootup. It's due to missing HPD > events. > afaik Lyude was working on fixing all that, at least for some drivers. If there is something wrong, we still should fix the drivers, not adding ACPI workarounds. Alex: do you know if there are remaining issues regarding that with amdgpu? > > > > We already have the PRIME offloading in place and if that's not > > enough, we should work on extending it, not adding some ACPI based > > workarounds, because that's exactly how that looks like. > > > > Also, was this discussed with anybody involved in the drm subsystem? > > > > > > > > I feel we need a knob and/or DMI detection to affect the changes that the > > > ASL > > > normally performs. > > > > Why do we have to do that on a firmware level at all? > > Folks from AMD Graphics team recommended this approach. From their > perspective > it's not a workaround. They view this as a different architecture for AMD > graphics driver on > Windows and AMD graphics w/ amdgpu driver. They have different ASL paths > used for > each. @alex: is this true?
RE: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
> > There are definitely going to be regressions on machines in the field with > > the > > in tree drivers by reverting this. I think we should have an answer for > > all of > those > > before this revert is accepted. > > > > Regarding systems with Intel+NVIDIA, we'll have to work with partners to > collect > > some information on the impact of reverting this. > > > > When this is used on a system with Intel+AMD the ASL configures AMD GPU to > use > > "Hybrid Graphics" when on Windows and "Power Express" and "Switchable > Graphics" > > when on Linux. > > and what's exactly the difference between those? And what's the actual > issue here? DP/HDMI is not detected unless plugged in at bootup. It's due to missing HPD events. > > We already have the PRIME offloading in place and if that's not > enough, we should work on extending it, not adding some ACPI based > workarounds, because that's exactly how that looks like. > > Also, was this discussed with anybody involved in the drm subsystem? > > > > > I feel we need a knob and/or DMI detection to affect the changes that the > > ASL > > normally performs. > > Why do we have to do that on a firmware level at all? Folks from AMD Graphics team recommended this approach. From their perspective it's not a workaround. They view this as a different architecture for AMD graphics driver on Windows and AMD graphics w/ amdgpu driver. They have different ASL paths used for each.
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 10:15 AM Karol Herbst wrote: > > On Thu, Aug 15, 2019 at 4:13 PM Alex Deucher wrote: > > > > On Thu, Aug 15, 2019 at 10:04 AM Karol Herbst wrote: > > > > > > On Thu, Aug 15, 2019 at 3:56 PM wrote: > > > > > > > > > -Original Message- > > > > > From: linux-acpi-ow...@vger.kernel.org > > > > > On > > > > > Behalf Of Dave Airlie > > > > > Sent: Wednesday, August 14, 2019 5:48 PM > > > > > To: Karol Herbst > > > > > Cc: LKML; Linux ACPI; dri-devel; nouveau; Rafael J . Wysocki; Alex > > > > > Hung; Ben > > > > > Skeggs; Dave Airlie > > > > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI > > > > > string to > > > > > enable dGPU direct output" > > > > > > > > > > On Thu, 15 Aug 2019 at 07:31, Karol Herbst wrote: > > > > > > > > > > > > This reverts commit 28586a51eea666d5531bcaef2f68e4abbd87242c. > > > > > > > > > > > > The original commit message didn't even make sense. AMD _does_ > > > > > > support it and > > > > > > it works with Nouveau as well. > > > > > > > > > > > > Also what was the issue being solved here? No references to any > > > > > > bugs and not > > > > > > even explaining any issue at all isn't the way we do things. > > > > > > > > > > > > And even if it means a muxed design, then the fix is to make it > > > > > > work inside the > > > > > > driver, not adding some hacky workaround through ACPI tricks. > > > > > > > > > > > > And what out of tree drivers do or do not support we don't care one > > > > > > bit anyway. > > > > > > > > > > > > > > > > I think the reverts should be merged via Rafael's tree as the original > > > > > patches went in via there, and we should get them in asap. > > > > > > > > > > Acked-by: Dave Airlie > > > > > Dave. > > > > > > > > There are definitely going to be regressions on machines in the field > > > > with the > > > > in tree drivers by reverting this. I think we should have an answer > > > > for all of those > > > > before this revert is accepted. > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with partners > > > > to collect > > > > some information on the impact of reverting this. > > > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD GPU > > > > to use > > > > "Hybrid Graphics" when on Windows and "Power Express" and "Switchable > > > > Graphics" > > > > when on Linux. > > > > > > and what's exactly the difference between those? And what's the actual > > > issue here? > > > > Hybrid Graphics is the new "standard" way of handling these laptops. > > It uses the standard _PR3 APCI method to handle dGPU power. Support > > for this was added to Linux relatively later compared to when the > > laptops were launched. "Power Express" used the other AMD specific > > ATPX ACPI method to handle dGPU power. The driver supports both so > > either method will work. > > > > Alex > > > > thanks for clarifying. But that still means that we won't need such > workarounds for AMD users, right? amdgpu handles hybrid graphics just > fine, right? Yeah it should, assuming you have a new enough kernel which supports HG, which has been several years at this point IIRC. Alex > > > > > > > We already have the PRIME offloading in place and if that's not > > > enough, we should work on extending it, not adding some ACPI based > > > workarounds, because that's exactly how that looks like. > > > > > > Also, was this discussed with anybody involved in the drm subsystem? > > > > > > > > > > > I feel we need a knob and/or DMI detection to affect the changes that > > > > the ASL > > > > normally performs. > > > > > > Why do we have to do that on a firmware level at all? > > > ___ > > > dri-devel mailing list > > > dri-de...@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 4:13 PM Alex Deucher wrote: > > On Thu, Aug 15, 2019 at 10:04 AM Karol Herbst wrote: > > > > On Thu, Aug 15, 2019 at 3:56 PM wrote: > > > > > > > -Original Message- > > > > From: linux-acpi-ow...@vger.kernel.org > > > > On > > > > Behalf Of Dave Airlie > > > > Sent: Wednesday, August 14, 2019 5:48 PM > > > > To: Karol Herbst > > > > Cc: LKML; Linux ACPI; dri-devel; nouveau; Rafael J . Wysocki; Alex > > > > Hung; Ben > > > > Skeggs; Dave Airlie > > > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI > > > > string to > > > > enable dGPU direct output" > > > > > > > > On Thu, 15 Aug 2019 at 07:31, Karol Herbst wrote: > > > > > > > > > > This reverts commit 28586a51eea666d5531bcaef2f68e4abbd87242c. > > > > > > > > > > The original commit message didn't even make sense. AMD _does_ > > > > > support it and > > > > > it works with Nouveau as well. > > > > > > > > > > Also what was the issue being solved here? No references to any bugs > > > > > and not > > > > > even explaining any issue at all isn't the way we do things. > > > > > > > > > > And even if it means a muxed design, then the fix is to make it work > > > > > inside the > > > > > driver, not adding some hacky workaround through ACPI tricks. > > > > > > > > > > And what out of tree drivers do or do not support we don't care one > > > > > bit anyway. > > > > > > > > > > > > > I think the reverts should be merged via Rafael's tree as the original > > > > patches went in via there, and we should get them in asap. > > > > > > > > Acked-by: Dave Airlie > > > > Dave. > > > > > > There are definitely going to be regressions on machines in the field > > > with the > > > in tree drivers by reverting this. I think we should have an answer for > > > all of those > > > before this revert is accepted. > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with partners to > > > collect > > > some information on the impact of reverting this. > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD GPU > > > to use > > > "Hybrid Graphics" when on Windows and "Power Express" and "Switchable > > > Graphics" > > > when on Linux. > > > > and what's exactly the difference between those? And what's the actual > > issue here? > > Hybrid Graphics is the new "standard" way of handling these laptops. > It uses the standard _PR3 APCI method to handle dGPU power. Support > for this was added to Linux relatively later compared to when the > laptops were launched. "Power Express" used the other AMD specific > ATPX ACPI method to handle dGPU power. The driver supports both so > either method will work. > > Alex > thanks for clarifying. But that still means that we won't need such workarounds for AMD users, right? amdgpu handles hybrid graphics just fine, right? > > > > We already have the PRIME offloading in place and if that's not > > enough, we should work on extending it, not adding some ACPI based > > workarounds, because that's exactly how that looks like. > > > > Also, was this discussed with anybody involved in the drm subsystem? > > > > > > > > I feel we need a knob and/or DMI detection to affect the changes that the > > > ASL > > > normally performs. > > > > Why do we have to do that on a firmware level at all? > > ___ > > dri-devel mailing list > > dri-de...@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 10:04 AM Karol Herbst wrote: > > On Thu, Aug 15, 2019 at 3:56 PM wrote: > > > > > -Original Message- > > > From: linux-acpi-ow...@vger.kernel.org > > > On > > > Behalf Of Dave Airlie > > > Sent: Wednesday, August 14, 2019 5:48 PM > > > To: Karol Herbst > > > Cc: LKML; Linux ACPI; dri-devel; nouveau; Rafael J . Wysocki; Alex Hung; > > > Ben > > > Skeggs; Dave Airlie > > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI > > > string to > > > enable dGPU direct output" > > > > > > On Thu, 15 Aug 2019 at 07:31, Karol Herbst wrote: > > > > > > > > This reverts commit 28586a51eea666d5531bcaef2f68e4abbd87242c. > > > > > > > > The original commit message didn't even make sense. AMD _does_ support > > > > it and > > > > it works with Nouveau as well. > > > > > > > > Also what was the issue being solved here? No references to any bugs > > > > and not > > > > even explaining any issue at all isn't the way we do things. > > > > > > > > And even if it means a muxed design, then the fix is to make it work > > > > inside the > > > > driver, not adding some hacky workaround through ACPI tricks. > > > > > > > > And what out of tree drivers do or do not support we don't care one bit > > > > anyway. > > > > > > > > > > I think the reverts should be merged via Rafael's tree as the original > > > patches went in via there, and we should get them in asap. > > > > > > Acked-by: Dave Airlie > > > Dave. > > > > There are definitely going to be regressions on machines in the field with > > the > > in tree drivers by reverting this. I think we should have an answer for > > all of those > > before this revert is accepted. > > > > Regarding systems with Intel+NVIDIA, we'll have to work with partners to > > collect > > some information on the impact of reverting this. > > > > When this is used on a system with Intel+AMD the ASL configures AMD GPU to > > use > > "Hybrid Graphics" when on Windows and "Power Express" and "Switchable > > Graphics" > > when on Linux. > > and what's exactly the difference between those? And what's the actual > issue here? Hybrid Graphics is the new "standard" way of handling these laptops. It uses the standard _PR3 APCI method to handle dGPU power. Support for this was added to Linux relatively later compared to when the laptops were launched. "Power Express" used the other AMD specific ATPX ACPI method to handle dGPU power. The driver supports both so either method will work. Alex > > We already have the PRIME offloading in place and if that's not > enough, we should work on extending it, not adding some ACPI based > workarounds, because that's exactly how that looks like. > > Also, was this discussed with anybody involved in the drm subsystem? > > > > > I feel we need a knob and/or DMI detection to affect the changes that the > > ASL > > normally performs. > > Why do we have to do that on a firmware level at all? > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 12:47 AM Dave Airlie wrote: > > On Thu, 15 Aug 2019 at 07:31, Karol Herbst wrote: > > > > This reverts commit 28586a51eea666d5531bcaef2f68e4abbd87242c. > > > > The original commit message didn't even make sense. AMD _does_ support it > > and > > it works with Nouveau as well. > > > > Also what was the issue being solved here? No references to any bugs and not > > even explaining any issue at all isn't the way we do things. > > > > And even if it means a muxed design, then the fix is to make it work inside > > the > > driver, not adding some hacky workaround through ACPI tricks. > > > > And what out of tree drivers do or do not support we don't care one bit > > anyway. > > > > I think the reverts should be merged via Rafael's tree as the original > patches went in via there, and we should get them in asap. +1 > Acked-by: Dave Airlie Acked-by: Daniel Vetter Also fully agreeing with Karol's reply further down, if this doesn't work we need to improve the drivers, not pile stuff on top in some ACPI hacks. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 3:56 PM wrote: > > > -Original Message- > > From: linux-acpi-ow...@vger.kernel.org On > > Behalf Of Dave Airlie > > Sent: Wednesday, August 14, 2019 5:48 PM > > To: Karol Herbst > > Cc: LKML; Linux ACPI; dri-devel; nouveau; Rafael J . Wysocki; Alex Hung; Ben > > Skeggs; Dave Airlie > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string > > to > > enable dGPU direct output" > > > > On Thu, 15 Aug 2019 at 07:31, Karol Herbst wrote: > > > > > > This reverts commit 28586a51eea666d5531bcaef2f68e4abbd87242c. > > > > > > The original commit message didn't even make sense. AMD _does_ support it > > > and > > > it works with Nouveau as well. > > > > > > Also what was the issue being solved here? No references to any bugs and > > > not > > > even explaining any issue at all isn't the way we do things. > > > > > > And even if it means a muxed design, then the fix is to make it work > > > inside the > > > driver, not adding some hacky workaround through ACPI tricks. > > > > > > And what out of tree drivers do or do not support we don't care one bit > > > anyway. > > > > > > > I think the reverts should be merged via Rafael's tree as the original > > patches went in via there, and we should get them in asap. > > > > Acked-by: Dave Airlie > > Dave. > > There are definitely going to be regressions on machines in the field with the > in tree drivers by reverting this. I think we should have an answer for all > of those > before this revert is accepted. > > Regarding systems with Intel+NVIDIA, we'll have to work with partners to > collect > some information on the impact of reverting this. > > When this is used on a system with Intel+AMD the ASL configures AMD GPU to use > "Hybrid Graphics" when on Windows and "Power Express" and "Switchable > Graphics" > when on Linux. and what's exactly the difference between those? And what's the actual issue here? We already have the PRIME offloading in place and if that's not enough, we should work on extending it, not adding some ACPI based workarounds, because that's exactly how that looks like. Also, was this discussed with anybody involved in the drm subsystem? > > I feel we need a knob and/or DMI detection to affect the changes that the ASL > normally performs. Why do we have to do that on a firmware level at all? ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
RE: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
> -Original Message- > From: linux-acpi-ow...@vger.kernel.org On > Behalf Of Dave Airlie > Sent: Wednesday, August 14, 2019 5:48 PM > To: Karol Herbst > Cc: LKML; Linux ACPI; dri-devel; nouveau; Rafael J . Wysocki; Alex Hung; Ben > Skeggs; Dave Airlie > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to > enable dGPU direct output" > > On Thu, 15 Aug 2019 at 07:31, Karol Herbst wrote: > > > > This reverts commit 28586a51eea666d5531bcaef2f68e4abbd87242c. > > > > The original commit message didn't even make sense. AMD _does_ support it > > and > > it works with Nouveau as well. > > > > Also what was the issue being solved here? No references to any bugs and not > > even explaining any issue at all isn't the way we do things. > > > > And even if it means a muxed design, then the fix is to make it work inside > > the > > driver, not adding some hacky workaround through ACPI tricks. > > > > And what out of tree drivers do or do not support we don't care one bit > > anyway. > > > > I think the reverts should be merged via Rafael's tree as the original > patches went in via there, and we should get them in asap. > > Acked-by: Dave Airlie > Dave. There are definitely going to be regressions on machines in the field with the in tree drivers by reverting this. I think we should have an answer for all of those before this revert is accepted. Regarding systems with Intel+NVIDIA, we'll have to work with partners to collect some information on the impact of reverting this. When this is used on a system with Intel+AMD the ASL configures AMD GPU to use "Hybrid Graphics" when on Windows and "Power Express" and "Switchable Graphics" when on Linux. I feel we need a knob and/or DMI detection to affect the changes that the ASL normally performs.
Re: DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?
On 15/08/2019 14:35, Christoph Hellwig wrote: On Wed, Aug 14, 2019 at 07:49:27PM +0200, Daniel Vetter wrote: On Wed, Aug 14, 2019 at 04:50:33PM +0200, Corentin Labbe wrote: Hello Since lot of release (at least since 4.19), I hit the following error message: DMA-API: cacheline tracking ENOMEM, dma-debug disabled After hitting that, I try to check who is creating so many DMA mapping and see: cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c 6 ahci 257 e1000e 6 ehci-pci 5891 nouveau 24 uhci_hcd Does nouveau having this high number of DMA mapping is normal ? Yeah seems perfectly fine for a gpu. That is a lot and apparently overwhelm the dma-debug tracking. Robin rewrote this code in Linux 4.21 to work a little better, so I'm curious why this might have changes in 4.19, as dma-debug did not change at all there. FWIW, the cacheline tracking entries are a separate thing from the dma-debug entries that I rejigged - judging by those numbers there should still be plenty of free dma-debug entries, but for some reason it has failed to extend the radix tree :/ Robin.
Re: [Nouveau] DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?
On Wed, Aug 14, 2019 at 07:49:27PM +0200, Daniel Vetter wrote: > On Wed, Aug 14, 2019 at 04:50:33PM +0200, Corentin Labbe wrote: > > Hello > > > > Since lot of release (at least since 4.19), I hit the following error > > message: > > DMA-API: cacheline tracking ENOMEM, dma-debug disabled > > > > After hitting that, I try to check who is creating so many DMA mapping and > > see: > > cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c > > 6 ahci > > 257 e1000e > > 6 ehci-pci > >5891 nouveau > > 24 uhci_hcd > > > > Does nouveau having this high number of DMA mapping is normal ? > > Yeah seems perfectly fine for a gpu. That is a lot and apparently overwhelm the dma-debug tracking. Robin rewrote this code in Linux 4.21 to work a little better, so I'm curious why this might have changes in 4.19, as dma-debug did not change at all there. ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] turn hmm migrate_vma upside down v3
On Wed, Aug 14, 2019 at 05:09:54PM -0700, Ralph Campbell wrote: > Some of the patches seem to have been mangled in the mail. Weird, I never had such a an issue with git-send-email. But to be covered for such weird cases I also posted a git url for exactly the tree I've been working on. > I was able to edit them and apply to Jason's tree > https://github.com/jgunthorpe/linux.git mmu_notifier branch. > So for the series you can add: > > Tested-by: Ralph Campbell Thanks! ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH] volt: Fix for some cards having 0 maximum voltage
although I'd like to have a deeper understanding of this table, I think it's fine as it is right now and this actually does fix something without causing any issues (as far as we know of) Reviewed-by: Karol Herbst On Fri, Aug 2, 2019 at 11:21 AM Mark Menzynski wrote: > > Some, mostly Fermi, vbioses appear to have zero max voltage. That causes > Nouveau to not parse voltage entries, thus users not being able to set higher > clocks. > > When changing this value Nvidia driver still appeared to ignore it, and I > wasn't able to find out why, thus the code is ignoring the value if it is > zero. > > CC: Maarten Lankhorst > Signed-off-by: Mark Menzynski > --- > drm/nouveau/nvkm/subdev/bios/volt.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drm/nouveau/nvkm/subdev/bios/volt.c > b/drm/nouveau/nvkm/subdev/bios/volt.c > index 7143ea46..33a9fb5a 100644 > --- a/drm/nouveau/nvkm/subdev/bios/volt.c > +++ b/drm/nouveau/nvkm/subdev/bios/volt.c > @@ -96,6 +96,8 @@ nvbios_volt_parse(struct nvkm_bios *bios, u8 *ver, u8 *hdr, > u8 *cnt, u8 *len, > info->min = min(info->base, > info->base + info->step * info->vidmask); > info->max = nvbios_rd32(bios, volt + 0x0e); > + if (!info->max) > + info->max = max(info->base, info->base + info->step * > info->vidmask); > break; > case 0x50: > info->min = nvbios_rd32(bios, volt + 0x0a); > -- > 2.21.0 > > ___ > Nouveau mailing list > Nouveau@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau