Re: [PATCH v5 3/7] PCI: Drop the `is_thunderbolt` attribute from PCI core
Hi, On Mon, Feb 28, 2022 at 10:36:59PM +, Limonciello, Mario wrote: > [AMD Official Use Only] > > > -Original Message- > > From: Lukas Wunner > > Sent: Monday, February 28, 2022 16:33 > > To: Bjorn Helgaas > > Cc: Limonciello, Mario ; Mika Westerberg > > ; Michael Jamet > > ; open list:PCI SUBSYSTEM > p...@vger.kernel.org>; open list:THUNDERBOLT DRIVER > u...@vger.kernel.org>; Yehezkel Bernat ; open > > list:DRM DRIVERS ; open list:X86 > > PLATFORM DRIVERS ; Andreas > > Noever ; open list:RADEON and AMDGPU > > DRM DRIVERS ; open list:DRM DRIVER FOR > > NVIDIA GEFORCE/QUADRO GPUS ; Bjorn > > Helgaas ; Deucher, Alexander > > > > Subject: Re: [PATCH v5 3/7] PCI: Drop the `is_thunderbolt` attribute from > > PCI > > core > > > > On Mon, Feb 28, 2022 at 04:13:44PM -0600, Bjorn Helgaas wrote: > > > On Mon, Feb 28, 2022 at 03:33:13PM +, Limonciello, Mario wrote: > > > > > On Fri, Feb 25, 2022 at 11:42:24AM -0600, Bjorn Helgaas wrote: > > > > > > That would just leave the "PCI_VSEC_ID_INTEL_TBT implies external- > > > > > facing" > > > > > > assumption above. Not having a Thunderbolt spec, I have no idea > > how > > > > > > you deal with that. > > > > > > > > > > You can download the spec here: > > [...] > > > > > Inside the archive there is also the DVSEC spec with name "USB4 DVSEC > > > > > Version 1.0.pdf". > > > > > > > > The spec has Host_Router_indication (bits 18-19) as meaning external > > facing. > > > > I'll respin the patch 3 for using that. > > > > > > Thanks, please include the spec citation when you do. And probably > > > the URL, because it's not at all obvious how the casual reader would > > > get from "is_thunderbolt" to a recent add-on to the USB4 spec. > > > > PCI_VSEC_ID_INTEL_TBT is not mentioned at all in the USB4 spec, > > hence there's no connection between "is_thunderbolt" and the USB4 spec. > > > > It's a proprietary VSEC used by Intel and the only way to recognize > > pre-USB4 Thunderbolt devices that I know of. Its ID is also > > different from the DVSEC IDs given in the above-mentioned spec. > > > > Thanks, > > The USB4 DVSEC spec makes comments about DVSEC_ID of 0x8086 and also > DVSEC VENDOR_ID of 0x8086. Is that not also present on the Intel TBT3 > controllers? > > My interpretation of this (and Mika's comment) was that rather than > looking at the Intel VSEC we should look at the USB4 DVSEC to detect > the Intel TBT3 controllers. For pre-USB4 controllers (TBT 1-3) we need to use the existing method (or a quirk based on device ID) as they don't have the USB4 DVSEC.
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Mon, Feb 28, 2022 at 4:45 PM Linus Torvalds wrote: > > Yeah, except that's ugly beyond belief, plus it's literally not what > we do in the kernel. (Of course, I probably shouldn't have used 'min()' as an example, because that is actually one of the few places where we do exactly that, using our __UNIQUE_ID() macros. Exactly because people _have_ tried to do -Wshadow when doing W=2). Linus
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Mon, Feb 28, 2022 at 12:37:15PM -0800, Linus Torvalds wrote: > On Mon, Feb 28, 2022 at 12:16 PM Matthew Wilcox wrote: > > > > Then we can never use -Wshadow ;-( I'd love to be able to turn it on; > > it catches real bugs. > > Oh, we already can never use -Wshadow regardless of things like this. > That bridge hasn't just been burned, it never existed in the first > place. > > The whole '-Wshadow' thing simply cannot work with local variables in > macros - something that we've used since day 1. > > Try this (as a "p.c" file): > > #define min(a,b) ({ \ > typeof(a) __a = (a);\ > typeof(b) __b = (b);\ > __a < __b ? __a : __b; }) > > int min3(int a, int b, int c) > { > return min(a,min(b,c)); > } > > and now do "gcc -O2 -S t.c". > > Then try it with -Wshadow. #define ___PASTE(a, b) a##b #define __PASTE(a, b) ___PASTE(a, b) #define _min(a, b, u) ({ \ typeof(a) __PASTE(__a,u) = (a);\ typeof(b) __PASTE(__b,u) = (b);\ __PASTE(__a,u) < __PASTE(__b,u) ? __PASTE(__a,u) : __PASTE(__b,u); }) #define min(a, b) _min(a, b, __COUNTER__) int min3(int a, int b, int c) { return min(a,min(b,c)); } (probably there's a more elegant way to do this)
Re: [PATCH] mm: split vm_normal_pages for LRU and non-LRU handling
On 28.02.22 21:34, Alex Sierra wrote: > DEVICE_COHERENT pages introduce a subtle distinction in the way > "normal" pages can be used by various callers throughout the kernel. > They behave like normal pages for purposes of mapping in CPU page > tables, and for COW. But they do not support LRU lists, NUMA > migration or THP. Therefore we split vm_normal_page into two > functions vm_normal_any_page and vm_normal_lru_page. The latter will > only return pages that can be put on an LRU list and that support > NUMA migration and THP. Why not s/vm_normal_any_page/vm_normal_page/ and avoid code churn? > > We also introduced a FOLL_LRU flag that adds the same behaviour to > follow_page and related APIs, to allow callers to specify that they > expect to put pages on an LRU list. [...] > -#define FOLL_WRITE 0x01/* check pte is writable */ > -#define FOLL_TOUCH 0x02/* mark page accessed */ > -#define FOLL_GET 0x04/* do get_page on page */ > -#define FOLL_DUMP0x08/* give error on hole if it would be zero */ > -#define FOLL_FORCE 0x10/* get_user_pages read/write w/o permission */ > -#define FOLL_NOWAIT 0x20/* if a disk transfer is needed, start the IO > - * and return without waiting upon it */ > -#define FOLL_POPULATE0x40/* fault in pages (with FOLL_MLOCK) */ > -#define FOLL_NOFAULT 0x80/* do not fault in pages */ > -#define FOLL_HWPOISON0x100 /* check page is hwpoisoned */ > -#define FOLL_NUMA0x200 /* force NUMA hinting page fault */ > -#define FOLL_MIGRATION 0x400 /* wait for page to replace migration > entry */ > -#define FOLL_TRIED 0x800 /* a retry, previous pass started an IO */ > -#define FOLL_MLOCK 0x1000 /* lock present pages */ > -#define FOLL_REMOTE 0x2000 /* we are working on non-current tsk/mm */ > -#define FOLL_COW 0x4000 /* internal GUP flag */ > -#define FOLL_ANON0x8000 /* don't do file mappings */ > -#define FOLL_LONGTERM0x1 /* mapping lifetime is indefinite: see > below */ > -#define FOLL_SPLIT_PMD 0x2 /* split huge pmd before returning */ > -#define FOLL_PIN 0x4 /* pages must be released via unpin_user_page */ > -#define FOLL_FAST_ONLY 0x8 /* gup_fast: prevent fall-back to slow > gup */ > +#define FOLL_WRITE 0x01 /* check pte is writable */ > +#define FOLL_TOUCH 0x02 /* mark page accessed */ > +#define FOLL_GET 0x04 /* do get_page on page */ > +#define FOLL_DUMP0x08 /* give error on hole if it would be zero */ > +#define FOLL_FORCE 0x10 /* get_user_pages read/write w/o permission */ > +#define FOLL_NOWAIT 0x20 /* if a disk transfer is needed, start the IO > + * and return without waiting upon it */ > +#define FOLL_POPULATE0x40 /* fault in pages (with FOLL_MLOCK) */ > +#define FOLL_NOFAULT 0x80 /* do not fault in pages */ > +#define FOLL_HWPOISON0x100/* check page is hwpoisoned */ > +#define FOLL_NUMA0x200/* force NUMA hinting page fault */ > +#define FOLL_MIGRATION 0x400/* wait for page to replace migration > entry */ > +#define FOLL_TRIED 0x800/* a retry, previous pass started an IO */ > +#define FOLL_MLOCK 0x1000 /* lock present pages */ > +#define FOLL_REMOTE 0x2000 /* we are working on non-current tsk/mm */ > +#define FOLL_COW 0x4000 /* internal GUP flag */ > +#define FOLL_ANON0x8000 /* don't do file mappings */ > +#define FOLL_LONGTERM0x1 /* mapping lifetime is indefinite: see > below */ > +#define FOLL_SPLIT_PMD 0x2 /* split huge pmd before returning */ > +#define FOLL_PIN 0x4 /* pages must be released via unpin_user_page > */ > +#define FOLL_FAST_ONLY 0x8 /* gup_fast: prevent fall-back to slow > gup */ > +#define FOLL_LRU 0x10 /* return only LRU (anon or page cache) */ > Can we minimize code churn, please? > if (PageReserved(page)) > diff --git a/mm/migrate.c b/mm/migrate.c > index c31d04b46a5e..17d049311b78 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -1614,7 +1614,7 @@ static int add_page_for_migration(struct mm_struct *mm, > unsigned long addr, > goto out; > > /* FOLL_DUMP to ignore special (like zero) pages */ > - follflags = FOLL_GET | FOLL_DUMP; > + follflags = FOLL_GET | FOLL_DUMP | FOLL_LRU; > page = follow_page(vma, addr, follflags); Why wouldn't we want to dump DEVICE_COHERENT pages? This looks wrong. -- Thanks, David / dhildenb
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Mon, Feb 28, 2022 at 1:47 PM Jakob Koschel wrote: > > The goal of this is to get compiler warnings right? This would indeed be > great. Yes, so I don't mind having a one-time patch that has been gathered using some automated checker tool, but I don't think that works from a long-term maintenance perspective. So if we have the basic rule being "don't use the loop iterator after the loop has finished, because it can cause all kinds of subtle issues", then in _addition_ to fixing the existing code paths that have this issue, I really would want to (a) get a compiler warning for future cases and (b) make it not actually _work_ for future cases. Because otherwise it will just happen again. > Changing the list_for_each_entry() macro first will break all of those cases > (e.g. the ones using 'list_entry_is_head()). So I have no problems with breaking cases that we basically already have a patch for due to your automated tool. There were certainly more than a handful, but it didn't look _too_ bad to just make the rule be "don't use the iterator after the loop". Of course, that's just based on that patch of yours. Maybe there are a ton of other cases that your patch didn't change, because they didn't match your trigger case, so I may just be overly optimistic here. But basically to _me_, the important part is that the end result is maintainable longer-term. I'm more than happy to have a one-time patch to fix a lot of dubious cases if we can then have clean rules going forward. > I assumed it is better to fix those cases first and then have a simple > coccinelle script changing the macro + moving the iterator into the scope > of the macro. So that had been another plan of mine, until I actually looked at changing the macro. In the one case I looked at, it was ugly beyond belief. It turns out that just syntactically, it's really nice to give the type of the iterator from outside the way we do now. Yeah, it may be a bit odd, and maybe it's partly because I'm so used to the "list_for_each_list_entry()" syntax, but moving the type into the loop construct really made it nasty - either one very complex line, or having to split it over two lines which was even worse. Maybe the place I looked at just happened to have a long typename, but it's basically always going to be a struct, so it's never a _simple_ type. And it just looked very odd adn unnatural to have the type as one of the "arguments" to that list_for_each_entry() macro. So yes, initially my idea had been to just move the iterator entirely inside the macro. But specifying the type got so ugly that I think that typeof (pos) pos trick inside the macro really ends up giving us the best of all worlds: (a) let's us keep the existing syntax and code for all the nice cases that did everything inside the loop anyway (b) gives us a nice warning for any normal use-after-loop case (unless you explicitly initialized it like that sgx_mmu_notifier_release() function did for no good reason (c) also guarantees that even if you don't get a warning, non-converted (or newly written) bad code won't actually _work_ so you end up getting the new rules without any ambiguity or mistaken > With this you are no longer able to set the 'outer' pos within the list > iterator loop body or am I missing something? Correct. Any assignment inside the loop will be entirely just to the local loop case. So any "break;" out of the loop will have to set another variable - like your updated patch did. > I fail to see how this will make most of the changes in this > patch obsolete (if that was the intention). I hope my explanation above clarifies my thinking: I do not dislike your patch, and in fact your patch is indeed required to make the new semantics work. What I disliked was always the maintainability of your patch - making the rules be something that isn't actually visible in the source code, and letting the old semantics still work as well as they ever did, and having to basically run some verification pass to find bad users. (I also disliked your original patch that mixed up the "CPU speculation type safety" with the actual non-speculative problems, but that was another issue). Linus
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Mon, Feb 28, 2022 at 3:26 PM Matthew Wilcox wrote: > > #define ___PASTE(a, b) a##b > #define __PASTE(a, b) ___PASTE(a, b) > #define _min(a, b, u) ({ \ Yeah, except that's ugly beyond belief, plus it's literally not what we do in the kernel. Really. The "-Wshadow doesn't work on the kernel" is not some new issue, because you have to do completely insane things to the source code to enable it. Just compare your uglier-than-sin version to my straightforward one. One does the usual and obvious "use a private variable to avoid the classic multi-use of a macro argument". And the other one is an abomination. Linus
RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
From: Matthew Wilcox > Sent: 28 February 2022 20:16 > > On Mon, Feb 28, 2022 at 12:10:24PM -0800, Linus Torvalds wrote: > > We can do > > > > typeof(pos) pos > > > > in the 'for ()' loop, and never use __iter at all. > > > > That means that inside the for-loop, we use a _different_ 'pos' than > > outside. > > Then we can never use -Wshadow ;-( I'd love to be able to turn it on; > it catches real bugs. > > > +#define list_for_each_entry(pos, head, member) > > \ > > + for (typeof(pos) pos = list_first_entry(head, typeof(*pos), member); > > \ > > +!list_entry_is_head(pos, head, member);\ > > pos = list_next_entry(pos, member)) Actually can't you use 'pos' to temporarily hold the address of 'member'. Something like: for (pos = (void *)head; \ pos ? ((pos = (void *)pos - offsetof(member)), 1) : 0; \ pos = (void *)pos->next) So that 'pos' is NULL if the loop terminates. No pointers outside structures are generated. Probably need to kill list_entry_is_head() - or it just checks for NULL. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Mon, Feb 28, 2022 at 05:28:58PM -0500, James Bottomley wrote: > On Mon, 2022-02-28 at 23:59 +0200, Mike Rapoport wrote: > > > > On February 28, 2022 10:42:53 PM GMT+02:00, James Bottomley < > > james.bottom...@hansenpartnership.com> wrote: > > > On Mon, 2022-02-28 at 21:07 +0100, Christian König wrote: > [...] > > > > > I do wish we could actually poison the 'pos' value after the > > > > > loop somehow - but clearly the "might be uninitialized" I was > > > > > hoping for isn't the way to do it. > > > > > > > > > > Anybody have any ideas? > > > > > > > > I think we should look at the use cases why code is touching > > > > (pos) after the loop. > > > > > > > > Just from skimming over the patches to change this and experience > > > > with the drivers/subsystems I help to maintain I think the > > > > primary pattern looks something like this: > > > > > > > > list_for_each_entry(entry, head, member) { > > > > if (some_condition_checking(entry)) > > > > break; > > > > } > > > > do_something_with(entry); > > > > > > Actually, we usually have a check to see if the loop found > > > anything, but in that case it should something like > > > > > > if (list_entry_is_head(entry, head, member)) { > > >return with error; > > > } > > > do_somethin_with(entry); > > > > > > Suffice? The list_entry_is_head() macro is designed to cope with > > > the bogus entry on head problem. > > > > Won't suffice because the end goal of this work is to limit scope of > > entry only to loop. Hence the need for additional variable. > > Well, yes, but my objection is more to the size of churn than the > desire to do loop local. I'm not even sure loop local is possible, > because it's always annoyed me that for (int i = 0; ... in C++ defines > i in the outer scope not the loop scope, which is why I never use it. In C its scope is the rest of the declaration and the entire loop, not anything after it. This was the same in C++98 already, btw (but in pre-standard versions of C++ things were like you remember, yes, and it was painful). Segher
RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
From: Linus Torvalds > Sent: 28 February 2022 19:56 > On Mon, Feb 28, 2022 at 4:19 AM Christian König > wrote: > > > > I don't think that using the extra variable makes the code in any way > > more reliable or easier to read. > > So I think the next step is to do the attached patch (which requires > that "-std=gnu11" that was discussed in the original thread). > > That will guarantee that the 'pos' parameter of list_for_each_entry() > is only updated INSIDE the for_each_list_entry() loop, and can never > point to the (wrongly typed) head entry. > > And I would actually hope that it should actually cause compiler > warnings about possibly uninitialized variables if people then use the > 'pos' pointer outside the loop. Except > > (a) that code in sgx/encl.c currently initializes 'tmp' to NULL for > inexplicable reasons - possibly because it already expected this > behavior > > (b) when I remove that NULL initializer, I still don't get a warning, > because we've disabled -Wno-maybe-uninitialized since it results in so > many false positives. > > Oh well. > > Anyway, give this patch a look, and at least if it's expanded to do > "(pos) = NULL" in the entry statement for the for-loop, it will avoid > the HEAD type confusion that Jakob is working on. And I think in a > cleaner way than the horrid games he plays. > > (But it won't avoid possible CPU speculation of such type confusion. > That, in my opinion, is a completely different issue) > > I do wish we could actually poison the 'pos' value after the loop > somehow - but clearly the "might be uninitialized" I was hoping for > isn't the way to do it. > > Anybody have any ideas? > > Linus diff --git a/include/linux/list.h b/include/linux/list.h index dd6c2041d09c..bab995596aaa 100644 --- a/include/linux/list.h +++ b/include/linux/list.h @@ -634,10 +634,10 @@ static inline void list_splice_tail_init(struct list_head *list, * @head: the head for your list. * @member:the name of the list_head within the struct. */ -#define list_for_each_entry(pos, head, member) \ - for (pos = list_first_entry(head, typeof(*pos), member);\ -!list_entry_is_head(pos, head, member);\ -pos = list_next_entry(pos, member)) +#define list_for_each_entry(pos, head, member) \ + for (typeof(pos) __iter = list_first_entry(head, typeof(*pos), member); \ +!list_entry_is_head(__iter, head, member) && (((pos)=__iter),1); \ +__iter = list_next_entry(__iter, member)) /** * list_for_each_entry_reverse - iterate backwards over list of given type. I think you actually want: !list_entry_is_head(__iter, head, member) ? (((pos)=__iter),1) : (((pos) = NULL),0); Which can be done in the original by: !list_entry_is_head(pos, head, member) ? 1 : (((pos) = NULL), 0); Although it would be safer to have (without looking up the actual name): for (item *__item = head; \ __item ? (((pos) = list_item(__item, member)), 1) : (((pos) = NULL), 0); __item = (pos)->member) The local does need 'fixing' to avoid shadowing for nested loops. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Mon, Feb 28, 2022 at 4:38 PM Segher Boessenkool wrote: > > In C its scope is the rest of the declaration and the entire loop, not > anything after it. This was the same in C++98 already, btw (but in > pre-standard versions of C++ things were like you remember, yes, and it > was painful). Yeah, the original C++ model was just unadulterated garbage, with no excuse for it, and the scope was not the loop, but the block the loop existed in. That would never have been acceptable for the kernel - it's basically just an even uglier version of "put variable declarations in the middle of code" (and we use "-Wdeclaration-after-statement" to disallow that for kernel code, although apparently some of our user space tooling code doesn't enforce or follow that rule). The actual C99 version is the sane one which actually makes it easier and clearer to have loop iterators that are clearly just in loop scope. That's a good idea in general, and I have wanted to start using that in the kernel even aside from some of the loop construct macros. Because putting variables in natural minimal scope is a GoodThing(tm). Of course, we shouldn't go crazy with it. Even after we do that -std=gnu11 thing, we'll have backports to worry about. And it's not clear that we necessarily want to backport that gnu11 thing - since people who run old stable kernels also may be still using those really old compilers... Linus
[PATCH] drm/amdgpu: Move common initialization operations of each ras block to one function
Define amdgpu_ras_sw_init function to initialize all ras blocks. Signed-off-by: yipechai --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 + drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c| 2 - drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c| 143 - drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h| 1 + drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 21 --- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 16 --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 28 drivers/gpu/drm/amd/amdgpu/mca_v3_0.c | 6 - drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 17 --- 9 files changed, 148 insertions(+), 92 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 6113ddc765a7..72550e9f6058 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2402,6 +2402,12 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev) } } + r = amdgpu_ras_sw_init(adev); + if (r) { + DRM_ERROR("amdgpu_ras_early_init failed (%d).\n", r); + goto init_failed; + } + if (amdgpu_sriov_vf(adev)) amdgpu_virt_init_data_exchange(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index ab75e189bc0b..544241f357b2 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -440,8 +440,6 @@ int amdgpu_gmc_ras_early_init(struct amdgpu_device *adev) { if (!adev->gmc.xgmi.connected_to_cpu) { adev->gmc.xgmi.ras = &xgmi_ras; - amdgpu_ras_register_ras_block(adev, &adev->gmc.xgmi.ras->ras_block); - adev->gmc.xgmi.ras_if = &adev->gmc.xgmi.ras->ras_block.ras_comm; } return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index d3875618ebf5..89075ab9e82e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -2299,8 +2299,6 @@ int amdgpu_ras_init(struct amdgpu_device *adev) case CHIP_ALDEBARAN: if (!adev->gmc.xgmi.connected_to_cpu) { adev->nbio.ras = &nbio_v7_4_ras; - amdgpu_ras_register_ras_block(adev, &adev->nbio.ras->ras_block); - adev->nbio.ras_if = &adev->nbio.ras->ras_block.ras_comm; } break; default: @@ -2533,6 +2531,147 @@ void amdgpu_ras_suspend(struct amdgpu_device *adev) amdgpu_ras_disable_all_features(adev, 1); } +int amdgpu_ras_sw_init(struct amdgpu_device *adev) +{ + int err = 0; + + if (!amdgpu_ras_asic_supported(adev)) + return 0; + + if (adev->nbio.ras) { + err = amdgpu_ras_register_ras_block(adev, &adev->nbio.ras->ras_block); + if (err) { + dev_err(adev->dev, "Failed to register nbio ras block!\n"); + return err; + } + adev->nbio.ras_if = &adev->nbio.ras->ras_block.ras_comm; + } + + if (adev->gmc.xgmi.ras) { + err = amdgpu_ras_register_ras_block(adev, &adev->gmc.xgmi.ras->ras_block); + if (err) { + dev_err(adev->dev, "Failed to register xgmi ras block!\n"); + return err; + } + adev->gmc.xgmi.ras_if = &adev->gmc.xgmi.ras->ras_block.ras_comm; + } + + if (adev->gfx.ras) { + err = amdgpu_ras_register_ras_block(adev, &adev->gfx.ras->ras_block); + if (err) { + dev_err(adev->dev, "Failed to register gfx ras block!\n"); + return err; + } + + strcpy(adev->gfx.ras->ras_block.ras_comm.name, "gfx"); + adev->gfx.ras->ras_block.ras_comm.block = AMDGPU_RAS_BLOCK__GFX; + adev->gfx.ras->ras_block.ras_comm.type = AMDGPU_RAS_ERROR__MULTI_UNCORRECTABLE; + adev->gfx.ras_if = &adev->gfx.ras->ras_block.ras_comm; + + /* If not define special ras_late_init function, use gfx default ras_late_init */ + if (!adev->gfx.ras->ras_block.ras_late_init) + adev->gfx.ras->ras_block.ras_late_init = amdgpu_gfx_ras_late_init; + + /* If not defined special ras_cb function, use default ras_cb */ + if (!adev->gfx.ras->ras_block.ras_cb) + adev->gfx.ras->ras_block.ras_cb = amdgpu_gfx_process_ras_data_cb; + } + + if (adev->umc.ras) { + err = amdgpu_ras_register_ras_block(adev, &adev->umc.ras->ras_block); + if (err) { + dev_err(adev->dev, "Failed to register umc ras block!\n"); + return err; + } + + strcpy(adev->umc.ras->ra
[PATCH 1/3] drm/amdgpu/nv: enable clock gating for GC 10.3.7 subblock
This will enable the following block clock gating. - MC - SDMA - HDP - ATHUB - IH - VCN/JPEG Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/nv.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c index f6aeef759ee1..05487894120a 100644 --- a/drivers/gpu/drm/amd/amdgpu/nv.c +++ b/drivers/gpu/drm/amd/amdgpu/nv.c @@ -946,7 +946,17 @@ static int nv_common_early_init(void *handle) AMD_CG_SUPPORT_GFX_3D_CGLS | AMD_CG_SUPPORT_GFX_RLC_LS | AMD_CG_SUPPORT_GFX_CP_LS | - AMD_CG_SUPPORT_GFX_FGCG; + AMD_CG_SUPPORT_GFX_FGCG | + AMD_CG_SUPPORT_MC_MGCG | + AMD_CG_SUPPORT_MC_LS | + AMD_CG_SUPPORT_SDMA_LS | + AMD_CG_SUPPORT_HDP_MGCG | + AMD_CG_SUPPORT_HDP_LS | + AMD_CG_SUPPORT_ATHUB_MGCG | + AMD_CG_SUPPORT_ATHUB_LS | + AMD_CG_SUPPORT_IH_CG | + AMD_CG_SUPPORT_VCN_MGCG | + AMD_CG_SUPPORT_JPEG_MGCG; adev->pg_flags = AMD_PG_SUPPORT_VCN | AMD_PG_SUPPORT_VCN_DPG | AMD_PG_SUPPORT_JPEG; -- 2.17.1
[PATCH 2/3] drm/amdgpu: enable gfx power gating for GC 10.3.7
Enable gfx power gating for GC 10.3.7. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 ++ drivers/gpu/drm/amd/amdgpu/nv.c| 3 ++- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index fd7ded7799e2..e048635435a2 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -8348,6 +8348,7 @@ static void gfx_v10_cntl_power_gating(struct amdgpu_device *adev, bool enable) case IP_VERSION(10, 3, 1): case IP_VERSION(10, 3, 3): case IP_VERSION(10, 3, 6): + case IP_VERSION(10, 3, 7): data = 0x4E20 & RLC_PG_DELAY_3__CGCG_ACTIVE_BEFORE_CGPG_MASK_Vangogh; WREG32_SOC15(GC, 0, mmRLC_PG_DELAY_3, data); break; @@ -8417,6 +8418,7 @@ static int gfx_v10_0_set_powergating_state(void *handle, case IP_VERSION(10, 3, 1): case IP_VERSION(10, 3, 3): case IP_VERSION(10, 3, 6): + case IP_VERSION(10, 3, 7): gfx_v10_cntl_pg(adev, enable); amdgpu_gfx_off_ctrl(adev, enable); break; diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c index 05487894120a..e19f14c3ef59 100644 --- a/drivers/gpu/drm/amd/amdgpu/nv.c +++ b/drivers/gpu/drm/amd/amdgpu/nv.c @@ -959,7 +959,8 @@ static int nv_common_early_init(void *handle) AMD_CG_SUPPORT_JPEG_MGCG; adev->pg_flags = AMD_PG_SUPPORT_VCN | AMD_PG_SUPPORT_VCN_DPG | - AMD_PG_SUPPORT_JPEG; + AMD_PG_SUPPORT_JPEG | + AMD_PG_SUPPORT_GFX_PG; adev->external_rev_id = adev->rev_id + 0x01; break; default: -- 2.17.1
[PATCH 3/3] drm/amdgpu: enable gfxoff routine for GC 10.3.7
Enable gfxoff routine for GC 10.3.7. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 3 +++ drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 1 + 2 files changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index e048635435a2..92fdccc4a905 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -6557,6 +6557,7 @@ static void gfx_v10_0_kiq_setting(struct amdgpu_ring *ring) case IP_VERSION(10, 3, 5): case IP_VERSION(10, 3, 6): case IP_VERSION(10, 3, 3): + case IP_VERSION(10, 3, 7): tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid); tmp &= 0xff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); @@ -7857,6 +7858,7 @@ static void gfx_v10_0_set_safe_mode(struct amdgpu_device *adev) case IP_VERSION(10, 3, 5): case IP_VERSION(10, 3, 6): case IP_VERSION(10, 3, 3): + case IP_VERSION(10, 3, 7): WREG32_SOC15(GC, 0, mmRLC_SAFE_MODE_Sienna_Cichlid, data); /* wait for RLC_SAFE_MODE */ @@ -7894,6 +7896,7 @@ static void gfx_v10_0_unset_safe_mode(struct amdgpu_device *adev) case IP_VERSION(10, 3, 5): case IP_VERSION(10, 3, 6): case IP_VERSION(10, 3, 3): + case IP_VERSION(10, 3, 7): WREG32_SOC15(GC, 0, mmRLC_SAFE_MODE_Sienna_Cichlid, data); break; default: diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c index 261a3749c089..05783b9b4b9a 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -751,6 +751,7 @@ int smu_v13_0_gfx_off_control(struct smu_context *smu, bool enable) switch (adev->ip_versions[MP1_HWIP][0]) { case IP_VERSION(13, 0, 1): case IP_VERSION(13, 0, 3): + case IP_VERSION(13, 0, 8): if (!(adev->pm.pp_feature & PP_GFXOFF_MASK)) return 0; if (enable) -- 2.17.1
[PATCH 1/2] drm/amdgpu: move amdgpu_gmc_noretry_set after ip_versions populated
otherwise adev->ip_versions is still empty when amdgpu_gmc_noretry_set is called. Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 6113ddc765a7..61a6a7920c76 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -1554,7 +1554,6 @@ static int amdgpu_device_check_arguments(struct amdgpu_device *adev) amdgpu_gmc_tmz_set(adev); - amdgpu_gmc_noretry_set(adev); return 0; } @@ -3641,6 +3640,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, if (r) return r; + amdgpu_gmc_noretry_set(adev); /* Need to get xgmi info early to decide the reset behavior*/ if (adev->gmc.xgmi.supported) { r = adev->gfxhub.funcs->get_xgmi_info(adev); -- 2.25.1
[PATCH 2/2] drm/amdgpu: add noretry=1 for gc 10.3.6
From: Aaron Liu By default, set noretry=1 for kfd exception test. Signed-off-by: Aaron Liu Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index ab75e189bc0b..4bbd8501c6c1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -595,6 +595,13 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) gmc->noretry = amdgpu_noretry; break; } + + if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(10, 3, 6)) { + if (amdgpu_noretry == -1) + gmc->noretry = 1; + else + gmc->noretry = amdgpu_noretry; + } } void amdgpu_gmc_set_vm_fault_masks(struct amdgpu_device *adev, int hub_type, -- 2.25.1
Re: [PATCH 2/2] drm/amdgpu: add noretry=1 for gc 10.3.6
Am 01.03.22 um 14:05 schrieb Yifan Zhang: From: Aaron Liu By default, set noretry=1 for kfd exception test. Signed-off-by: Aaron Liu Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index ab75e189bc0b..4bbd8501c6c1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -595,6 +595,13 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) gmc->noretry = amdgpu_noretry; break; } + + if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(10, 3, 6)) { + if (amdgpu_noretry == -1) + gmc->noretry = 1; + else + gmc->noretry = amdgpu_noretry; + } You should probably move the whole switch case to version checks instead. Christian. } void amdgpu_gmc_set_vm_fault_masks(struct amdgpu_device *adev, int hub_type,
[PATCH Review 1/1] drm/amdgpu: support send bad channel info to smu
Message SMU bad channel information bitmap to update OOB table Change-Id: I49a79af64d5263c28db059ecb8b8405a471431b4 Signed-off-by: Stanley.Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 3 ++ .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c| 25 ++- .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h| 4 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 5 +++ drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 12 ++ drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 1 + drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 10 + drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 7 +++ .../pm/swsmu/inc/pmfw_if/aldebaran_ppsmc.h| 3 +- drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h | 3 +- .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 43 +++ 12 files changed, 119 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index d3875618ebf5..f9104f99eb9c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -2068,6 +2068,7 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev) mutex_init(&con->recovery_lock); INIT_WORK(&con->recovery_work, amdgpu_ras_do_recovery); atomic_set(&con->in_recovery, 0); + con->eeprom_control.bad_channel_bitmap = 0; max_eeprom_records_count = amdgpu_ras_eeprom_max_record_count(); amdgpu_ras_validate_threshold(adev, max_eeprom_records_count); @@ -2092,6 +2093,11 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev) goto free; amdgpu_dpm_send_hbm_bad_pages_num(adev, con->eeprom_control.ras_num_recs); + + if (con->update_channel_flag == true) { + amdgpu_dpm_send_hbm_bad_channel_flag(adev, con->eeprom_control.bad_channel_bitmap); + con->update_channel_flag = false; + } } #ifdef CONFIG_X86_MCE_AMD @@ -2285,6 +2291,7 @@ int amdgpu_ras_init(struct amdgpu_device *adev) goto release_con; } + con->update_channel_flag = false; con->features = 0; INIT_LIST_HEAD(&con->head); /* Might need get this flag from vbios. */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h index 7cddaad90d6d..9314fde81e68 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h @@ -374,6 +374,9 @@ struct amdgpu_ras { /* record umc error info queried from smu */ struct umc_ecc_info umc_ecc; + + /* Indicates smu whether need update bad channel info */ + bool update_channel_flag; }; struct ras_fs_data { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index 2b844a5aafdb..ad5d8667756d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -265,6 +265,7 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control) { struct amdgpu_device *adev = to_amdgpu_device(control); struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr; + struct amdgpu_ras *con = amdgpu_ras_get_context(adev); u8 csum; int res; @@ -285,6 +286,10 @@ int amdgpu_ras_eeprom_reset_table(struct amdgpu_ras_eeprom_control *control) amdgpu_dpm_send_hbm_bad_pages_num(adev, control->ras_num_recs); + control->bad_channel_bitmap = 0; + amdgpu_dpm_send_hbm_bad_channel_flag(adev, control->bad_channel_bitmap); + con->update_channel_flag = false; + amdgpu_ras_debugfs_set_ret_size(control); mutex_unlock(&control->ras_tbl_mutex); @@ -418,6 +423,7 @@ amdgpu_ras_eeprom_append_table(struct amdgpu_ras_eeprom_control *control, struct eeprom_table_record *record, const u32 num) { + struct amdgpu_ras *con = amdgpu_ras_get_context(to_amdgpu_device(control)); u32 a, b, i; u8 *buf, *pp; int res; @@ -429,9 +435,16 @@ amdgpu_ras_eeprom_append_table(struct amdgpu_ras_eeprom_control *control, /* Encode all of them in one go. */ pp = buf; - for (i = 0; i < num; i++, pp += RAS_TABLE_RECORD_SIZE) + for (i = 0; i < num; i++, pp += RAS_TABLE_RECORD_SIZE) { __encode_table_record_to_buf(control, &record[i], pp); + /* update bad channel bitmap */ + if (!(control->bad_channel_bitmap & (1 << record[i].mem_channel))) { + control->bad_channel_bitmap |= 1 << record[i].mem_channel; + con->update_channel_flag = true; + } + } + /* a, first record index to write into. * b, last record index to write into. * a = first index to read (fri)
回复: [PATCH] drm/amdgpu: Move common initialization operations of each ras block to one function
[AMD Official Use Only] Hi yipe, One suggestion for this patch, please check my comment. Regards, Stanley > -邮件原件- > 发件人: amd-gfx 代表 yipechai > 发送时间: Tuesday, March 1, 2022 5:46 PM > 收件人: amd-gfx@lists.freedesktop.org > 抄送: Zhou1, Tao ; Zhang, Hawking > ; Clements, John ; > Chai, Thomas ; Chai, Thomas > > 主题: [PATCH] drm/amdgpu: Move common initialization operations of each > ras block to one function > > Define amdgpu_ras_sw_init function to initialize all ras blocks. > > Signed-off-by: yipechai > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 + > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c| 2 - > drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c| 143 > - > drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h| 1 + > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 21 --- > drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 16 --- > drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 28 > drivers/gpu/drm/amd/amdgpu/mca_v3_0.c | 6 - > drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 17 --- > 9 files changed, 148 insertions(+), 92 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 6113ddc765a7..72550e9f6058 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -2402,6 +2402,12 @@ static int amdgpu_device_ip_init(struct > amdgpu_device *adev) > } > } > > + r = amdgpu_ras_sw_init(adev); > + if (r) { > + DRM_ERROR("amdgpu_ras_early_init failed (%d).\n", r); > + goto init_failed; > + } [Yang, Stanley] : This is ras blocks early init, I think it's more reasonable to move amdgpu_ras_sw_init before amdgpu_ras_init function. > + > if (amdgpu_sriov_vf(adev)) > amdgpu_virt_init_data_exchange(adev); > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > index ab75e189bc0b..544241f357b2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > @@ -440,8 +440,6 @@ int amdgpu_gmc_ras_early_init(struct > amdgpu_device *adev) { > if (!adev->gmc.xgmi.connected_to_cpu) { > adev->gmc.xgmi.ras = &xgmi_ras; > - amdgpu_ras_register_ras_block(adev, &adev- > >gmc.xgmi.ras->ras_block); > - adev->gmc.xgmi.ras_if = &adev->gmc.xgmi.ras- > >ras_block.ras_comm; > } > > return 0; > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > index d3875618ebf5..89075ab9e82e 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > @@ -2299,8 +2299,6 @@ int amdgpu_ras_init(struct amdgpu_device *adev) > case CHIP_ALDEBARAN: > if (!adev->gmc.xgmi.connected_to_cpu) { > adev->nbio.ras = &nbio_v7_4_ras; > - amdgpu_ras_register_ras_block(adev, &adev- > >nbio.ras->ras_block); > - adev->nbio.ras_if = &adev->nbio.ras- > >ras_block.ras_comm; > } > break; > default: > @@ -2533,6 +2531,147 @@ void amdgpu_ras_suspend(struct > amdgpu_device *adev) > amdgpu_ras_disable_all_features(adev, 1); } > > +int amdgpu_ras_sw_init(struct amdgpu_device *adev) { > + int err = 0; > + > + if (!amdgpu_ras_asic_supported(adev)) > + return 0; > + > + if (adev->nbio.ras) { > + err = amdgpu_ras_register_ras_block(adev, &adev- > >nbio.ras->ras_block); > + if (err) { > + dev_err(adev->dev, "Failed to register nbio ras > block!\n"); > + return err; > + } > + adev->nbio.ras_if = &adev->nbio.ras->ras_block.ras_comm; > + } > + > + if (adev->gmc.xgmi.ras) { > + err = amdgpu_ras_register_ras_block(adev, &adev- > >gmc.xgmi.ras->ras_block); > + if (err) { > + dev_err(adev->dev, "Failed to register xgmi ras > block!\n"); > + return err; > + } > + adev->gmc.xgmi.ras_if = &adev->gmc.xgmi.ras- > >ras_block.ras_comm; > + } > + > + if (adev->gfx.ras) { > + err = amdgpu_ras_register_ras_block(adev, &adev->gfx.ras- > >ras_block); > + if (err) { > + dev_err(adev->dev, "Failed to register gfx ras > block!\n"); > + return err; > + } > + > + strcpy(adev->gfx.ras->ras_block.ras_comm.name, "gfx"); > + adev->gfx.ras->ras_block.ras_comm.block = > AMDGPU_RAS_BLOCK__GFX; > + adev->gfx.ras->ras_block.ras_comm.type = > AMDGPU_RAS_ERROR__MULTI_UNCORRECTABLE; > + adev->gfx.ras_if = &adev->gfx.ras->ras_block.ras_comm; > + > + /* If not define special ras_late_init function, use gfx default > ras_late_init */ > + if (!adev->gfx.ras->ras_bloc
Re: [PATCH] drm/amdgpu: enable gfx clock gating control for GC 10.3.7
[Public] Reviewed-by: Alex Deucher From: Liang, Prike Sent: Monday, February 28, 2022 10:01 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Huang, Ray ; Liang, Prike Subject: [PATCH] drm/amdgpu: enable gfx clock gating control for GC 10.3.7 Enable gfx cg gate/ungate control for GC 10.3.7. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 90158289cd30..fd7ded7799e2 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -8445,6 +8445,7 @@ static int gfx_v10_0_set_clockgating_state(void *handle, case IP_VERSION(10, 3, 5): case IP_VERSION(10, 3, 6): case IP_VERSION(10, 3, 3): + case IP_VERSION(10, 3, 7): gfx_v10_0_update_gfx_clock_gating(adev, state == AMD_CG_STATE_GATE); break; -- 2.17.1
Re: [PATCH 3/3] drm/amdgpu: enable gfxoff routine for GC 10.3.7
Series is: Reviewed-by: Alex Deucher On Tue, Mar 1, 2022 at 7:30 AM Prike Liang wrote: > > Enable gfxoff routine for GC 10.3.7. > > Signed-off-by: Prike Liang > --- > drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 3 +++ > drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 1 + > 2 files changed, 4 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > index e048635435a2..92fdccc4a905 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > @@ -6557,6 +6557,7 @@ static void gfx_v10_0_kiq_setting(struct amdgpu_ring > *ring) > case IP_VERSION(10, 3, 5): > case IP_VERSION(10, 3, 6): > case IP_VERSION(10, 3, 3): > + case IP_VERSION(10, 3, 7): > tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid); > tmp &= 0xff00; > tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); > @@ -7857,6 +7858,7 @@ static void gfx_v10_0_set_safe_mode(struct > amdgpu_device *adev) > case IP_VERSION(10, 3, 5): > case IP_VERSION(10, 3, 6): > case IP_VERSION(10, 3, 3): > + case IP_VERSION(10, 3, 7): > WREG32_SOC15(GC, 0, mmRLC_SAFE_MODE_Sienna_Cichlid, data); > > /* wait for RLC_SAFE_MODE */ > @@ -7894,6 +7896,7 @@ static void gfx_v10_0_unset_safe_mode(struct > amdgpu_device *adev) > case IP_VERSION(10, 3, 5): > case IP_VERSION(10, 3, 6): > case IP_VERSION(10, 3, 3): > + case IP_VERSION(10, 3, 7): > WREG32_SOC15(GC, 0, mmRLC_SAFE_MODE_Sienna_Cichlid, data); > break; > default: > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c > b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c > index 261a3749c089..05783b9b4b9a 100644 > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c > @@ -751,6 +751,7 @@ int smu_v13_0_gfx_off_control(struct smu_context *smu, > bool enable) > switch (adev->ip_versions[MP1_HWIP][0]) { > case IP_VERSION(13, 0, 1): > case IP_VERSION(13, 0, 3): > + case IP_VERSION(13, 0, 8): > if (!(adev->pm.pp_feature & PP_GFXOFF_MASK)) > return 0; > if (enable) > -- > 2.17.1 >
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
> On 1. Mar 2022, at 01:41, Linus Torvalds > wrote: > > On Mon, Feb 28, 2022 at 1:47 PM Jakob Koschel wrote: >> >> The goal of this is to get compiler warnings right? This would indeed be >> great. > > Yes, so I don't mind having a one-time patch that has been gathered > using some automated checker tool, but I don't think that works from a > long-term maintenance perspective. > > So if we have the basic rule being "don't use the loop iterator after > the loop has finished, because it can cause all kinds of subtle > issues", then in _addition_ to fixing the existing code paths that > have this issue, I really would want to (a) get a compiler warning for > future cases and (b) make it not actually _work_ for future cases. > > Because otherwise it will just happen again. > >> Changing the list_for_each_entry() macro first will break all of those cases >> (e.g. the ones using 'list_entry_is_head()). > > So I have no problems with breaking cases that we basically already > have a patch for due to your automated tool. There were certainly > more than a handful, but it didn't look _too_ bad to just make the > rule be "don't use the iterator after the loop". > > Of course, that's just based on that patch of yours. Maybe there are a > ton of other cases that your patch didn't change, because they didn't > match your trigger case, so I may just be overly optimistic here. Based on the coccinelle script there are ~480 cases that need fixing in total. I'll now finish all of them and then split them by submodules as Greg suggested and repost a patch set per submodule. Sounds good? > > But basically to _me_, the important part is that the end result is > maintainable longer-term. I'm more than happy to have a one-time patch > to fix a lot of dubious cases if we can then have clean rules going > forward. > >> I assumed it is better to fix those cases first and then have a simple >> coccinelle script changing the macro + moving the iterator into the scope >> of the macro. > > So that had been another plan of mine, until I actually looked at > changing the macro. In the one case I looked at, it was ugly beyond > belief. > > It turns out that just syntactically, it's really nice to give the > type of the iterator from outside the way we do now. Yeah, it may be a > bit odd, and maybe it's partly because I'm so used to the > "list_for_each_list_entry()" syntax, but moving the type into the loop > construct really made it nasty - either one very complex line, or > having to split it over two lines which was even worse. > > Maybe the place I looked at just happened to have a long typename, but > it's basically always going to be a struct, so it's never a _simple_ > type. And it just looked very odd adn unnatural to have the type as > one of the "arguments" to that list_for_each_entry() macro. > > So yes, initially my idea had been to just move the iterator entirely > inside the macro. But specifying the type got so ugly that I think > that > >typeof (pos) pos > > trick inside the macro really ends up giving us the best of all worlds: > > (a) let's us keep the existing syntax and code for all the nice cases > that did everything inside the loop anyway > > (b) gives us a nice warning for any normal use-after-loop case > (unless you explicitly initialized it like that > sgx_mmu_notifier_release() function did for no good reason > > (c) also guarantees that even if you don't get a warning, > non-converted (or newly written) bad code won't actually _work_ > > so you end up getting the new rules without any ambiguity or mistaken > >> With this you are no longer able to set the 'outer' pos within the list >> iterator loop body or am I missing something? > > Correct. Any assignment inside the loop will be entirely just to the > local loop case. So any "break;" out of the loop will have to set > another variable - like your updated patch did. > >> I fail to see how this will make most of the changes in this >> patch obsolete (if that was the intention). > > I hope my explanation above clarifies my thinking: I do not dislike > your patch, and in fact your patch is indeed required to make the new > semantics work. ok it's all clear now, thanks for clarifying. I've defined all the 'tmp' iterator variables uninitialized so applying your patch on top of that later will just give the nice compiler warning if they are used past the loop body. > > What I disliked was always the maintainability of your patch - making > the rules be something that isn't actually visible in the source code, > and letting the old semantics still work as well as they ever did, and > having to basically run some verification pass to find bad users. Since this patch is not a complete list of cases that need fixing (30%) I haven't included the actual change of moving the iterator variable into the loop and thought that would be a second step coming after this is merged. With these changes alone, yes you still rely on manu
撤回: [PATCH] drm/amdgpu: Move common initialization operations of each ras block to one function
Yang, Stanley 将撤回邮件“[PATCH] drm/amdgpu: Move common initialization operations of each ras block to one function”。
[PATCH v2 1/3] drm/amdgpu: move amdgpu_gmc_noretry_set after ip_versions populated
otherwise adev->ip_versions is still empty when amdgpu_gmc_noretry_set is called. Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 6113ddc765a7..61a6a7920c76 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -1554,7 +1554,6 @@ static int amdgpu_device_check_arguments(struct amdgpu_device *adev) amdgpu_gmc_tmz_set(adev); - amdgpu_gmc_noretry_set(adev); return 0; } @@ -3641,6 +3640,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, if (r) return r; + amdgpu_gmc_noretry_set(adev); /* Need to get xgmi info early to decide the reset behavior*/ if (adev->gmc.xgmi.supported) { r = adev->gfxhub.funcs->get_xgmi_info(adev); -- 2.25.1
[PATCH v2 2/3] drm/amdgpu: convert code name to ip version for noretry set
Use IP version rather than codename for noretry set. Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index ab75e189bc0b..fbc22b7b6315 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -559,14 +559,14 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) { struct amdgpu_gmc *gmc = &adev->gmc; - switch (adev->asic_type) { - case CHIP_VEGA10: - case CHIP_VEGA20: - case CHIP_ARCTURUS: - case CHIP_ALDEBARAN: - case CHIP_BEIGE_GOBY: - case CHIP_YELLOW_CARP: - case CHIP_RENOIR: + switch (adev->ip_versions[GC_HWIP][0]) { + case IP_VERSION(9, 0, 1): + case IP_VERSION(9, 4, 0): + case IP_VERSION(9, 4, 1): + case IP_VERSION(9, 4, 2): + case IP_VERSION(10, 3, 5): + case IP_VERSION(10, 3, 3): + case IP_VERSION(9, 3, 0): /* * noretry = 0 will cause kfd page fault tests fail * for some ASICs, so set default to 1 for these ASICs. @@ -576,7 +576,6 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) else gmc->noretry = amdgpu_noretry; break; - case CHIP_RAVEN: default: /* Raven currently has issues with noretry * regardless of what we decide for other -- 2.25.1
[PATCH v2 3/3] drm/amdgpu: set noretry=1 for gc 10.3.6
this patch to set noretry=1 for gc 10.3.6. Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index fbc22b7b6315..7c2a9555b7cc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -567,6 +567,7 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) case IP_VERSION(10, 3, 5): case IP_VERSION(10, 3, 3): case IP_VERSION(9, 3, 0): + case IP_VERSION(10, 3, 6): /* * noretry = 0 will cause kfd page fault tests fail * for some ASICs, so set default to 1 for these ASICs. -- 2.25.1
Re: [PATCH v2 1/3] drm/amdgpu: move amdgpu_gmc_noretry_set after ip_versions populated
Series is: Reviewed-by: Alex Deucher On Tue, Mar 1, 2022 at 10:02 AM Yifan Zhang wrote: > > otherwise adev->ip_versions is still empty when amdgpu_gmc_noretry_set > is called. > > Signed-off-by: Yifan Zhang > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 6113ddc765a7..61a6a7920c76 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -1554,7 +1554,6 @@ static int amdgpu_device_check_arguments(struct > amdgpu_device *adev) > > amdgpu_gmc_tmz_set(adev); > > - amdgpu_gmc_noretry_set(adev); > > return 0; > } > @@ -3641,6 +3640,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, > if (r) > return r; > > + amdgpu_gmc_noretry_set(adev); > /* Need to get xgmi info early to decide the reset behavior*/ > if (adev->gmc.xgmi.supported) { > r = adev->gfxhub.funcs->get_xgmi_info(adev); > -- > 2.25.1 >
Re: [PATCH v2 2/3] drm/amdgpu: convert code name to ip version for noretry set
Dear Yifan, Thank you for your patch. Am 01.03.22 um 16:01 schrieb Yifan Zhang: Use IP version rather than codename for noretry set. Why? Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index ab75e189bc0b..fbc22b7b6315 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -559,14 +559,14 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) { struct amdgpu_gmc *gmc = &adev->gmc; - switch (adev->asic_type) { - case CHIP_VEGA10: - case CHIP_VEGA20: - case CHIP_ARCTURUS: - case CHIP_ALDEBARAN: - case CHIP_BEIGE_GOBY: - case CHIP_YELLOW_CARP: - case CHIP_RENOIR: + switch (adev->ip_versions[GC_HWIP][0]) { + case IP_VERSION(9, 0, 1): + case IP_VERSION(9, 4, 0): + case IP_VERSION(9, 4, 1): + case IP_VERSION(9, 4, 2): + case IP_VERSION(10, 3, 5): + case IP_VERSION(10, 3, 3): + case IP_VERSION(9, 3, 0): I think, sorting these entries might be useful. Should the names be added as comments for those not having them memorized? /* * noretry = 0 will cause kfd page fault tests fail * for some ASICs, so set default to 1 for these ASICs. @@ -576,7 +576,6 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) else gmc->noretry = amdgpu_noretry; break; - case CHIP_RAVEN: Why remove this? default: /* Raven currently has issues with noretry * regardless of what we decide for other Kind regards, Paul
[PATCH 3/3] drm/amdgpu/sdma5: drop unused cyan skillfish firmware
Leftover from bring up. Not used anymore. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 +--- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index 81e033549dda..53a8df4b030e 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -51,9 +51,6 @@ MODULE_FIRMWARE("amdgpu/navi14_sdma1.bin"); MODULE_FIRMWARE("amdgpu/navi12_sdma.bin"); MODULE_FIRMWARE("amdgpu/navi12_sdma1.bin"); -MODULE_FIRMWARE("amdgpu/cyan_skillfish_sdma.bin"); -MODULE_FIRMWARE("amdgpu/cyan_skillfish_sdma1.bin"); - MODULE_FIRMWARE("amdgpu/cyan_skillfish2_sdma.bin"); MODULE_FIRMWARE("amdgpu/cyan_skillfish2_sdma1.bin"); @@ -264,10 +261,7 @@ static int sdma_v5_0_init_microcode(struct amdgpu_device *adev) chip_name = "navi12"; break; case IP_VERSION(5, 0, 1): - if (adev->apu_flags & AMD_APU_IS_CYAN_SKILLFISH2) - chip_name = "cyan_skillfish2"; - else - chip_name = "cyan_skillfish"; + chip_name = "cyan_skillfish2"; break; default: BUG(); -- 2.35.1
[PATCH 2/3] drm/amdgpu/gfx10: drop unused cyan skillfish firmware
Leftover from bring up. Not used anymore. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 12 +--- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 90158289cd30..0f350a5c4bf2 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -250,13 +250,6 @@ MODULE_FIRMWARE("amdgpu/yellow_carp_mec.bin"); MODULE_FIRMWARE("amdgpu/yellow_carp_mec2.bin"); MODULE_FIRMWARE("amdgpu/yellow_carp_rlc.bin"); -MODULE_FIRMWARE("amdgpu/cyan_skillfish_ce.bin"); -MODULE_FIRMWARE("amdgpu/cyan_skillfish_pfp.bin"); -MODULE_FIRMWARE("amdgpu/cyan_skillfish_me.bin"); -MODULE_FIRMWARE("amdgpu/cyan_skillfish_mec.bin"); -MODULE_FIRMWARE("amdgpu/cyan_skillfish_mec2.bin"); -MODULE_FIRMWARE("amdgpu/cyan_skillfish_rlc.bin"); - MODULE_FIRMWARE("amdgpu/cyan_skillfish2_ce.bin"); MODULE_FIRMWARE("amdgpu/cyan_skillfish2_pfp.bin"); MODULE_FIRMWARE("amdgpu/cyan_skillfish2_me.bin"); @@ -4043,10 +4036,7 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev) break; case IP_VERSION(10, 1, 3): case IP_VERSION(10, 1, 4): - if (adev->apu_flags & AMD_APU_IS_CYAN_SKILLFISH2) - chip_name = "cyan_skillfish2"; - else - chip_name = "cyan_skillfish"; + chip_name = "cyan_skillfish2"; break; case IP_VERSION(10, 3, 7): chip_name = "gc_10_3_7"; -- 2.35.1
[PATCH 1/3] drm/amdgpu: remove unused gpu_info firmwares
These were leftover from bring up and are no longer necessary. The information is available via the IP discovery table. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 23 -- 1 file changed, 23 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 6113ddc765a7..2f732a1758d6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -80,12 +80,7 @@ MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin"); MODULE_FIRMWARE("amdgpu/picasso_gpu_info.bin"); MODULE_FIRMWARE("amdgpu/raven2_gpu_info.bin"); MODULE_FIRMWARE("amdgpu/arcturus_gpu_info.bin"); -MODULE_FIRMWARE("amdgpu/renoir_gpu_info.bin"); -MODULE_FIRMWARE("amdgpu/navi10_gpu_info.bin"); -MODULE_FIRMWARE("amdgpu/navi14_gpu_info.bin"); MODULE_FIRMWARE("amdgpu/navi12_gpu_info.bin"); -MODULE_FIRMWARE("amdgpu/vangogh_gpu_info.bin"); -MODULE_FIRMWARE("amdgpu/yellow_carp_gpu_info.bin"); #define AMDGPU_RESUME_MS 2000 #define AMDGPU_MAX_RETRY_LIMIT 2 @@ -1992,27 +1987,9 @@ static int amdgpu_device_parse_gpu_info_fw(struct amdgpu_device *adev) case CHIP_ARCTURUS: chip_name = "arcturus"; break; - case CHIP_RENOIR: - if (adev->apu_flags & AMD_APU_IS_RENOIR) - chip_name = "renoir"; - else - chip_name = "green_sardine"; - break; - case CHIP_NAVI10: - chip_name = "navi10"; - break; - case CHIP_NAVI14: - chip_name = "navi14"; - break; case CHIP_NAVI12: chip_name = "navi12"; break; - case CHIP_VANGOGH: - chip_name = "vangogh"; - break; - case CHIP_YELLOW_CARP: - chip_name = "yellow_carp"; - break; } snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_gpu_info.bin", chip_name); -- 2.35.1
Re: [PATCH] mm: split vm_normal_pages for LRU and non-LRU handling
Am 2022-03-01 um 03:03 schrieb David Hildenbrand: On 28.02.22 21:34, Alex Sierra wrote: DEVICE_COHERENT pages introduce a subtle distinction in the way "normal" pages can be used by various callers throughout the kernel. They behave like normal pages for purposes of mapping in CPU page tables, and for COW. But they do not support LRU lists, NUMA migration or THP. Therefore we split vm_normal_page into two functions vm_normal_any_page and vm_normal_lru_page. The latter will only return pages that can be put on an LRU list and that support NUMA migration and THP. Why not s/vm_normal_any_page/vm_normal_page/ and avoid code churn? I don't care much, personally, what names we end up with. I'll wait for a consensus here. We also introduced a FOLL_LRU flag that adds the same behaviour to follow_page and related APIs, to allow callers to specify that they expect to put pages on an LRU list. [...] -#define FOLL_WRITE 0x01/* check pte is writable */ -#define FOLL_TOUCH 0x02/* mark page accessed */ -#define FOLL_GET 0x04/* do get_page on page */ -#define FOLL_DUMP 0x08/* give error on hole if it would be zero */ -#define FOLL_FORCE 0x10/* get_user_pages read/write w/o permission */ -#define FOLL_NOWAIT0x20/* if a disk transfer is needed, start the IO -* and return without waiting upon it */ -#define FOLL_POPULATE 0x40/* fault in pages (with FOLL_MLOCK) */ -#define FOLL_NOFAULT 0x80/* do not fault in pages */ -#define FOLL_HWPOISON 0x100 /* check page is hwpoisoned */ -#define FOLL_NUMA 0x200 /* force NUMA hinting page fault */ -#define FOLL_MIGRATION 0x400 /* wait for page to replace migration entry */ -#define FOLL_TRIED 0x800 /* a retry, previous pass started an IO */ -#define FOLL_MLOCK 0x1000 /* lock present pages */ -#define FOLL_REMOTE0x2000 /* we are working on non-current tsk/mm */ -#define FOLL_COW 0x4000 /* internal GUP flag */ -#define FOLL_ANON 0x8000 /* don't do file mappings */ -#define FOLL_LONGTERM 0x1 /* mapping lifetime is indefinite: see below */ -#define FOLL_SPLIT_PMD 0x2 /* split huge pmd before returning */ -#define FOLL_PIN 0x4 /* pages must be released via unpin_user_page */ -#define FOLL_FAST_ONLY 0x8 /* gup_fast: prevent fall-back to slow gup */ +#define FOLL_WRITE 0x01 /* check pte is writable */ +#define FOLL_TOUCH 0x02 /* mark page accessed */ +#define FOLL_GET 0x04 /* do get_page on page */ +#define FOLL_DUMP 0x08 /* give error on hole if it would be zero */ +#define FOLL_FORCE 0x10 /* get_user_pages read/write w/o permission */ +#define FOLL_NOWAIT0x20 /* if a disk transfer is needed, start the IO + * and return without waiting upon it */ +#define FOLL_POPULATE 0x40 /* fault in pages (with FOLL_MLOCK) */ +#define FOLL_NOFAULT 0x80 /* do not fault in pages */ +#define FOLL_HWPOISON 0x100/* check page is hwpoisoned */ +#define FOLL_NUMA 0x200/* force NUMA hinting page fault */ +#define FOLL_MIGRATION 0x400/* wait for page to replace migration entry */ +#define FOLL_TRIED 0x800/* a retry, previous pass started an IO */ +#define FOLL_MLOCK 0x1000 /* lock present pages */ +#define FOLL_REMOTE0x2000 /* we are working on non-current tsk/mm */ +#define FOLL_COW 0x4000 /* internal GUP flag */ +#define FOLL_ANON 0x8000 /* don't do file mappings */ +#define FOLL_LONGTERM 0x1 /* mapping lifetime is indefinite: see below */ +#define FOLL_SPLIT_PMD 0x2 /* split huge pmd before returning */ +#define FOLL_PIN 0x4 /* pages must be released via unpin_user_page */ +#define FOLL_FAST_ONLY 0x8 /* gup_fast: prevent fall-back to slow gup */ +#define FOLL_LRU 0x10 /* return only LRU (anon or page cache) */ Can we minimize code churn, please? OK. I guess we could unindent the FOLL_LRU number to avoid changing all the comments. if (PageReserved(page)) diff --git a/mm/migrate.c b/mm/migrate.c index c31d04b46a5e..17d049311b78 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1614,7 +1614,7 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr, goto out; /* FOLL_DUMP to ignore special (like zero) pages */ - follflags = FOLL_GET | FOLL_DUMP; + follflags = FOLL_GET | FOLL_DUMP | FOLL_LRU; page = follow_page(vma, addr, follflags); Why wouldn't we want to dump DEVICE_COHERENT pages? This looks wrong. This function later calls isolate_lru_page, which is something you can't do with a device page. Regards, Felix
RE: [PATCH] drm/amdgpu: Move CAP firmware loading to the beginning of PSP firmware list
[AMD Official Use Only] Reviewed-by: Bokun Zhang Thanks! -Original Message- From: Yifan Zha Sent: Tuesday, March 1, 2022 2:19 AM To: amd-gfx@lists.freedesktop.org; Zhang, Bokun Cc: Chen, Jingwen ; Liu, Monk ; Chen, Guchun ; Zha, YiFan(Even) Subject: [PATCH] drm/amdgpu: Move CAP firmware loading to the beginning of PSP firmware list [Why] As PSP needs to verify the signature, CAP firmware must be loaded first when PSP loads firmwares. Otherwise, when DFC feature is enabled, CP firmwares would be loaded failed. [ 1149.160480] [drm] MM table gpu addr = 0x800022f000, cpu addr = a62afcea. [ 1149.209874] [drm] failed to load ucode CP_CE(0x8) [ 1149.209878] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x0007) [ 1149.215914] [drm] failed to load ucode CP_PFP(0x9) [ 1149.215917] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x0007) [ 1149.221941] [drm] failed to load ucode CP_ME(0xA) [ 1149.221944] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x0007) [ 1149.228082] [drm] failed to load ucode CP_MEC1(0xB) [ 1149.228085] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x0007) [ 1149.234209] [drm] failed to load ucode CP_MEC2(0xD) [ 1149.234212] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x0007) [ 1149.242379] [drm] failed to load ucode VCN(0x1C) [ 1149.242382] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x0007) [How] Move CAP UCODE ID to the beginning of AMDGPU_UCODE_ID enum list. Signed-off-by: Yifan Zha --- drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h index 428f4df184d0..40dffbac85a0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h @@ -343,7 +343,8 @@ union amdgpu_firmware_header { * fw loading support */ enum AMDGPU_UCODE_ID { - AMDGPU_UCODE_ID_SDMA0 = 0, + AMDGPU_UCODE_ID_CAP = 0, + AMDGPU_UCODE_ID_SDMA0, AMDGPU_UCODE_ID_SDMA1, AMDGPU_UCODE_ID_SDMA2, AMDGPU_UCODE_ID_SDMA3, @@ -378,7 +379,6 @@ enum AMDGPU_UCODE_ID { AMDGPU_UCODE_ID_VCN0_RAM, AMDGPU_UCODE_ID_VCN1_RAM, AMDGPU_UCODE_ID_DMCUB, - AMDGPU_UCODE_ID_CAP, AMDGPU_UCODE_ID_MAXIMUM, }; -- 2.25.1
Re: [PATCH] mm: split vm_normal_pages for LRU and non-LRU handling
Am 2022-03-01 um 11:22 schrieb David Hildenbrand: if (PageReserved(page)) diff --git a/mm/migrate.c b/mm/migrate.c index c31d04b46a5e..17d049311b78 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1614,7 +1614,7 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr, goto out; /* FOLL_DUMP to ignore special (like zero) pages */ - follflags = FOLL_GET | FOLL_DUMP; + follflags = FOLL_GET | FOLL_DUMP | FOLL_LRU; page = follow_page(vma, addr, follflags); Why wouldn't we want to dump DEVICE_COHERENT pages? This looks wrong. This function later calls isolate_lru_page, which is something you can't do with a device page. Then, that code might require care instead. We most certainly don't want to have random memory holes in a dump just because some anonymous memory was migrated to DEVICE_COHERENT. I don't think this code is for core dumps. The call chain I see is SYSCALL_DEFINE6(move_pages, ...) -> kernel_move_pages -> do_pages_move -> add_page_for_migration As far as I can tell this is for NUMA migrations. Regards, Felix
[PATCH] drm/amdgpu/discovery: add a comment about ip_discovery firmware
The firmware is optional and only used as a fallback when the IP discovery table is not available (e.g., during asic bring up). Once the chip goes into production, the table will always be available on the part itself. Unfortunately, there is no way to mark a firmware as optional. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index e4fcbb385a62..31d86d083968 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -67,6 +67,13 @@ #include "smuio_v11_0_6.h" #include "smuio_v13_0.h" +/* This firmware is optional, but there is no way to + * specify a firmware file as optional. It is only + * used as a fallback when the IP discovery table + * is not available for some reason (e.g., during + * bring up). Once the product is in production + * the IP discovery table is shipped on the part itself. + */ #define FIRMWARE_IP_DISCOVERY "amdgpu/ip_discovery.bin" MODULE_FIRMWARE(FIRMWARE_IP_DISCOVERY); -- 2.35.1
[PATCH 00/12] use dynamic-debug under drm.debug api
hi Jason, Greg, Daniel, DRM-everyone drm.debug api provides ~23 macros to issue 10 categories of debug messages, each enabled by a bit in /sys/module/drm/parameters/debug. drm_debug_enabled(category) tests these bits at runtime; while cheap individually, the costs accumulate. Daniel, I think this revision addresses most of your early review, a lot has changed since. Heres the link: https://patchwork.freedesktop.org/patch/443989/ For CONFIG_DRM_USE_DYNAMIC_DEBUG=y, this patchset obsoletes those runtime tests (inside drm_*dbg) by wrapping the 2 fns in one of the dynamic_func_call* Factory macros. The config dependence is due to the .data footprint cost of the tables; AMDGPU has ~4k callsites, at 56 bytes each. This patchset creates entries in /proc/dynamic_debug/control for each callsite, and each has .class_id = macros' category. Those entries, and a new query keyword, allow (1st): # 1=DRM_UT_KMS (iirc) #> echo "module drm class 1 +p > /proc/dynamic_debug/control Then equivalently: # except it also clears other flags #> echo 0x01 > /sys/module/drm/parameters/debug series overview: dyndbg: - fix a bug in dyndbg static_key toggling, @stable cc'd - adds support for distinct classes to dyndbg (new,unused feature) - add DECLARE_DYNAMIC_DEBUG_CLASSBITS macro and callbacks to implement bitmap -> classid sysfs knob dyndbg: - drops exported fn: dynamic_debug_exec_queries() any potential users would just use macro, or a tweak on it. - improve info-msg to print both "old -> new" flags drm: - adapts drm debug category to dyndbg.class_id - wraps drm_*dbg() in a dyndbg Factory macro to get NOOP optimized debugs this disconnects drm.debug sysfs knob - uses DECLARE_DYNAMIC_DEBUG_CLASSBITS macro this reconnects sysfs knob This could be -v12, but the focus and subject has wandered a bit, and patchwork CI had multiple different notions of the version. Noteworthy changes: - no tracefs stuff here, refocus In contrast, the previous drm.debug approach: - replaced drm_dbg & drm_devdbg with calls to pr_debug & dev_dbg this preserved the optional decorations: module:function:line: - used DRM_UT_CORE => "drm:core:" prefix-string, cpp cat'd to formats this made sites selectable by matching to that format prefix This version: - .class_id is easier to explain, and no config/format-string diffs - wraps drm_dbg & drm_devdbg callsites for jumplabel enablement efficiency was original goal. - loses the optional decorations. drm has its own logmsg standards, doesn't need decorations slapped on later: could recast flags for drm specific decorations This is based on 5.17-rc4, for no particular reason. Its also here: in (dd-drm branch) ghlinux-rohttps://github.com/jimc/linux.git (fetch) Jim Cromie (13): dyndbg: fix static_branch manipulation @stable dyndbg: add class_id field and query support dyndbg: add DEFINE_DYNAMIC_DEBUG_CLASSBITS macro and callbacks dyndbg: drop EXPORTed dynamic_debug_exec_queries dyndbg: improve change-info to have old and new dyndbg: abstract dyndbg_site_is_printing drm_print: condense enum drm_debug_category drm_print: interpose drm_*dbg with forwarding macros drm_print: wrap drm_*_dbg in dyndbg jumplabel drm_print: refine drm_debug_enabled for dyndbg+jump-label drm_print: prefer bare printk KERN_DEBUG on generic fn drm_print: add _ddebug desc to drm_*dbg prototypes drm_print: use DEFINE_DYNAMIC_DEBUG_CLASSBITS for drm.debug .../admin-guide/dynamic-debug-howto.rst | 7 + drivers/gpu/drm/Kconfig | 12 ++ drivers/gpu/drm/Makefile | 2 + drivers/gpu/drm/drm_print.c | 56 --- include/drm/drm_print.h | 80 +++--- include/linux/dynamic_debug.h | 113 +++--- lib/dynamic_debug.c | 140 ++ 7 files changed, 323 insertions(+), 87 deletions(-) -- 2.35.1
[PATCH 01/13] dyndbg: fix static_branch manipulation
In https://lore.kernel.org/lkml/20211209150910.ga23...@axis.com/ Vincent's patch commented on, and worked around, a bug toggling static_branch's, when a 2nd PRINTK-ish flag was added. The bug results in a premature static_branch_disable when the 1st of 2 flags was disabled. The cited commit computed newflags, but then in the JUMP_LABEL block, did not use that result, but used just one of the terms in it. Using newflags instead made the code work properly. This is Vincents test-case, reduced. It needs the 2nd flag to work properly, but it's explanatory here. pt_test() { echo 5 > /sys/module/dynamic_debug/verbose site="module tcp" # just one callsite echo " $site =_ " > /proc/dynamic_debug/control # clear it # A B ~A ~B for flg in +T +p "-T #broke here" -p; do echo " $site $flg " > /proc/dynamic_debug/control done; # A B ~B ~A for flg in +T +p "-p #broke here" -T; do echo " $site $flg " > /proc/dynamic_debug/control done } pt_test Fixes: 84da83a6ffc0 dyndbg: combine flags & mask into a struct, simplify with it CC: vincent.whitchu...@axis.com CC: sta...@vger.kernel.org Signed-off-by: Jim Cromie --- lib/dynamic_debug.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c index dd7f56af9aed..a56c1286ffa4 100644 --- a/lib/dynamic_debug.c +++ b/lib/dynamic_debug.c @@ -211,10 +211,11 @@ static int ddebug_change(const struct ddebug_query *query, continue; #ifdef CONFIG_JUMP_LABEL if (dp->flags & _DPRINTK_FLAGS_PRINT) { - if (!(modifiers->flags & _DPRINTK_FLAGS_PRINT)) + if (!(newflags & _DPRINTK_FLAGS_PRINT)) static_branch_disable(&dp->key.dd_key_true); - } else if (modifiers->flags & _DPRINTK_FLAGS_PRINT) + } else if (newflags & _DPRINTK_FLAGS_PRINT) { static_branch_enable(&dp->key.dd_key_true); + } #endif dp->flags = newflags; v4pr_info("changed %s:%d [%s]%s =%s\n", -- 2.35.1
[PATCH 02/13] dyndbg: add class_id field and query support
DRM defines/uses 10 enum drm_debug_category's to create exclusive classes of debug messages. To support this directly in dynamic-debug, add the following: - struct _ddebug.class_id:4 - 4 bits is enough - define _DPRINTK_SITE_UNCLASSED 15 - see below and the query support: - struct _ddebug_query.class_id - ddebug_parse_query- looks for "class" keyword, then calls.. - parse_class - accepts uint: 0-15, sets query.class_id and marker - vpr_info_dq - displays new field - ddebug_proc_show - append column with "cls:%d" if !UNCLASSED With the patch, this command enables current/unclassed callsites: #> echo drm class 15 +p > /proc/dynamic_debug/control parse_class() accepts 0 .. _DPRINTK_SITE_UNCLASSED, which allows the >control interface to explicitly manipulate unclassed callsites. After parsing keywords, ddebug_parse_query() sets .class_id=15, iff it wasnt explicitly set. This allows future classed/categorized callsites to be untouched by legacy (class unaware) queries. DEFINE_DYNAMIC_DEBUG_METADATA gets _CLS(cls,) suffix and arg, and initializes the new .class_id=cls field. The old name gets the default. Then, these _CLS(cls,...) modifications are repeated up through the stack of *dynamic_func_call* macros that use the METADATA initializer, so as to actually supply the category into it. NOTES: _DPRINTK_SITE_UNCLASSED: this symbol is used to initialize all existing/unclassed pr-debug callsites. Normally, the default would be zero, but DRM_UT_CORE "uses" that value, in the sense that 0 is exposed as a bit position in drm.debug. Using 15 allows identity mapping from category to class, avoiding fiddly offsets. CC: Rasmus Villemoes Signed-off-by: Jim Cromie fixup class-id preset fix2 --- .../admin-guide/dynamic-debug-howto.rst | 7 +++ include/linux/dynamic_debug.h | 54 ++- lib/dynamic_debug.c | 48 ++--- 3 files changed, 88 insertions(+), 21 deletions(-) diff --git a/Documentation/admin-guide/dynamic-debug-howto.rst b/Documentation/admin-guide/dynamic-debug-howto.rst index a89cfa083155..8ef8d7dcd140 100644 --- a/Documentation/admin-guide/dynamic-debug-howto.rst +++ b/Documentation/admin-guide/dynamic-debug-howto.rst @@ -35,6 +35,7 @@ Dynamic debug has even more useful features: - line number (including ranges of line numbers) - module name - format string + - class number:0-15 * Provides a debugfs control file: ``/dynamic_debug/control`` which can be read to display the complete list of known debug @@ -143,6 +144,7 @@ against. Possible keywords are::: 'module' string | 'format' string | 'line' line-range +'class' integer:[0-15] line-range ::= lineno | '-'lineno | @@ -217,6 +219,11 @@ line line -1605 // the 1605 lines from line 1 to line 1605 line 1600- // all lines from line 1600 to the end of the file +class +This expects a single integer in range: 0-15. +15 is used/reserved for existing/unclassed callsites, +and is defaulted in unless specified to >control + The flags specification comprises a change operation followed by one or more flag characters. The change operation is one of the characters:: diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h index dce631e678dd..d4b48f3cc6e8 100644 --- a/include/linux/dynamic_debug.h +++ b/include/linux/dynamic_debug.h @@ -6,6 +6,8 @@ #include #endif +#include + /* * An instance of this structure is created in a special * ELF section at every dynamic debug callsite. At runtime, @@ -21,6 +23,9 @@ struct _ddebug { const char *filename; const char *format; unsigned int lineno:18; +#define CLS_BITS 4 + unsigned int class_id:CLS_BITS; +#define _DPRINTK_SITE_UNCLASSED((1 << CLS_BITS) - 1) /* * The flags field controls the behaviour at the callsite. * The bits here are changed dynamically when the user @@ -87,7 +92,7 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor, const struct ib_device *ibdev, const char *fmt, ...); -#define DEFINE_DYNAMIC_DEBUG_METADATA(name, fmt) \ +#define DEFINE_DYNAMIC_DEBUG_METADATA_CLS(name, cls, fmt) \ static struct _ddebug __aligned(8) \ __section("__dyndbg") name = { \ .modname = KBUILD_MODNAME, \ @@ -96,8 +101,14 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor, .format = (fmt),\ .lineno = __LINE__, \ .flags = _DPRINTK_FLAGS_DEFAULT,\ + .class_id = cls,\ _DPRINTK_KEY_INIT
[PATCH 03/13] dyndbg: add DEFINE_DYNAMIC_DEBUG_CLASSBITS macro and callbacks
DEFINE_DYNAMIC_DEBUG_CLASSBITS(fsname, var, bitmap_desc, classes..) allows users to create a drm.debug style (bitmap) sysfs interface, to control sets of pr_debug's according to their .class_id's This wraps existing "class" keyword and behavior: echo "module drm -p ; module drm class 0 +p ; module drm class 2 +p" >control With the macro in use, this is equivalent: echo 0x05 > /sys/module/drm/parameters/debug To use: DEFINE_DYNAMIC_DEBUG_CLASSBITS(debug, __drm_debug, "pmfl", "drm.debug - bits => categories:", /* vector of uint:4 symbols, ala enum drm_debug_category, 15 is EOL */ DRM_UT_CORE, DRM_UT_DRIVER, DRM_UT_KMS ... ); The 3rd arg is a string with any of the dyndbg.flags [pmflt_]+ Full exposure of the flags here lets the module author: - fully customize/take-over the decorations of enabled sites. generally leaving decorations to user is preferred. - aim the debug-stream: now printk, later tracefs. using both together means more work (p or T, in practice) iface doesn't care about new flags added later - declare 2 separate sysfs-knobs, one each for p, T, if desired. - decorations are per callsite, shared across sysfs-knobs for any controlled classes To support the macro, the patch adds: - int param_set_dyndbg_classbits() - int param_get_dyndbg_classbits() - struct kernel_param_ops param_ops_dyndbg_classbits Following the model of kernel/params.c STANDARD_PARAM_DEFS, these are non-static and exported. get/set use an augmented kernel_param; the arg refs a new struct dyndbg_bitmap_param containing: A- the vector of classes (drm.debug "categories") being controlled This in-line vector of constants (uint [0-14]) specifies a sequence of controlling bits (by position, starting at 0) with the values naming the class_id's mapped to that bit. A value of _DPRINTK_SITE_UNCLASSED terminates the vector processing by param_set_dyndbg_classbits(), and is appended by the macro to insure a defined termination after max 15 classes are applied. Technically, the vector is a flex-array, but its size is practically limited to max 15 in length (repeats are pointless). B- a pointer to the user module's ulong holding the bits/state. By accessing client's bit-state, we coordinate with existing code that still uses drm_debug_enabled(), so they work unchanged. The change to ulong allows use of BIT() etc. NOTES: _DPRINTK_SITE_UNCLASSED = 15, ie 2**CLS_BITS-1, deserves special mention; it already marks all existing pr-debug callsites, only the new drm.debug callsites are initialized to other (DRM_UT_*) values. _DPRINTK_SITE_UNCLASSED is used in param_set_dyndbg_classbits() to limit the range of bits that are processed to what fits in uint:4. It also terminates the class-id list passed into the macro, so dont use it halfway through your list of classes-to-control. param_set_dyndbg_classbits() compares new vs old bits, and only updates each class on changes. This maximally preserves the underlying state, which may have been customized at some point via `echo $cmd >control`. So if users really want to know that all prdbgs are set precisely, they must pre-clear then set. Identity mapping in (A) is encouraged. Your vector should exclude 15, if used, it terminates the list prematurely; any following bit mappings will be ignored (it is the default/non category). The whole (A) vector/list passed to the macro is: - not strictly needed: 0-N bits are scanned, only N is needed in the macro interface to do this, not the whole list. - 0-N list allows juggling the bit->class map Identity map is preferred. 15 terminates list if used. (macro impl does this) That said, (A) is self-documenting; the explicit list is greppable, 'DRM_UT_*' provides lots of clues. Further, it could be upgraded, something like: _pick_sym_(DRM_UT_CORE, "mumble something useful about CORE debug") _pick_sym_(a,b) a // gives us what we need here _pick_help_(a,b) #a " : " b // mod-info fodder Signed-off-by: Jim Cromie --- include/linux/dynamic_debug.h | 50 +++ lib/dynamic_debug.c | 77 +++ 2 files changed, 127 insertions(+) diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h index d4b48f3cc6e8..e83c4e36ad29 100644 --- a/include/linux/dynamic_debug.h +++ b/include/linux/dynamic_debug.h @@ -209,6 +209,10 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor, KERN_DEBUG, prefix_str, prefix_type, \ rowsize, groupsize, buf, len, ascii) +struct kernel_param; +int param_set_dyndbg_classbits(const char *instr, const struct kernel_param *kp); +int param_get_dyndbg_classbits(char *buffer, const struct kernel_param *kp); + #else /* !CONFIG_DYNAMIC_DEBUG_CORE */ #include @@ -255,6 +259,52 @@ static inline int dynamic_debug_exec_queries(const char *query, const char *modn return 0; } +struct kern
[PATCH 05/13] dyndbg: improve change-info to have old and new
move site.flag update after the v4pr_info("change") message, and improve the message to print both old and new flag values. Heres new form: dyndbg: changed net/ipv4/tcp.c:2424 [tcp]tcp_recvmsg_locked pT -> _ Signed-off-by: Jim Cromie --- lib/dynamic_debug.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c index 60b2572e64f0..ab93b370d489 100644 --- a/lib/dynamic_debug.c +++ b/lib/dynamic_debug.c @@ -158,7 +158,7 @@ static int ddebug_change(const struct ddebug_query *query, struct ddebug_table *dt; unsigned int newflags; unsigned int nfound = 0; - struct flagsbuf fbuf; + struct flagsbuf fbuf, nbuf; /* search for matching ddebugs */ mutex_lock(&ddebug_lock); @@ -223,11 +223,12 @@ static int ddebug_change(const struct ddebug_query *query, static_branch_enable(&dp->key.dd_key_true); } #endif + v4pr_info("changed %s:%d [%s]%s %s -> %s\n", + trim_prefix(dp->filename), dp->lineno, + dt->mod_name, dp->function, + ddebug_describe_flags(dp->flags, &fbuf), + ddebug_describe_flags(newflags, &nbuf)); dp->flags = newflags; - v4pr_info("changed %s:%d [%s]%s =%s\n", -trim_prefix(dp->filename), dp->lineno, -dt->mod_name, dp->function, -ddebug_describe_flags(dp->flags, &fbuf)); } } mutex_unlock(&ddebug_lock); -- 2.35.1
[PATCH 04/13] dyndbg: drop EXPORTed dynamic_debug_exec_queries
This exported fn is effectively obsoleted by Commit:HEAD~2, so remove it. The export was added here: commit a2d375eda771 ("dyndbg: refine export, rename to dynamic_debug_exec_queries()") commit 4c0d77828d4f ("dyndbg: export ddebug_exec_queries") Its intent was to allow drm.debug to use the exported function to implement its drm.debug bitmap api using dynamic_debug. Instead, HEAD~2 implements the bitmap inside dyndbg, and exposes it in a macro declarator, and HEAD~1 uses the macro to connect __drm_debug to the supporting callbacks. Since there are no other expected users, and any prospects would likely reuse the bitmap or a straightforward extension of it, we can drop this function until its really needed. This also drops the CONFIG_DYNAMIC_DEBUG=N stub-func, and its pr_warn(), which I avoided in 2012, then added in 2020 :-/ Signed-off-by: Jim Cromie --- include/linux/dynamic_debug.h | 9 - lib/dynamic_debug.c | 29 - 2 files changed, 38 deletions(-) diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h index e83c4e36ad29..664bb83778d2 100644 --- a/include/linux/dynamic_debug.h +++ b/include/linux/dynamic_debug.h @@ -60,9 +60,6 @@ struct _ddebug { #if defined(CONFIG_DYNAMIC_DEBUG_CORE) -/* exported for module authors to exercise >control */ -int dynamic_debug_exec_queries(const char *query, const char *modname); - int ddebug_add_module(struct _ddebug *tab, unsigned int n, const char *modname); extern int ddebug_remove_module(const char *mod_name); @@ -253,12 +250,6 @@ static inline int ddebug_dyndbg_module_param_cb(char *param, char *val, rowsize, groupsize, buf, len, ascii); \ } while (0) -static inline int dynamic_debug_exec_queries(const char *query, const char *modname) -{ - pr_warn("kernel not built with CONFIG_DYNAMIC_DEBUG_CORE\n"); - return 0; -} - struct kernel_param; static inline int param_set_dyndbg_classbits(const char *instr, const struct kernel_param *kp) { return 0; } diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c index 7eb1c31f870d..60b2572e64f0 100644 --- a/lib/dynamic_debug.c +++ b/lib/dynamic_debug.c @@ -582,35 +582,6 @@ static int ddebug_exec_queries(char *query, const char *modname) return nfound; } -/** - * dynamic_debug_exec_queries - select and change dynamic-debug prints - * @query: query-string described in admin-guide/dynamic-debug-howto - * @modname: string containing module name, usually &module.mod_name - * - * This uses the >/proc/dynamic_debug/control reader, allowing module - * authors to modify their dynamic-debug callsites. The modname is - * canonically struct module.mod_name, but can also be null or a - * module-wildcard, for example: "drm*". - */ -int dynamic_debug_exec_queries(const char *query, const char *modname) -{ - int rc; - char *qry; /* writable copy of query */ - - if (!query) { - pr_err("non-null query/command string expected\n"); - return -EINVAL; - } - qry = kstrndup(query, PAGE_SIZE, GFP_KERNEL); - if (!qry) - return -ENOMEM; - - rc = ddebug_exec_queries(qry, modname); - kfree(qry); - return rc; -} -EXPORT_SYMBOL_GPL(dynamic_debug_exec_queries); - #ifdef CONFIG_MODULES #define KP_MOD_NAME kp->mod->name #else -- 2.35.1
[PATCH 06/13] dyndbg: abstract dyndbg_site_is_printing
Hide flags test in a macro. no functional changes. Signed-off-by: Jim Cromie --- include/linux/dynamic_debug.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h index 664bb83778d2..106065244f73 100644 --- a/include/linux/dynamic_debug.h +++ b/include/linux/dynamic_debug.h @@ -56,7 +56,7 @@ struct _ddebug { #endif } __attribute__((aligned(8))); - +#define dyndbg_site_is_printing(desc) (desc->flags & _DPRINTK_FLAGS_PRINT) #if defined(CONFIG_DYNAMIC_DEBUG_CORE) -- 2.35.1
[PATCH 07/13] drm_print: condense enum drm_debug_category
enum drm_debug_category has 10 "classes", explicitly initialized with 0x-bitmasks which could be simplified as BIT(X)s. But lets go further: use natural enumeration (int, starting at 0), and do the BIT(cat) in drm_debug_enabled(cat) at runtime. While this slightly pessimizes the bit-test, the category now fits in 4 bits, allowing it in struct _ddebug.class_id:4. This sets us up to adapt drm to use dyndbg with JUMP_LABEL, thus avoiding all those bit-tests anyway. Signed-off-by: Jim Cromie --- include/drm/drm_print.h | 22 +++--- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h index 22fabdeed297..b3b470440e46 100644 --- a/include/drm/drm_print.h +++ b/include/drm/drm_print.h @@ -279,49 +279,49 @@ enum drm_debug_category { * @DRM_UT_CORE: Used in the generic drm code: drm_ioctl.c, drm_mm.c, * drm_memory.c, ... */ - DRM_UT_CORE = 0x01, + DRM_UT_CORE, /** * @DRM_UT_DRIVER: Used in the vendor specific part of the driver: i915, * radeon, ... macro. */ - DRM_UT_DRIVER = 0x02, + DRM_UT_DRIVER, /** * @DRM_UT_KMS: Used in the modesetting code. */ - DRM_UT_KMS = 0x04, + DRM_UT_KMS, /** * @DRM_UT_PRIME: Used in the prime code. */ - DRM_UT_PRIME= 0x08, + DRM_UT_PRIME, /** * @DRM_UT_ATOMIC: Used in the atomic code. */ - DRM_UT_ATOMIC = 0x10, + DRM_UT_ATOMIC, /** * @DRM_UT_VBL: Used for verbose debug message in the vblank code. */ - DRM_UT_VBL = 0x20, + DRM_UT_VBL, /** * @DRM_UT_STATE: Used for verbose atomic state debugging. */ - DRM_UT_STATE= 0x40, + DRM_UT_STATE, /** * @DRM_UT_LEASE: Used in the lease code. */ - DRM_UT_LEASE= 0x80, + DRM_UT_LEASE, /** * @DRM_UT_DP: Used in the DP code. */ - DRM_UT_DP = 0x100, + DRM_UT_DP, /** * @DRM_UT_DRMRES: Used in the drm managed resources code. */ - DRM_UT_DRMRES = 0x200, + DRM_UT_DRMRES }; static inline bool drm_debug_enabled(enum drm_debug_category category) { - return unlikely(__drm_debug & category); + return unlikely(__drm_debug & BIT(category)); } /* -- 2.35.1
[PATCH 08/13] drm_print: interpose drm_*dbg with forwarding macros
drm_dev_dbg() & drm_dbg() sit below the categorized layer of the DRM debug API, and implement most of it. These are good places to insert dynamic-debug jump-label mechanics, allowing DRM to avoid the runtime cost of drm_debug_enabled(). Set up for this by changing the func names by adding '__' prefixes, and define forwarding macros to the new names. no functional changes. memory cost baseline: (unchanged) bash-5.1# drms_load [9.220389] dyndbg: 1 debug prints in module drm [9.224426] ACPI: bus type drm_connector registered [9.302192] dyndbg: 2 debug prints in module ttm [9.305033] dyndbg: 8 debug prints in module video [9.627563] dyndbg: 127 debug prints in module i915 [9.721505] AMD-Vi: AMD IOMMUv2 functionality not available on this system - This is not a bug. [ 10.091345] dyndbg: 2196 debug prints in module amdgpu [ 10.106589] [drm] amdgpu kernel modesetting enabled. [ 10.107270] amdgpu: CRAT table not found [ 10.107926] amdgpu: Virtual CRAT table created for CPU [ 10.108398] amdgpu: Topology: Add CPU node [ 10.168507] dyndbg: 3 debug prints in module wmi [ 10.329587] dyndbg: 3 debug prints in module nouveau Signed-off-by: Jim Cromie --- drivers/gpu/drm/drm_print.c | 10 +- include/drm/drm_print.h | 9 +++-- 2 files changed, 12 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c index f783d4963d4b..e45ba224e57c 100644 --- a/drivers/gpu/drm/drm_print.c +++ b/drivers/gpu/drm/drm_print.c @@ -256,8 +256,8 @@ void drm_dev_printk(const struct device *dev, const char *level, } EXPORT_SYMBOL(drm_dev_printk); -void drm_dev_dbg(const struct device *dev, enum drm_debug_category category, -const char *format, ...) +void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category, + const char *format, ...) { struct va_format vaf; va_list args; @@ -278,9 +278,9 @@ void drm_dev_dbg(const struct device *dev, enum drm_debug_category category, va_end(args); } -EXPORT_SYMBOL(drm_dev_dbg); +EXPORT_SYMBOL(__drm_dev_dbg); -void __drm_dbg(enum drm_debug_category category, const char *format, ...) +void ___drm_dbg(enum drm_debug_category category, const char *format, ...) { struct va_format vaf; va_list args; @@ -297,7 +297,7 @@ void __drm_dbg(enum drm_debug_category category, const char *format, ...) va_end(args); } -EXPORT_SYMBOL(__drm_dbg); +EXPORT_SYMBOL(___drm_dbg); void __drm_err(const char *format, ...) { diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h index b3b470440e46..4bed99326631 100644 --- a/include/drm/drm_print.h +++ b/include/drm/drm_print.h @@ -334,7 +334,7 @@ __printf(3, 4) void drm_dev_printk(const struct device *dev, const char *level, const char *format, ...); __printf(3, 4) -void drm_dev_dbg(const struct device *dev, enum drm_debug_category category, +void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category, const char *format, ...); /** @@ -383,6 +383,9 @@ void drm_dev_dbg(const struct device *dev, enum drm_debug_category category, } \ }) +#define drm_dev_dbg(dev, cat, fmt, ...)\ + __drm_dev_dbg(dev, cat, fmt, ##__VA_ARGS__) + /** * DRM_DEV_DEBUG() - Debug output for generic drm code * @@ -484,10 +487,12 @@ void drm_dev_dbg(const struct device *dev, enum drm_debug_category category, */ __printf(2, 3) -void __drm_dbg(enum drm_debug_category category, const char *format, ...); +void ___drm_dbg(enum drm_debug_category category, const char *format, ...); __printf(1, 2) void __drm_err(const char *format, ...); +#define __drm_dbg(fmt, ...)___drm_dbg(fmt, ##__VA_ARGS__) + /* Macros to make printk easier */ #define _DRM_PRINTK(once, level, fmt, ...) \ -- 2.35.1
[PATCH 09/13] drm_print: wrap drm_*_dbg in dyndbg jumplabel
For CONFIG_DRM_USE_DYNAMIC_DEBUG=y, wrap drm_dbg() & drm_dev_dbg() in one of dyndbg's Factory macros: _dynamic_func_call_no_desc(). This makes the (~4000) callsites controllable, typically by class: # 0 is DRM_UT_CORE #> echo module drm class 0 +p > /proc/dynamic_debug/control =N: keeps direct forwarding: drm_*_dbg -> __drm_*_dbg() I added the CONFIG_DRM_USE_DYNAMIC_DEBUG item because of the .data footprint cost of per-callsite control; 56 bytes/site * ~2k,4k callsites (for i915, amdgpu), which is significant enough that a user might not want it. Using CONFIG_DYNAMIC_DEBUG_CORE only eliminates the builtin portion, leaving only drm modules, but still 200k of module data is a lot. Signed-off-by: Jim Cromie --- drivers/gpu/drm/Kconfig | 12 drivers/gpu/drm/Makefile | 2 ++ include/drm/drm_print.h | 12 3 files changed, 26 insertions(+) diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index b1f22e457fd0..ec14a1cd4449 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -63,6 +63,18 @@ config DRM_DEBUG_MM If in doubt, say "N". +config DRM_USE_DYNAMIC_DEBUG + bool "use dynamic debug to implement drm.debug" + default y + depends on DRM + depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE + depends on JUMP_LABEL + help + Use dynamic-debug to avoid drm_debug_enabled() runtime overheads. + Due to callsite counts in DRM drivers (~4k in amdgpu) and 56 + bytes per callsite, the .data costs can be substantial, and + are therefore configurable. + config DRM_DEBUG_SELFTEST tristate "kselftests for DRM" depends on DRM diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 301a44dc18e3..24e6410d6c0e 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -3,6 +3,8 @@ # Makefile for the drm device driver. This driver provides support for the # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher. +CFLAGS-$(CONFIG_DRM_USE_DYNAMIC_DEBUG) += -DDYNAMIC_DEBUG_MODULE + drm-y := drm_aperture.o drm_auth.o drm_cache.o \ drm_file.o drm_gem.o drm_ioctl.o \ drm_drv.o \ diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h index 4bed99326631..06f0ee06be1f 100644 --- a/include/drm/drm_print.h +++ b/include/drm/drm_print.h @@ -383,8 +383,14 @@ void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category, } \ }) +#if !defined(CONFIG_DRM_USE_DYNAMIC_DEBUG) #define drm_dev_dbg(dev, cat, fmt, ...)\ __drm_dev_dbg(dev, cat, fmt, ##__VA_ARGS__) +#else +#define drm_dev_dbg(dev, cat, fmt, ...)\ + _dynamic_func_call_no_desc(fmt, __drm_dev_dbg, \ + dev, cat, fmt, ##__VA_ARGS__) +#endif /** * DRM_DEV_DEBUG() - Debug output for generic drm code @@ -491,7 +497,13 @@ void ___drm_dbg(enum drm_debug_category category, const char *format, ...); __printf(1, 2) void __drm_err(const char *format, ...); +#if !defined(CONFIG_DRM_USE_DYNAMIC_DEBUG) #define __drm_dbg(fmt, ...)___drm_dbg(fmt, ##__VA_ARGS__) +#else +#define __drm_dbg(cat, fmt, ...) \ + _dynamic_func_call_no_desc(fmt, ___drm_dbg, \ + cat, fmt, ##__VA_ARGS__) +#endif /* Macros to make printk easier */ -- 2.35.1
[PATCH 10/13] drm_print: refine drm_debug_enabled for dyndbg+jump-label
In order to use dynamic-debug's jump-label optimization in drm-debug, its clarifying to refine drm_debug_enabled into 3 uses: 1. drm_debug_enabled - legacy, public 2. __drm_debug_enabled - optimized for dyndbg jump-label enablement. 3. _drm_debug_enabled - pr_debug instrumented, observable 1. The legacy version always checks the bits. 2. is privileged, for use by __drm_dbg(), __drm_dev_dbg(), which do an early return unless the category is enabled (free of call/NOOP side effects). For dyndbg builds, debug callsites are selectively "pre-enabled", so __drm_debug_enabled() short-circuits to true there. Remaining callers of 1 may be able to use 2, case by case. 3. is 1st wrapped in a macro, with a pr_debug, which reports each usage in /proc/dynamic_debug/control, making it observable in the logs. The macro lets the pr_debug see the real caller, not an inline function. When plugged into 1, it identified ~10 remaining callers of the function, leading to the follow-on cleanup patch, and would allow activating the pr_debugs, estimating the callrate, and the potential savings by using the wrapper macro. It is unused ATM, but it fills out the picture. Signed-off-by: Jim Cromie --- drivers/gpu/drm/drm_print.c | 4 ++-- include/drm/drm_print.h | 28 2 files changed, 30 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c index e45ba224e57c..92e6e18026da 100644 --- a/drivers/gpu/drm/drm_print.c +++ b/drivers/gpu/drm/drm_print.c @@ -262,7 +262,7 @@ void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category, struct va_format vaf; va_list args; - if (!drm_debug_enabled(category)) + if (!__drm_debug_enabled(category)) return; va_start(args, format); @@ -285,7 +285,7 @@ void ___drm_dbg(enum drm_debug_category category, const char *format, ...) struct va_format vaf; va_list args; - if (!drm_debug_enabled(category)) + if (!__drm_debug_enabled(category)) return; va_start(args, format); diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h index 06f0ee06be1f..38ef044d786e 100644 --- a/include/drm/drm_print.h +++ b/include/drm/drm_print.h @@ -319,11 +319,39 @@ enum drm_debug_category { DRM_UT_DRMRES }; +/* + * 3 name flavors of drm_debug_enabled: + * drm_debug_enabled - public/legacy, always checks bits + * _drm_debug_enabled - instrumented to observe call-rates, est overheads. + * __drm_debug_enabled - privileged - knows jump-label state, can short-circuit + */ static inline bool drm_debug_enabled(enum drm_debug_category category) { return unlikely(__drm_debug & BIT(category)); } +/* + * Wrap fn in macro, so that the pr_debug sees the actual caller, not + * the inline fn. Using this name creates a callsite entry / control + * point in /proc/dynamic_debug/control. + */ +#define _drm_debug_enabled(category) \ + ({ \ + pr_debug("todo: maybe avoid via dyndbg\n"); \ + drm_debug_enabled(category);\ + }) + +#if defined(CONFIG_DRM_USE_DYNAMIC_DEBUG) +/* + * dyndbg is wrapping the drm.debug API, so as to avoid the runtime + * bit-test overheads of drm_debug_enabled() in those api calls. + * In this case, executed callsites are known enabled, so true. + */ +#define __drm_debug_enabled(category) true +#else +#define __drm_debug_enabled(category) drm_debug_enabled(category) +#endif + /* * struct device based logging * -- 2.35.1
[PATCH 11/13] drm_print: prefer bare printk KERN_DEBUG on generic fn
drm_print.c calls pr_debug() just once, from __drm_printfn_debug(), which is a generic/service fn. The callsite is compile-time enabled by DEBUG in both DYNAMIC_DEBUG=y/n builds. For dyndbg builds, reverting this callsite back to bare printk is correcting a few anti-features: 1- callsite is generic, serves multiple drm users. its hardwired on currently could accidentally: #> echo -p > /proc/dynamic_debug/control 2- optional "decorations" by dyndbg are unhelpful/misleading they describe only the generic site, not end users IOW, 1,2 are unhelpful at best, and possibly confusing. reverting yields a nominal data and text shrink: textdata bss dec hex filename 462583 36604 54592 553779 87333 /lib/modules/5.16.0-rc4-lm1-8-ged3eac8ceeea/kernel/drivers/gpu/drm/drm.ko 462515 36532 54592 553639 872a7 /lib/modules/5.16.0-rc4-lm1-9-g6ce0b88d2539-dirty/kernel/drivers/gpu/drm/drm.ko NB: this was noticed using _drm_debug_enabled(), added earlier. Signed-off-by: Jim Cromie --- drivers/gpu/drm/drm_print.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c index 92e6e18026da..24c57b92dc69 100644 --- a/drivers/gpu/drm/drm_print.c +++ b/drivers/gpu/drm/drm_print.c @@ -23,8 +23,6 @@ * Rob Clark */ -#define DEBUG /* for pr_debug() */ - #include #include @@ -162,7 +160,8 @@ EXPORT_SYMBOL(__drm_printfn_info); void __drm_printfn_debug(struct drm_printer *p, struct va_format *vaf) { - pr_debug("%s %pV", p->prefix, vaf); + /* pr_debug callsite decorations are unhelpful here */ + printk(KERN_DEBUG "%s %pV", p->prefix, vaf); } EXPORT_SYMBOL(__drm_printfn_debug); -- 2.35.1
[PATCH 13/13] drm_print: use DEFINE_DYNAMIC_DEBUG_CLASSBITS for drm.debug
if CONFIG_DRM_USE_DYNAMIC_DEBUG=y, use new macro to create the sysfs bitmap to control drm.debug callsites. DEFINE_DYNAMIC_DEBUG_CLASSBITS( debug, __drm_debug, "p", "drm.debug - control summary", /* inline vector of _ddebug.class_id's to be controlled, max 14 vals */ DRM_UT_CORE, DRM_UT_DRIVER, DRM_UT_KMS, DRM_UT_PRIME, DRM_UT_ATOMIC, DRM_UT_VBL, DRM_UT_STATE, DRM_UT_LEASE, DRM_UT_DP, DRM_UT_DRMRES ); NOTES: The @_flgs used here is "p", so this bitmap enables input to syslog only, matching legacy behavior. Also, no "fmlt" decorator flags are used here; that is discouraged, as it then toggles those flags along with the "p". This would overwrite any customizations a user added since the sysfs-knob was last used. Still, there may be cases/reasons. _ddebug.class_id is uint:4, values 0-14 are valid. 15 is reserved for non-classified callsites (regular pr_debugs). Using it terminates the scan, don't use it halfway through your list. Signed-off-by: Jim Cromie --- drivers/gpu/drm/drm_print.c | 20 ++-- include/drm/drm_print.h | 4 ++-- 2 files changed, 20 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c index c9b2a2ab0d3d..d916daa384e5 100644 --- a/drivers/gpu/drm/drm_print.c +++ b/drivers/gpu/drm/drm_print.c @@ -38,7 +38,7 @@ * __drm_debug: Enable debug output. * Bitmask of DRM_UT_x. See include/drm/drm_print.h for details. */ -unsigned int __drm_debug; +unsigned long __drm_debug; EXPORT_SYMBOL(__drm_debug); MODULE_PARM_DESC(debug, "Enable debug output, where each bit enables a debug category.\n" @@ -50,7 +50,23 @@ MODULE_PARM_DESC(debug, "Enable debug output, where each bit enables a debug cat "\t\tBit 5 (0x20) will enable VBL messages (vblank code)\n" "\t\tBit 7 (0x80) will enable LEASE messages (leasing code)\n" "\t\tBit 8 (0x100) will enable DP messages (displayport code)"); -module_param_named(debug, __drm_debug, int, 0600); + +#if !defined(CONFIG_DRM_USE_DYNAMIC_DEBUG) +module_param_named(debug, __drm_debug, ulong, 0600); +#else +DEFINE_DYNAMIC_DEBUG_CLASSBITS(debug, __drm_debug, "p", + "enable drm.debug categories - 1 bit per category", + DRM_UT_CORE, + DRM_UT_DRIVER, + DRM_UT_KMS, + DRM_UT_PRIME, + DRM_UT_ATOMIC, + DRM_UT_VBL, + DRM_UT_STATE, + DRM_UT_LEASE, + DRM_UT_DP, + DRM_UT_DRMRES); +#endif void __drm_puts_coredump(struct drm_printer *p, const char *str) { diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h index 13d52b60f388..419140bf992d 100644 --- a/include/drm/drm_print.h +++ b/include/drm/drm_print.h @@ -36,7 +36,7 @@ #include /* Do *not* use outside of drm_print.[ch]! */ -extern unsigned int __drm_debug; +extern unsigned long __drm_debug; /** * DOC: print @@ -527,7 +527,7 @@ __printf(1, 2) void __drm_err(const char *format, ...); #if !defined(CONFIG_DRM_USE_DYNAMIC_DEBUG) -#define __drm_dbg(fmt, ...)___drm_dbg(NULL, fmt, ##__VA_ARGS__) +#define __drm_dbg(cat, fmt, ...) ___drm_dbg(NULL, cat, fmt, ##__VA_ARGS__) #else #define __drm_dbg(cat, fmt, ...) \ _dynamic_func_call_cls(cat, fmt, ___drm_dbg,\ -- 2.35.1
[PATCH 12/13] drm_print: add _ddebug desc to drm_*dbg prototypes
Add a struct _ddebug ptr to drm_dbg() and drm_dev_dbg() protos. And upgrade the current use of _dynamic_func_call_no_desc(); ie drop the '_no_desc', since the factory macro's callees (these 2 functions) are now expecting the arg. This lets those functions act more like pr_debug(). It also means that these functions don't just get the decorations from an underlying implementation. DRM already has standards for logging/messaging; tossing optional decorations on top may not help. use that info provide it to dyndbg [1], which can then control debug enablement and decoration for all those drm.debug callsites. For CONFIG_DRM_USE_DYNAMIC_DEBUG=N, just pass null. NB: desc->class_id is redundant with category, but !!desc dependent. Signed-off-by: Jim Cromie --- drivers/gpu/drm/drm_print.c | 23 +-- include/drm/drm_print.h | 23 --- 2 files changed, 25 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c index 24c57b92dc69..c9b2a2ab0d3d 100644 --- a/drivers/gpu/drm/drm_print.c +++ b/drivers/gpu/drm/drm_print.c @@ -255,8 +255,8 @@ void drm_dev_printk(const struct device *dev, const char *level, } EXPORT_SYMBOL(drm_dev_printk); -void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category, - const char *format, ...) +void __drm_dev_dbg(struct _ddebug *desc, const struct device *dev, + enum drm_debug_category category, const char *format, ...) { struct va_format vaf; va_list args; @@ -264,22 +264,25 @@ void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category, if (!__drm_debug_enabled(category)) return; + /* we know we are printing for either syslog, tracefs, or both */ va_start(args, format); vaf.fmt = format; vaf.va = &args; - if (dev) - dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV", - __builtin_return_address(0), &vaf); - else - printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV", - __builtin_return_address(0), &vaf); - + if (dev) { + if (dyndbg_site_is_printing(desc)) + dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV", + __builtin_return_address(0), &vaf); + } else { + if (dyndbg_site_is_printing(desc)) + printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV", + __builtin_return_address(0), &vaf); + } va_end(args); } EXPORT_SYMBOL(__drm_dev_dbg); -void ___drm_dbg(enum drm_debug_category category, const char *format, ...) +void ___drm_dbg(struct _ddebug *desc, enum drm_debug_category category, const char *format, ...) { struct va_format vaf; va_list args; diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h index 38ef044d786e..13d52b60f388 100644 --- a/include/drm/drm_print.h +++ b/include/drm/drm_print.h @@ -31,6 +31,7 @@ #include #include #include +#include #include @@ -361,9 +362,9 @@ static inline bool drm_debug_enabled(enum drm_debug_category category) __printf(3, 4) void drm_dev_printk(const struct device *dev, const char *level, const char *format, ...); -__printf(3, 4) -void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category, -const char *format, ...); +__printf(4, 5) +void __drm_dev_dbg(struct _ddebug *desc, const struct device *dev, + enum drm_debug_category category, const char *format, ...); /** * DRM_DEV_ERROR() - Error output. @@ -413,11 +414,11 @@ void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category, #if !defined(CONFIG_DRM_USE_DYNAMIC_DEBUG) #define drm_dev_dbg(dev, cat, fmt, ...)\ - __drm_dev_dbg(dev, cat, fmt, ##__VA_ARGS__) + __drm_dev_dbg(NULL, dev, cat, fmt, ##__VA_ARGS__) #else #define drm_dev_dbg(dev, cat, fmt, ...)\ - _dynamic_func_call_no_desc(fmt, __drm_dev_dbg, \ - dev, cat, fmt, ##__VA_ARGS__) + _dynamic_func_call_cls(cat, fmt, __drm_dev_dbg, \ + dev, cat, fmt, ##__VA_ARGS__) #endif /** @@ -520,17 +521,17 @@ void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category, * Prefer drm_device based logging over device or prink based logging. */ -__printf(2, 3) -void ___drm_dbg(enum drm_debug_category category, const char *format, ...); +__printf(3, 4) +void ___drm_dbg(struct _ddebug *desc, enum drm_debug_category category, const char *format, ...); __printf(1, 2) void __drm_err(const char *format, ...); #if !defined(CONFIG_DRM_USE_DYNAMIC_DEBUG) -#define __drm_dbg(fmt, ...)___drm_dbg(fmt, ##__VA_ARGS__) +#define __drm_dbg(fmt, ...)
Re: [PATCH] mm: split vm_normal_pages for LRU and non-LRU handling
On 01.03.22 17:30, Felix Kuehling wrote: > Am 2022-03-01 um 11:22 schrieb David Hildenbrand: > if (PageReserved(page)) > diff --git a/mm/migrate.c b/mm/migrate.c > index c31d04b46a5e..17d049311b78 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -1614,7 +1614,7 @@ static int add_page_for_migration(struct mm_struct > *mm, unsigned long addr, > goto out; > > /* FOLL_DUMP to ignore special (like zero) pages */ > - follflags = FOLL_GET | FOLL_DUMP; > + follflags = FOLL_GET | FOLL_DUMP | FOLL_LRU; > page = follow_page(vma, addr, follflags); Why wouldn't we want to dump DEVICE_COHERENT pages? This looks wrong. >>> This function later calls isolate_lru_page, which is something you can't >>> do with a device page. >>> >> Then, that code might require care instead. We most certainly don't want >> to have random memory holes in a dump just because some anonymous memory >> was migrated to DEVICE_COHERENT. > I don't think this code is for core dumps. The call chain I see is > > SYSCALL_DEFINE6(move_pages, ...) -> kernel_move_pages -> do_pages_move > -> add_page_for_migration > Ah, sorry, I got mislead by FOLL_DUMP and thought we'd be in get_dump_page() . As you said, nothing to do. -- Thanks, David / dhildenb
Re: [PATCH] mm: split vm_normal_pages for LRU and non-LRU handling
>> >>> if (PageReserved(page)) >>> diff --git a/mm/migrate.c b/mm/migrate.c >>> index c31d04b46a5e..17d049311b78 100644 >>> --- a/mm/migrate.c >>> +++ b/mm/migrate.c >>> @@ -1614,7 +1614,7 @@ static int add_page_for_migration(struct mm_struct >>> *mm, unsigned long addr, >>> goto out; >>> >>> /* FOLL_DUMP to ignore special (like zero) pages */ >>> - follflags = FOLL_GET | FOLL_DUMP; >>> + follflags = FOLL_GET | FOLL_DUMP | FOLL_LRU; >>> page = follow_page(vma, addr, follflags); >> Why wouldn't we want to dump DEVICE_COHERENT pages? This looks wrong. > > This function later calls isolate_lru_page, which is something you can't > do with a device page. > Then, that code might require care instead. We most certainly don't want to have random memory holes in a dump just because some anonymous memory was migrated to DEVICE_COHERENT. -- Thanks, David / dhildenb
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
> On 1. Mar 2022, at 18:36, Greg KH wrote: > > On Tue, Mar 01, 2022 at 12:28:15PM +0100, Jakob Koschel wrote: >> >> >>> On 1. Mar 2022, at 01:41, Linus Torvalds >>> wrote: >>> >>> On Mon, Feb 28, 2022 at 1:47 PM Jakob Koschel >>> wrote: The goal of this is to get compiler warnings right? This would indeed be great. >>> >>> Yes, so I don't mind having a one-time patch that has been gathered >>> using some automated checker tool, but I don't think that works from a >>> long-term maintenance perspective. >>> >>> So if we have the basic rule being "don't use the loop iterator after >>> the loop has finished, because it can cause all kinds of subtle >>> issues", then in _addition_ to fixing the existing code paths that >>> have this issue, I really would want to (a) get a compiler warning for >>> future cases and (b) make it not actually _work_ for future cases. >>> >>> Because otherwise it will just happen again. >>> Changing the list_for_each_entry() macro first will break all of those cases (e.g. the ones using 'list_entry_is_head()). >>> >>> So I have no problems with breaking cases that we basically already >>> have a patch for due to your automated tool. There were certainly >>> more than a handful, but it didn't look _too_ bad to just make the >>> rule be "don't use the iterator after the loop". >>> >>> Of course, that's just based on that patch of yours. Maybe there are a >>> ton of other cases that your patch didn't change, because they didn't >>> match your trigger case, so I may just be overly optimistic here. >> >> Based on the coccinelle script there are ~480 cases that need fixing >> in total. I'll now finish all of them and then split them by >> submodules as Greg suggested and repost a patch set per submodule. >> Sounds good? > > Sounds good to me! > > If you need help carving these up and maintaining them over time as > different subsystem maintainers accept/ignore them, just let me know. > Doing large patchsets like this can be tough without a lot of > experience. Very much appreciated! There will probably be some cases that do not match one of the pattern we already discussed and need separate attention. I was planning to start with one subsystem and adjust the coming ones according to the feedback gather there instead of posting all of them in one go. > > thanks, > > greg k-h - Jakob
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Mon, Feb 28, 2022 at 01:06:57PM +0100, Jakob Koschel wrote: > > > > On 28. Feb 2022, at 12:20, Greg KH wrote: > > > > On Mon, Feb 28, 2022 at 12:08:18PM +0100, Jakob Koschel wrote: > >> If the list does not contain the expected element, the value of > >> list_for_each_entry() iterator will not point to a valid structure. > >> To avoid type confusion in such case, the list iterator > >> scope will be limited to list_for_each_entry() loop. > >> > >> In preparation to limiting scope of a list iterator to the list traversal > >> loop, use a dedicated pointer to point to the found element. > >> Determining if an element was found is then simply checking if > >> the pointer is != NULL. > >> > >> Signed-off-by: Jakob Koschel > >> --- > >> arch/x86/kernel/cpu/sgx/encl.c | 6 +++-- > >> drivers/scsi/scsi_transport_sas.c| 17 - > >> drivers/thermal/thermal_core.c | 38 ++-- > >> drivers/usb/gadget/configfs.c| 22 ++-- > >> drivers/usb/gadget/udc/max3420_udc.c | 11 +--- > >> drivers/usb/gadget/udc/tegra-xudc.c | 11 +--- > >> drivers/usb/mtu3/mtu3_gadget.c | 11 +--- > >> drivers/usb/musb/musb_gadget.c | 11 +--- > >> drivers/vfio/mdev/mdev_core.c| 11 +--- > >> 9 files changed, 88 insertions(+), 50 deletions(-) > > > > The drivers/usb/ portion of this patch should be in patch 1/X, right? > > I kept them separate since it's a slightly different case. > Using 'ptr' instead of '&ptr->other_member'. Regardless, I'll split > this commit up as you mentioned. > > > > > Also, you will have to split these up per-subsystem so that the > > different subsystem maintainers can take these in their trees. > > Thanks for the feedback. > To clarify I understand you correctly: > I should do one patch set per-subsystem with every instance/(file?) > fixed as a separate commit? Yes, each file needs a different commit as they are usually all written or maintained by different people. As I said in my other response, if you need any help with this, just let me know, I'm glad to help. thanks, greg k-h
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Tue, Mar 01, 2022 at 12:28:15PM +0100, Jakob Koschel wrote: > > > > On 1. Mar 2022, at 01:41, Linus Torvalds > > wrote: > > > > On Mon, Feb 28, 2022 at 1:47 PM Jakob Koschel > > wrote: > >> > >> The goal of this is to get compiler warnings right? This would indeed be > >> great. > > > > Yes, so I don't mind having a one-time patch that has been gathered > > using some automated checker tool, but I don't think that works from a > > long-term maintenance perspective. > > > > So if we have the basic rule being "don't use the loop iterator after > > the loop has finished, because it can cause all kinds of subtle > > issues", then in _addition_ to fixing the existing code paths that > > have this issue, I really would want to (a) get a compiler warning for > > future cases and (b) make it not actually _work_ for future cases. > > > > Because otherwise it will just happen again. > > > >> Changing the list_for_each_entry() macro first will break all of those > >> cases > >> (e.g. the ones using 'list_entry_is_head()). > > > > So I have no problems with breaking cases that we basically already > > have a patch for due to your automated tool. There were certainly > > more than a handful, but it didn't look _too_ bad to just make the > > rule be "don't use the iterator after the loop". > > > > Of course, that's just based on that patch of yours. Maybe there are a > > ton of other cases that your patch didn't change, because they didn't > > match your trigger case, so I may just be overly optimistic here. > > Based on the coccinelle script there are ~480 cases that need fixing > in total. I'll now finish all of them and then split them by > submodules as Greg suggested and repost a patch set per submodule. > Sounds good? Sounds good to me! If you need help carving these up and maintaining them over time as different subsystem maintainers accept/ignore them, just let me know. Doing large patchsets like this can be tough without a lot of experience. thanks, greg k-h
[PATCH 1/2] drm/amdgpu: Fix sigsev when accessing MMIO on hot unplug.
Protect with drm_dev_enter/exit Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index f522b52725e4..4294f17cedcb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -23,6 +23,7 @@ */ #include +#include #include "amdgpu.h" #include "amdgpu_sched.h" #include "amdgpu_ras.h" @@ -339,7 +340,7 @@ static void amdgpu_ctx_fini(struct kref *ref) { struct amdgpu_ctx *ctx = container_of(ref, struct amdgpu_ctx, refcount); struct amdgpu_device *adev = ctx->adev; - unsigned i, j; + unsigned i, j, idx; if (!adev) return; @@ -350,7 +351,12 @@ static void amdgpu_ctx_fini(struct kref *ref) ctx->entities[i][j] = NULL; } } - amdgpu_ctx_set_stable_pstate(ctx, AMDGPU_CTX_STABLE_PSTATE_NONE); + + if (drm_dev_enter(&adev->ddev, &idx)) { + amdgpu_ctx_set_stable_pstate(ctx, AMDGPU_CTX_STABLE_PSTATE_NONE); + drm_dev_exit(idx); + } + kfree(ctx); } -- 2.25.1
[PATCH 2/2] drm/amdgpu: Bump minor version for hot plug tests enabliing.
This will allow to enable the tests only after latest fix after which the tests passed on my system. I tested on NV21 standalone and Vega 10 and Polaris as pair with DRI_PRIME. It's possible there might be still issues on ASICs i don't have at my posession but that that the point of enbling the tests finally - if other people during testing will encounter errors they will report and I will be able to fix. The releated merge request for enabling libdrm tests suite is in https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/227 Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 415ecf8b2e05..be4adda8d674 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -100,9 +100,10 @@ * - 3.43.0 - Add device hot plug/unplug support * - 3.44.0 - DCN3 supports DCC independent block settings: !64B && 128B, 64B && 128B * - 3.45.0 - Add context ioctl stable pstate interface + * * 3.46.0 - Add context ioctl stable pstate interface */ #define KMS_DRIVER_MAJOR 3 -#define KMS_DRIVER_MINOR 45 +#define KMS_DRIVER_MINOR 46 #define KMS_DRIVER_PATCHLEVEL 0 int amdgpu_vram_limit; -- 2.25.1
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Tue, Mar 01, 2022 at 12:28:15PM +0100, Jakob Koschel wrote: > Based on the coccinelle script there are ~480 cases that need fixing > in total. I'll now finish all of them and then split them by > submodules as Greg suggested and repost a patch set per submodule. > Sounds good? To help with this splitting, see: https://github.com/kees/kernel-tools/blob/trunk/split-on-maintainer It's not perfect, but it'll get you really close. For example, if you had a single big tree-wide patch applied to your tree: $ rm 0*.patch $ git format-patch -1 HEAD $ mv 0*.patch treewide.patch $ split-on-maintainer treewide.patch $ ls 0*.patch If you have a build log before the patch that spits out warnings, the --build-log argument can extract those warnings on a per-file basis, too (though this can be fragile). -- Kees Cook
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Mon, Feb 28, 2022 at 04:45:11PM -0800, Linus Torvalds wrote: > Really. The "-Wshadow doesn't work on the kernel" is not some new > issue, because you have to do completely insane things to the source > code to enable it. The first big glitch with -Wshadow was with shadowed global variables. GCC 4.8 fixed that, but it still yells about shadowed functions. What _almost_ works is -Wshadow=local. At first glace, all the warnings look solvable, but then one will eventually discover __wait_event() and associated macros that mix when and how deeply it intentionally shadows variables. :) Another way to try to catch misused shadow variables is -Wunused-but-set-varible, but it, too, has tons of false positives. I tried to capture some of the rationale and research here: https://github.com/KSPP/linux/issues/152 -- Kees Cook
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Tue, Mar 01, 2022 at 06:40:04PM +0100, Jakob Koschel wrote: > > > > On 1. Mar 2022, at 18:36, Greg KH wrote: > > > > On Tue, Mar 01, 2022 at 12:28:15PM +0100, Jakob Koschel wrote: > >> > >> > >>> On 1. Mar 2022, at 01:41, Linus Torvalds > >>> wrote: > >>> > >>> On Mon, Feb 28, 2022 at 1:47 PM Jakob Koschel > >>> wrote: > > The goal of this is to get compiler warnings right? This would indeed be > great. > >>> > >>> Yes, so I don't mind having a one-time patch that has been gathered > >>> using some automated checker tool, but I don't think that works from a > >>> long-term maintenance perspective. > >>> > >>> So if we have the basic rule being "don't use the loop iterator after > >>> the loop has finished, because it can cause all kinds of subtle > >>> issues", then in _addition_ to fixing the existing code paths that > >>> have this issue, I really would want to (a) get a compiler warning for > >>> future cases and (b) make it not actually _work_ for future cases. > >>> > >>> Because otherwise it will just happen again. > >>> > Changing the list_for_each_entry() macro first will break all of those > cases > (e.g. the ones using 'list_entry_is_head()). > >>> > >>> So I have no problems with breaking cases that we basically already > >>> have a patch for due to your automated tool. There were certainly > >>> more than a handful, but it didn't look _too_ bad to just make the > >>> rule be "don't use the iterator after the loop". > >>> > >>> Of course, that's just based on that patch of yours. Maybe there are a > >>> ton of other cases that your patch didn't change, because they didn't > >>> match your trigger case, so I may just be overly optimistic here. > >> > >> Based on the coccinelle script there are ~480 cases that need fixing > >> in total. I'll now finish all of them and then split them by > >> submodules as Greg suggested and repost a patch set per submodule. > >> Sounds good? > > > > Sounds good to me! > > > > If you need help carving these up and maintaining them over time as > > different subsystem maintainers accept/ignore them, just let me know. > > Doing large patchsets like this can be tough without a lot of > > experience. > > Very much appreciated! > > There will probably be some cases that do not match one of the pattern > we already discussed and need separate attention. > > I was planning to start with one subsystem and adjust the coming ones > according to the feedback gather there instead of posting all of them > in one go. That seems wise. Feel free to use USB as a testing ground for this if you want to :) thanks, greg k-h
Re: [PATCH 2/2] drm/amdgpu: Bump minor version for hot plug tests enabliing.
On Tue, Mar 1, 2022 at 1:08 PM Andrey Grodzovsky wrote: > > This will allow to enable the tests only after latest fix > after which the tests passed on my system. > > I tested on NV21 standalone and Vega 10 and Polaris as > pair with DRI_PRIME. > > It's possible there might be still issues on ASICs i don't > have at my posession but that that the point of enbling > the tests finally - if other people during testing will > encounter errors they will report and I will be able to fix. > > The releated merge request for enabling libdrm tests suite is in > https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/227 > > Signed-off-by: Andrey Grodzovsky Typo in the title: enabliing -> enabling > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > index 415ecf8b2e05..be4adda8d674 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > @@ -100,9 +100,10 @@ > * - 3.43.0 - Add device hot plug/unplug support > * - 3.44.0 - DCN3 supports DCC independent block settings: !64B && 128B, > 64B && 128B > * - 3.45.0 - Add context ioctl stable pstate interface > + * * 3.46.0 - Add context ioctl stable pstate interface Please update the comment here to say something like bump for hotplug handing. With the typo and this fixed, series is: Reviewed-by: Alex Deucher > */ > #define KMS_DRIVER_MAJOR 3 > -#define KMS_DRIVER_MINOR 45 > +#define KMS_DRIVER_MINOR 46 > #define KMS_DRIVER_PATCHLEVEL 0 > > int amdgpu_vram_limit; > -- > 2.25.1 >
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Mon, Feb 28, 2022 at 2:29 PM James Bottomley wrote: > > However, if the desire is really to poison the loop variable then we > can do > > #define list_for_each_entry(pos, head, member) \ > for (pos = list_first_entry(head, typeof(*pos), member);\ > !list_entry_is_head(pos, head, member) && ((pos = NULL) == NULL; > \ > pos = list_next_entry(pos, member)) > > Which would at least set pos to NULL when the loop completes. That would actually have been excellent if we had done that originally. It would not only avoid the stale and incorrectly typed head entry left-over turd, it would also have made it very easy to test for "did I find an entry in the loop". But I don't much like it in the situation we are now. Why? Mainly because it basically changes the semantics of the loop _without_ any warnings about it. And we don't actually get the advantage of the nicer semantics, because we can't actually make code do list_for_each_entry(entry, ) { .. } if (!entry) return -ESRCH; .. use the entry we found .. because that would be a disaster for back-porting, plus it would be a flag-day issue (ie we'd have to change the semantics of the loop at the same time we change every single user). So instead of that simple "if (!entry)", we'd effectively have to continue to use something that still works with the old world order (ie that "if (list_entry_is_head())" model). So we couldn't really take _advantage_ of the nicer semantics, and we'd not even get a warning if somebody does it wrong - the code would just silently do the wrong thing. IOW: I don't think you are wrong about that patch: it would solve the problem that Jakob wants to solve, and it would have absolutely been much better if we had done this from the beginning. But I think that in our current situation, it's actually a really fragile solution to the "don't do that then" problem we have. Linus
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Tue, Mar 01, 2022 at 10:14:07AM -0800, Kees Cook wrote: > On Mon, Feb 28, 2022 at 04:45:11PM -0800, Linus Torvalds wrote: > > Really. The "-Wshadow doesn't work on the kernel" is not some new > > issue, because you have to do completely insane things to the source > > code to enable it. > > The first big glitch with -Wshadow was with shadowed global variables. > GCC 4.8 fixed that, but it still yells about shadowed functions. What > _almost_ works is -Wshadow=local. At first glace, all the warnings > look solvable, but then one will eventually discover __wait_event() > and associated macros that mix when and how deeply it intentionally > shadows variables. :) Well, that's just disgusting. Macros fundamentally shouldn't be referring to things that aren't in their arguments. The first step to cleaning this up is ... I'll take a look at the rest of cleaning this up soon. >From 28ffe35d56223d4242b915832299e5acc926737e Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Tue, 1 Mar 2022 13:47:07 -0500 Subject: [PATCH] wait: Parameterize the return variable to ___wait_event() Macros should not refer to variables which aren't in their arguments. Pass the name from its callers. Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/swait.h| 12 ++-- include/linux/wait.h | 32 include/linux/wait_bit.h | 4 ++-- 3 files changed, 24 insertions(+), 24 deletions(-) diff --git a/include/linux/swait.h b/include/linux/swait.h index 6a8c22b8c2a5..5e8e9b13be2d 100644 --- a/include/linux/swait.h +++ b/include/linux/swait.h @@ -191,14 +191,14 @@ do { \ } while (0) #define __swait_event_timeout(wq, condition, timeout) \ - ___swait_event(wq, ___wait_cond_timeout(condition), \ + ___swait_event(wq, ___wait_cond_timeout(condition, __ret), \ TASK_UNINTERRUPTIBLE, timeout,\ __ret = schedule_timeout(__ret)) #define swait_event_timeout_exclusive(wq, condition, timeout) \ ({ \ long __ret = timeout; \ - if (!___wait_cond_timeout(condition)) \ + if (!___wait_cond_timeout(condition, __ret))\ __ret = __swait_event_timeout(wq, condition, timeout); \ __ret; \ }) @@ -216,14 +216,14 @@ do { \ }) #define __swait_event_interruptible_timeout(wq, condition, timeout)\ - ___swait_event(wq, ___wait_cond_timeout(condition), \ + ___swait_event(wq, ___wait_cond_timeout(condition, __ret), \ TASK_INTERRUPTIBLE, timeout, \ __ret = schedule_timeout(__ret)) #define swait_event_interruptible_timeout_exclusive(wq, condition, timeout)\ ({ \ long __ret = timeout; \ - if (!___wait_cond_timeout(condition)) \ + if (!___wait_cond_timeout(condition, __ret))\ __ret = __swait_event_interruptible_timeout(wq, \ condition, timeout);\ __ret; \ @@ -252,7 +252,7 @@ do { \ } while (0) #define __swait_event_idle_timeout(wq, condition, timeout) \ - ___swait_event(wq, ___wait_cond_timeout(condition), \ + ___swait_event(wq, ___wait_cond_timeout(condition, __ret), \ TASK_IDLE, timeout, \ __ret = schedule_timeout(__ret)) @@ -278,7 +278,7 @@ do { \ #define swait_event_idle_timeout_exclusive(wq, condition, timeout) \ ({ \ long __ret = timeout; \ - if (!___wait_cond_timeout(condition)) \ + if (!___wait_cond_timeout(condition, __ret))\ __ret = __swait_event_idle_timeout(wq, \ condition, timeout); \ __ret; \ diff --git a/include/linux/wait.h b/include/linux/wait.h index 851e07da2583..890cce3c0f2e 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -271,7 +271,7 @@ static inline void
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Tue, Mar 1, 2022 at 10:14 AM Kees Cook wrote: > > The first big glitch with -Wshadow was with shadowed global variables. > GCC 4.8 fixed that, but it still yells about shadowed functions. What > _almost_ works is -Wshadow=local. Heh. Yeah, I just have long memories of "-Wshadow was a disaster". You looked into the details. > Another way to try to catch misused shadow variables is > -Wunused-but-set-varible, but it, too, has tons of false positives. That on the face of it should be an easy warning to get technically right for a compiler. So I assume the "false positives" are simply because we end up having various variables that really don't end up being used - and "intentionally" so). Or rather, they might only be used under some config option - perhaps the use is even syntactically there and parsed, but the compiler notices that it's turned off under some if (IS_ENABLED(..)) option? Because yeah, we have a lot of those. I think that's a common theme with a lot of compiler warnings: on the face of it they sound "obviously sane" and nobody should ever write code like that. A conditional that is always true? Sounds idiotic, and sounds like a reasonable thing for a compiler to warn about, since why would you have a conditional in the first place for that? But then you realize that maybe the conditional is a build config option, and "always true" suddenly makes sense. Or it's a test for something that is always true on _that_architecture_ but not in some general sense (ie testing "sizeof()"). Or it's a purely syntactic conditional, like "do { } while (0)". It's why I'm often so down on a lot of the odd warnings that are hiding under W=1 and friends. They all may make sense in the trivial case ("That is insane") but then in the end they happen for sane code. And yeah, -Wshadow has had tons of history with macro nesting, and just being badly done in the first place (eg "strlen" can be a perfectly fine local variable). That said, maybe people could ask the gcc and clan people for a way to _mark_ the places where we expect to validly see shadowing. For example, that "local variable in a macro expression statement" thing is absolutely horrendous to fix with preprocessor tricks to try to make for unique identifiers. But I think it would be much more syntactically reasonable to add (for example) a "shadow" attribute to such a variable exactly to tell the compiler "yeah, yeah, I know this identifier could shadow an outer one" and turn it off that way. Linus
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Tue, Mar 1, 2022 at 11:06 AM Linus Torvalds wrote: > > So instead of that simple "if (!entry)", we'd effectively have to > continue to use something that still works with the old world order > (ie that "if (list_entry_is_head())" model). Just to prove my point about how this is painful, that doesn't work at all. If the loop iterator at the end is NULL (good, in theory), we can't use "list_entry_is_head()" to check whether we ended. We'd have to use a new thing entirely, to handle the "list_for_each_entry() has the old/new semantics" cases. That's largely why I was pushing for the "let's make it impossible to use the loop iterator at all outside the loop". It avoids the confusing case, and the patches to move to that stricter semantic can be merged independently (and before) doing the actual semantic change. I'm not saying my suggested approach is wonderful either. Honestly, it's painful that we have so nasty semantics for the end-of-loop case for list_for_each_entry(). The minimal patch would clearly be to keep those broken semantics, and just force everybody to use the list_entry_is_head() case. That's the "we know we messed up, we are too lazy to fix it, we'll just work around it and people need to be careful" approach. And laziness is a virtue. But bad semantics are bad semantics. So it's a question of balancing those two issues. Linus
[PATCH AUTOSEL 5.16 20/28] drm/amdgpu: bypass tiling flag check in virtual display case (v2)
From: Guchun Chen [ Upstream commit e2b993302f40c4eb714ecf896dd9e1c5be7d4cd7 ] vkms leverages common amdgpu framebuffer creation, and also as it does not support FB modifier, there is no need to check tiling flags when initing framebuffer when virtual display is enabled. This can fix below calltrace: amdgpu :00:08.0: GFX9+ requires FB check based on format modifier WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu] v2: check adev->enable_virtual_display instead as vkms can be enabled in bare metal as well. Signed-off-by: Leslie Shi Signed-off-by: Guchun Chen Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index dc50c05f23fc2..5c08047adb594 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -1145,7 +1145,7 @@ int amdgpu_display_framebuffer_init(struct drm_device *dev, if (ret) return ret; - if (!dev->mode_config.allow_fb_modifiers) { + if (!dev->mode_config.allow_fb_modifiers && !adev->enable_virtual_display) { drm_WARN_ONCE(dev, adev->family >= AMDGPU_FAMILY_AI, "GFX9+ requires FB check based on format modifier\n"); ret = check_tiling_flags_gfx6(rfb); -- 2.34.1
[PATCH AUTOSEL 5.15 16/23] drm/amdgpu: bypass tiling flag check in virtual display case (v2)
From: Guchun Chen [ Upstream commit e2b993302f40c4eb714ecf896dd9e1c5be7d4cd7 ] vkms leverages common amdgpu framebuffer creation, and also as it does not support FB modifier, there is no need to check tiling flags when initing framebuffer when virtual display is enabled. This can fix below calltrace: amdgpu :00:08.0: GFX9+ requires FB check based on format modifier WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu] v2: check adev->enable_virtual_display instead as vkms can be enabled in bare metal as well. Signed-off-by: Leslie Shi Signed-off-by: Guchun Chen Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index dc50c05f23fc2..5c08047adb594 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -1145,7 +1145,7 @@ int amdgpu_display_framebuffer_init(struct drm_device *dev, if (ret) return ret; - if (!dev->mode_config.allow_fb_modifiers) { + if (!dev->mode_config.allow_fb_modifiers && !adev->enable_virtual_display) { drm_WARN_ONCE(dev, adev->family >= AMDGPU_FAMILY_AI, "GFX9+ requires FB check based on format modifier\n"); ret = check_tiling_flags_gfx6(rfb); -- 2.34.1
Re: [PATCH v13 5/5] drm/amdgpu: add drm buddy support to amdgpu
On 22/02/22 6:24 pm, Christian König wrote: > Am 21.02.22 um 17:45 schrieb Arunpravin: >> - Remove drm_mm references and replace with drm buddy functionalities >> - Add res cursor support for drm buddy >> >> v2(Matthew Auld): >>- replace spinlock with mutex as we call kmem_cache_zalloc >> (..., GFP_KERNEL) in drm_buddy_alloc() function >> >>- lock drm_buddy_block_trim() function as it calls >> mark_free/mark_split are all globally visible >> >> v3(Matthew Auld): >>- remove trim method error handling as we address the failure case >> at drm_buddy_block_trim() function >> >> v4: >>- fix warnings reported by kernel test robot >> >> v5: >>- fix merge conflict issue >> >> v6: >>- fix warnings reported by kernel test robot >> >> v7: >>- remove DRM_BUDDY_RANGE_ALLOCATION flag usage >> >> v8: >>- keep DRM_BUDDY_RANGE_ALLOCATION flag usage >>- resolve conflicts created by drm/amdgpu: remove VRAM accounting v2 >> >> Signed-off-by: Arunpravin >> --- >> drivers/gpu/drm/Kconfig | 1 + >> .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h| 97 +-- >> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 7 +- >> drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 256 ++ >> 4 files changed, 229 insertions(+), 132 deletions(-) >> >> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig >> index 763355330b17..019ec0440ced 100644 >> --- a/drivers/gpu/drm/Kconfig >> +++ b/drivers/gpu/drm/Kconfig >> @@ -279,6 +279,7 @@ config DRM_AMDGPU >> select HWMON >> select BACKLIGHT_CLASS_DEVICE >> select INTERVAL_TREE >> +select DRM_BUDDY >> help >>Choose this option if you have a recent AMD Radeon graphics card. >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h >> index acfa207cf970..da12b4ff2e45 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h >> @@ -30,12 +30,15 @@ >> #include >> #include >> >> +#include "amdgpu_vram_mgr.h" >> + >> /* state back for walking over vram_mgr and gtt_mgr allocations */ >> struct amdgpu_res_cursor { >> uint64_tstart; >> uint64_tsize; >> uint64_tremaining; >> -struct drm_mm_node *node; >> +void*node; >> +uint32_tmem_type; >> }; >> >> /** >> @@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource >> *res, >> uint64_t start, uint64_t size, >> struct amdgpu_res_cursor *cur) >> { >> +struct drm_buddy_block *block; >> +struct list_head *head, *next; >> struct drm_mm_node *node; >> >> -if (!res || res->mem_type == TTM_PL_SYSTEM) { >> -cur->start = start; >> -cur->size = size; >> -cur->remaining = size; >> -cur->node = NULL; >> -WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT); >> -return; >> -} >> +if (!res) >> +goto err_out; > > It's not really an error to not have a resource. So I would rather name > the label fallback or something like that. > >> >> BUG_ON(start + size > res->num_pages << PAGE_SHIFT); >> >> -node = to_ttm_range_mgr_node(res)->mm_nodes; >> -while (start >= node->size << PAGE_SHIFT) >> -start -= node++->size << PAGE_SHIFT; >> +cur->mem_type = res->mem_type; >> + >> +switch (cur->mem_type) { >> +case TTM_PL_VRAM: >> +head = &to_amdgpu_vram_mgr_node(res)->blocks; >> + >> +block = list_first_entry_or_null(head, >> + struct drm_buddy_block, >> + link); >> +if (!block) >> +goto err_out; >> + >> +while (start >= amdgpu_node_size(block)) { >> +start -= amdgpu_node_size(block); >> + >> +next = block->link.next; >> +if (next != head) >> +block = list_entry(next, struct >> drm_buddy_block, link); >> +} >> + >> +cur->start = amdgpu_node_start(block) + start; >> +cur->size = min(amdgpu_node_size(block) - start, size); >> +cur->remaining = size; >> +cur->node = block; >> +break; >> +case TTM_PL_TT: >> +node = to_ttm_range_mgr_node(res)->mm_nodes; >> +while (start >= node->size << PAGE_SHIFT) >> +start -= node++->size << PAGE_SHIFT; >> + >> +cur->start = (node->start << PAGE_SHIFT) + start; >> +cur->size = min((node->size << PAGE_SHIFT) - start, size); >> +cur->remaining = size; >> +cur->node = node; >> +break; >> +default: >>
[PATCH v9] drm/amdgpu: add drm buddy support to amdgpu
- Remove drm_mm references and replace with drm buddy functionalities - Add res cursor support for drm buddy v2(Matthew Auld): - replace spinlock with mutex as we call kmem_cache_zalloc (..., GFP_KERNEL) in drm_buddy_alloc() function - lock drm_buddy_block_trim() function as it calls mark_free/mark_split are all globally visible v3(Matthew Auld): - remove trim method error handling as we address the failure case at drm_buddy_block_trim() function v4: - fix warnings reported by kernel test robot v5: - fix merge conflict issue v6: - fix warnings reported by kernel test robot v7: - remove DRM_BUDDY_RANGE_ALLOCATION flag usage v8: - keep DRM_BUDDY_RANGE_ALLOCATION flag usage - resolve conflicts created by drm/amdgpu: remove VRAM accounting v2 v9(Christian): - rename label name as fallback - move struct amdgpu_vram_mgr to amdgpu_vram_mgr.h - remove unnecessary flags from struct amdgpu_vram_reservation - rewrite block NULL check condition - change else style as per coding standard - rewrite the node max size - add a helper function to fetch the first entry from the list Signed-off-by: Arunpravin --- drivers/gpu/drm/Kconfig | 1 + .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h| 97 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 10 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 249 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 68 + 5 files changed, 289 insertions(+), 136 deletions(-) create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index f1422bee3dcc..5133c3f028ab 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -280,6 +280,7 @@ config DRM_AMDGPU select HWMON select BACKLIGHT_CLASS_DEVICE select INTERVAL_TREE + select DRM_BUDDY help Choose this option if you have a recent AMD Radeon graphics card. diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h index acfa207cf970..864c609ba00b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h @@ -30,12 +30,15 @@ #include #include +#include "amdgpu_vram_mgr.h" + /* state back for walking over vram_mgr and gtt_mgr allocations */ struct amdgpu_res_cursor { uint64_tstart; uint64_tsize; uint64_tremaining; - struct drm_mm_node *node; + void*node; + uint32_tmem_type; }; /** @@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource *res, uint64_t start, uint64_t size, struct amdgpu_res_cursor *cur) { + struct drm_buddy_block *block; + struct list_head *head, *next; struct drm_mm_node *node; - if (!res || res->mem_type == TTM_PL_SYSTEM) { - cur->start = start; - cur->size = size; - cur->remaining = size; - cur->node = NULL; - WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT); - return; - } + if (!res) + goto fallback; BUG_ON(start + size > res->num_pages << PAGE_SHIFT); - node = to_ttm_range_mgr_node(res)->mm_nodes; - while (start >= node->size << PAGE_SHIFT) - start -= node++->size << PAGE_SHIFT; + cur->mem_type = res->mem_type; + + switch (cur->mem_type) { + case TTM_PL_VRAM: + head = &to_amdgpu_vram_mgr_node(res)->blocks; + + block = list_first_entry_or_null(head, +struct drm_buddy_block, +link); + if (!block) + goto fallback; + + while (start >= amdgpu_node_size(block)) { + start -= amdgpu_node_size(block); + + next = block->link.next; + if (next != head) + block = list_entry(next, struct drm_buddy_block, link); + } + + cur->start = amdgpu_node_start(block) + start; + cur->size = min(amdgpu_node_size(block) - start, size); + cur->remaining = size; + cur->node = block; + break; + case TTM_PL_TT: + node = to_ttm_range_mgr_node(res)->mm_nodes; + while (start >= node->size << PAGE_SHIFT) + start -= node++->size << PAGE_SHIFT; + + cur->start = (node->start << PAGE_SHIFT) + start; + cur->size = min((node->size << PAGE_SHIFT) - start, size); + cur->remaining = size; + cur->node = node; + break; +
Re: [PATCH 6/6] treewide: remove check of list iterator against head past the loop body
So looking at this patch, I really reacted to the fact that quite often the "use outside the loop" case is all kinds of just plain unnecessary, but _used_ to be a convenience feature. I'll just quote the first chunk in it's entirely as an example - not because I think this chunk is particularly important, but because it's a good example: On Mon, Feb 28, 2022 at 3:09 AM Jakob Koschel wrote: > > diff --git a/arch/arm/mach-mmp/sram.c b/arch/arm/mach-mmp/sram.c > index 6794e2db1ad5..fc47c107059b 100644 > --- a/arch/arm/mach-mmp/sram.c > +++ b/arch/arm/mach-mmp/sram.c > @@ -39,19 +39,22 @@ static LIST_HEAD(sram_bank_list); > struct gen_pool *sram_get_gpool(char *pool_name) > { > struct sram_bank_info *info = NULL; > + struct sram_bank_info *tmp; > > if (!pool_name) > return NULL; > > mutex_lock(&sram_lock); > > - list_for_each_entry(info, &sram_bank_list, node) > - if (!strcmp(pool_name, info->pool_name)) > + list_for_each_entry(tmp, &sram_bank_list, node) > + if (!strcmp(pool_name, tmp->pool_name)) { > + info = tmp; > break; > + } > > mutex_unlock(&sram_lock); > > - if (&info->node == &sram_bank_list) > + if (!info) > return NULL; > > return info->gpool; I realize this was probably at least auto-generated with coccinelle, but maybe that script could be taught to do avoid the "use after loop" by simply moving the code _into_ the loop. IOW, this all would be cleaner and clear written as if (!pool_name) return NULL; mutex_lock(&sram_lock); list_for_each_entry(info, &sram_bank_list, node) { if (!strcmp(pool_name, info->pool_name)) { mutex_unlock(&sram_lock); return info; } } mutex_unlock(&sram_lock); return NULL; Ta-daa - no use outside the loop, no need for new variables, just a simple "just do it inside the loop". Yes, we end up having that lock thing twice, but it looks worth it from a "make the code obvious" standpoint. Would it be even cleaner if the locking was done in the caller, and the loop was some simple helper function? It probably would. But that would require a bit more smarts than probably a simple coccinelle script would do. Linus
RE: [PATCH] drm/amdgpu: Move common initialization operations of each ras block to one function
[AMD Official Use Only] Hi Stanley, Thanks for your suggestion. I add a comment after your comment. -Original Message- From: Yang, Stanley Sent: Tuesday, March 1, 2022 9:50 PM To: Chai, Thomas ; amd-gfx@lists.freedesktop.org Cc: Zhou1, Tao ; Zhang, Hawking ; Clements, John ; Chai, Thomas ; Chai, Thomas Subject: 回复: [PATCH] drm/amdgpu: Move common initialization operations of each ras block to one function [AMD Official Use Only] Hi yipe, One suggestion for this patch, please check my comment. Regards, Stanley > -邮件原件- > 发件人: amd-gfx 代表 yipechai > 发送时间: Tuesday, March 1, 2022 5:46 PM > 收件人: amd-gfx@lists.freedesktop.org > 抄送: Zhou1, Tao ; Zhang, Hawking > ; Clements, John ; Chai, > Thomas ; Chai, Thomas > 主题: [PATCH] drm/amdgpu: Move common initialization operations of each > ras block to one function > > Define amdgpu_ras_sw_init function to initialize all ras blocks. > > Signed-off-by: yipechai > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 + > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c| 2 - > drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c| 143 > - > drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h| 1 + > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 21 --- > drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 16 --- > drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 28 > drivers/gpu/drm/amd/amdgpu/mca_v3_0.c | 6 - > drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 17 --- > 9 files changed, 148 insertions(+), 92 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 6113ddc765a7..72550e9f6058 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -2402,6 +2402,12 @@ static int amdgpu_device_ip_init(struct > amdgpu_device *adev) > } > } > > + r = amdgpu_ras_sw_init(adev); > + if (r) { > + DRM_ERROR("amdgpu_ras_early_init failed (%d).\n", r); > + goto init_failed; > + } > [Yang, Stanley] : This is ras blocks early init, I think it's more > reasonable to move amdgpu_ras_sw_init before amdgpu_ras_init function. [Thomas] Sorry, I will change this error message print. I also agree with you to initialize all ras operations before amdgpu_ras_init, but the initialization place of each ras instance is different. The ras block instances of mmhub umc sdma and hdp are initialized in IP .early_init function, but gfx and mca ras block instances are initialized in the IP .sw_init function. Then, after calling all IP .sw_init initialization and then calling amdgpu_ras_sw_init may be the earliest place to initialize all ras blocks. > + > if (amdgpu_sriov_vf(adev)) > amdgpu_virt_init_data_exchange(adev); > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > index ab75e189bc0b..544241f357b2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > @@ -440,8 +440,6 @@ int amdgpu_gmc_ras_early_init(struct amdgpu_device > *adev) { > if (!adev->gmc.xgmi.connected_to_cpu) { > adev->gmc.xgmi.ras = &xgmi_ras; > - amdgpu_ras_register_ras_block(adev, &adev- > >gmc.xgmi.ras->ras_block); > - adev->gmc.xgmi.ras_if = &adev->gmc.xgmi.ras- > >ras_block.ras_comm; > } > > return 0; > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > index d3875618ebf5..89075ab9e82e 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > @@ -2299,8 +2299,6 @@ int amdgpu_ras_init(struct amdgpu_device *adev) > case CHIP_ALDEBARAN: > if (!adev->gmc.xgmi.connected_to_cpu) { > adev->nbio.ras = &nbio_v7_4_ras; > - amdgpu_ras_register_ras_block(adev, &adev- > >nbio.ras->ras_block); > - adev->nbio.ras_if = &adev->nbio.ras- > >ras_block.ras_comm; > } > break; > default: > @@ -2533,6 +2531,147 @@ void amdgpu_ras_suspend(struct amdgpu_device > *adev) > amdgpu_ras_disable_all_features(adev, 1); } > > +int amdgpu_ras_sw_init(struct amdgpu_device *adev) { > + int err = 0; > + > + if (!amdgpu_ras_asic_supported(adev)) > + return 0; > + > + if (adev->nbio.ras) { > + err = amdgpu_ras_register_ras_block(adev, &adev- > >nbio.ras->ras_block); > + if (err) { > + dev_err(adev->dev, "Failed to register nbio ras > block!\n"); > + return err; > + } > + adev->nbio.ras_if = &adev->nbio.ras->ras_block.ras_comm; > + } > + > + if (adev->gmc.xgmi.ras) { > + err = amdgpu_ras_register_ras_block(adev, &adev- > >gmc.xgmi.ras->ras_bl
Re: [PATCH v9] drm/amdgpu: add drm buddy support to amdgpu
Hi Arunpravin, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on drm/drm-next] [also build test WARNING on drm-tip/drm-tip] [cannot apply to drm-intel/for-linux-next v5.17-rc6 next-20220301] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Arunpravin/drm-amdgpu-add-drm-buddy-support-to-amdgpu/20220302-043936 base: git://anongit.freedesktop.org/drm/drm drm-next config: ia64-randconfig-r011-20220301 (https://download.01.org/0day-ci/archive/20220302/202203020700.yfcutgqz-...@intel.com/config) compiler: ia64-linux-gcc (GCC) 11.2.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/22a06757ae067e29c09a9d95eaf2b9053833740f git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Arunpravin/drm-amdgpu-add-drm-buddy-support-to-amdgpu/20220302-043936 git checkout 22a06757ae067e29c09a9d95eaf2b9053833740f # save the config file to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=ia64 SHELL=/bin/bash drivers/gpu/drm/amd/amdgpu/ If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All warnings (new ones prefixed by >>): drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c: In function 'amdgpu_vram_mgr_new': >> drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c:360:28: warning: variable >> 'mem_bytes' set but not used [-Wunused-but-set-variable] 360 | u64 vis_usage = 0, mem_bytes, max_bytes, min_page_size; |^ vim +/mem_bytes +360 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 343 344 /** 345 * amdgpu_vram_mgr_new - allocate new ranges 346 * 347 * @man: TTM memory type manager 348 * @tbo: TTM BO we need this range for 349 * @place: placement flags and restrictions 350 * @res: the resulting mem object 351 * 352 * Allocate VRAM for the given BO. 353 */ 354 static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, 355 struct ttm_buffer_object *tbo, 356 const struct ttm_place *place, 357 struct ttm_resource **res) 358 { 359 unsigned long lpfn, pages_per_node, pages_left, pages; > 360 u64 vis_usage = 0, mem_bytes, max_bytes, min_page_size; 361 struct amdgpu_vram_mgr *mgr = to_vram_mgr(man); 362 struct amdgpu_device *adev = to_amdgpu_device(mgr); 363 struct amdgpu_vram_mgr_node *node; 364 struct drm_buddy *mm = &mgr->mm; 365 struct drm_buddy_block *block; 366 unsigned i; 367 int r; 368 369 lpfn = place->lpfn; 370 if (!lpfn) 371 lpfn = man->size >> PAGE_SHIFT; 372 373 max_bytes = adev->gmc.mc_vram_size; 374 if (tbo->type != ttm_bo_type_kernel) 375 max_bytes -= AMDGPU_VM_RESERVED_VRAM; 376 377 mem_bytes = tbo->base.size; 378 if (place->flags & TTM_PL_FLAG_CONTIGUOUS) { 379 pages_per_node = ~0ul; 380 } else { 381 #ifdef CONFIG_TRANSPARENT_HUGEPAGE 382 pages_per_node = HPAGE_PMD_NR; 383 #else 384 /* default to 2MB */ 385 pages_per_node = 2UL << (20UL - PAGE_SHIFT); 386 #endif 387 pages_per_node = max_t(uint32_t, pages_per_node, 388 tbo->page_alignment); 389 } 390 391 node = kzalloc(sizeof(*node), GFP_KERNEL); 392 if (!node) 393 return -ENOMEM; 394 395 ttm_resource_init(tbo, place, &node->base); 396 397 /* bail out quickly if there's likely not enough VRAM for this BO */ 398 if (ttm_resource_manager_usage(man) > max_bytes) { 399 r = -ENOSPC; 400 goto error_fini; 401 } 402 403 INIT_LIST_HEAD(&node->blocks); 404 405 if (place->flags & TTM_PL_FLAG_TOPDOWN) 406 node->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION; 407 408 if (place->fpfn || lpfn != man->size >> PAGE_SHIFT) 409 /* Allocate blocks in desired range */ 410 node->flags |=
[PATCH] drm/amdgpu: fix potential null dereference
"ctx" is dereferenced but null checked later. Swap their positions to avoid potential null dereference. Found using a Coccinelle script: https://coccinelle.gitlabpages.inria.fr/website/rules/mini_null_ref.cocci Signed-off-by: Weiguo Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index f522b52725e4..b4f035ce44bc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -258,11 +258,12 @@ static void amdgpu_ctx_fini_entity(struct amdgpu_ctx_entity *entity) static int amdgpu_ctx_get_stable_pstate(struct amdgpu_ctx *ctx, u32 *stable_pstate) { - struct amdgpu_device *adev = ctx->adev; + struct amdgpu_device *adev; enum amd_dpm_forced_level current_level; if (!ctx) return -EINVAL; + adev = ctx->adev; current_level = amdgpu_dpm_get_performance_level(adev); @@ -289,12 +290,13 @@ static int amdgpu_ctx_get_stable_pstate(struct amdgpu_ctx *ctx, static int amdgpu_ctx_set_stable_pstate(struct amdgpu_ctx *ctx, u32 stable_pstate) { - struct amdgpu_device *adev = ctx->adev; + struct amdgpu_device *adev; enum amd_dpm_forced_level level; int r; if (!ctx) return -EINVAL; + adev = ctx->adev; mutex_lock(&adev->pm.stable_pstate_ctx_lock); if (adev->pm.stable_pstate_ctx && adev->pm.stable_pstate_ctx != ctx) { -- 2.25.1
RE: [PATCH Review 1/1] drm/amdgpu: support send bad channel info to smu
[AMD Official Use Only] > -Original Message- > From: Stanley.Yang > Sent: Tuesday, March 1, 2022 9:30 PM > To: amd-gfx@lists.freedesktop.org; Zhang, Hawking > ; Zhou1, Tao ; Joo, Maria > > Cc: Yang, Stanley > Subject: [PATCH Review 1/1] drm/amdgpu: support send bad channel info to > smu > > Message SMU bad channel information bitmap to update OOB table > > Change-Id: I49a79af64d5263c28db059ecb8b8405a471431b4 > Signed-off-by: Stanley.Yang > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 +++ > drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 3 ++ > .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c| 25 ++- > .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h| 4 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 5 +++ > drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 12 ++ > drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 1 + > drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 10 + > drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 7 +++ > .../pm/swsmu/inc/pmfw_if/aldebaran_ppsmc.h| 3 +- > drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h | 3 +- > .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 43 +++ > 12 files changed, 119 insertions(+), 4 deletions(-) [Tao] It's better to split the patch into two parts, one for amdgpu and one for pm. > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > index d3875618ebf5..f9104f99eb9c 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > @@ -2068,6 +2068,7 @@ int amdgpu_ras_recovery_init(struct amdgpu_device > *adev) > mutex_init(&con->recovery_lock); > INIT_WORK(&con->recovery_work, amdgpu_ras_do_recovery); > atomic_set(&con->in_recovery, 0); > + con->eeprom_control.bad_channel_bitmap = 0; > > max_eeprom_records_count = > amdgpu_ras_eeprom_max_record_count(); > amdgpu_ras_validate_threshold(adev, max_eeprom_records_count); > @@ -2092,6 +2093,11 @@ int amdgpu_ras_recovery_init(struct amdgpu_device > *adev) > goto free; > > amdgpu_dpm_send_hbm_bad_pages_num(adev, con- > >eeprom_control.ras_num_recs); > + > + if (con->update_channel_flag == true) { [Tao] It can be simplified to "if (con->update_channel_flag)" > + amdgpu_dpm_send_hbm_bad_channel_flag(adev, con- > >eeprom_control.bad_channel_bitmap); [Tao] do we need to check status of the function and stop recovery_init if it fails? > + con->update_channel_flag = false; > + } > } > > #ifdef CONFIG_X86_MCE_AMD > @@ -2285,6 +2291,7 @@ int amdgpu_ras_init(struct amdgpu_device *adev) > goto release_con; > } > > + con->update_channel_flag = false; > con->features = 0; > INIT_LIST_HEAD(&con->head); > /* Might need get this flag from vbios. */ diff --git > a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h > index 7cddaad90d6d..9314fde81e68 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h > @@ -374,6 +374,9 @@ struct amdgpu_ras { > > /* record umc error info queried from smu */ > struct umc_ecc_info umc_ecc; > + > + /* Indicates smu whether need update bad channel info */ > + bool update_channel_flag; > }; > > struct ras_fs_data { > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > index 2b844a5aafdb..ad5d8667756d 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > @@ -265,6 +265,7 @@ int amdgpu_ras_eeprom_reset_table(struct > amdgpu_ras_eeprom_control *control) { > struct amdgpu_device *adev = to_amdgpu_device(control); > struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr; > + struct amdgpu_ras *con = amdgpu_ras_get_context(adev); > u8 csum; > int res; > > @@ -285,6 +286,10 @@ int amdgpu_ras_eeprom_reset_table(struct > amdgpu_ras_eeprom_control *control) > > amdgpu_dpm_send_hbm_bad_pages_num(adev, control- > >ras_num_recs); > > + control->bad_channel_bitmap = 0; > + amdgpu_dpm_send_hbm_bad_channel_flag(adev, control- > >bad_channel_bitmap); > + con->update_channel_flag = false; > + > amdgpu_ras_debugfs_set_ret_size(control); > > mutex_unlock(&control->ras_tbl_mutex); > @@ -418,6 +423,7 @@ amdgpu_ras_eeprom_append_table(struct > amdgpu_ras_eeprom_control *control, > struct eeprom_table_record *record, > const u32 num) > { > + struct amdgpu_ras *con = > +amdgpu_ras_get_context(to_amdgpu_device(control)); > u32 a, b, i; > u8 *buf, *pp; > int res; > @@ -429,9 +435,16 @@ amdgpu_ras_eeprom_append_table(struct > amdgpu_ras_eeprom_control *control, > /* Encode all of them in one go. >
Re: [PATCH v2 2/3] drm/amdgpu: convert code name to ip version for noretry set
Am 01.03.22 um 16:24 schrieb Paul Menzel: Dear Yifan, Thank you for your patch. Am 01.03.22 um 16:01 schrieb Yifan Zhang: Use IP version rather than codename for noretry set. Why? Why not? Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index ab75e189bc0b..fbc22b7b6315 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -559,14 +559,14 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) { struct amdgpu_gmc *gmc = &adev->gmc; - switch (adev->asic_type) { - case CHIP_VEGA10: - case CHIP_VEGA20: - case CHIP_ARCTURUS: - case CHIP_ALDEBARAN: - case CHIP_BEIGE_GOBY: - case CHIP_YELLOW_CARP: - case CHIP_RENOIR: + switch (adev->ip_versions[GC_HWIP][0]) { + case IP_VERSION(9, 0, 1): + case IP_VERSION(9, 4, 0): + case IP_VERSION(9, 4, 1): + case IP_VERSION(9, 4, 2): + case IP_VERSION(10, 3, 5): + case IP_VERSION(10, 3, 3): + case IP_VERSION(9, 3, 0): I think, sorting these entries might be useful. Should the names be added as comments for those not having them memorized? /* * noretry = 0 will cause kfd page fault tests fail * for some ASICs, so set default to 1 for these ASICs. @@ -576,7 +576,6 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) else gmc->noretry = amdgpu_noretry; break; - case CHIP_RAVEN: Why remove this? default: /* Raven currently has issues with noretry * regardless of what we decide for other Kind regards, Paul
Re: [PATCH v2 2/3] drm/amdgpu: convert code name to ip version for noretry set
Am 01.03.22 um 16:01 schrieb Yifan Zhang: Use IP version rather than codename for noretry set. Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index ab75e189bc0b..fbc22b7b6315 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -559,14 +559,14 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) { struct amdgpu_gmc *gmc = &adev->gmc; - switch (adev->asic_type) { - case CHIP_VEGA10: - case CHIP_VEGA20: - case CHIP_ARCTURUS: - case CHIP_ALDEBARAN: - case CHIP_BEIGE_GOBY: - case CHIP_YELLOW_CARP: - case CHIP_RENOIR: + switch (adev->ip_versions[GC_HWIP][0]) { + case IP_VERSION(9, 0, 1): + case IP_VERSION(9, 4, 0): + case IP_VERSION(9, 4, 1): + case IP_VERSION(9, 4, 2): + case IP_VERSION(10, 3, 5): + case IP_VERSION(10, 3, 3): + case IP_VERSION(9, 3, 0): Maybe sort those? Apart from that Acked-by: Christian König Regards, Christian. /* * noretry = 0 will cause kfd page fault tests fail * for some ASICs, so set default to 1 for these ASICs. @@ -576,7 +576,6 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) else gmc->noretry = amdgpu_noretry; break; - case CHIP_RAVEN: default: /* Raven currently has issues with noretry * regardless of what we decide for other
Re: [PATCH v2 2/3] drm/amdgpu: convert code name to ip version for noretry set
Dear Christian, Am 02.03.22 um 08:46 schrieb Christian König: Am 01.03.22 um 16:24 schrieb Paul Menzel: […] Am 01.03.22 um 16:01 schrieb Yifan Zhang: Use IP version rather than codename for noretry set. Why? Why not? No idea as the commit message currently does not document the motivation. Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index ab75e189bc0b..fbc22b7b6315 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -559,14 +559,14 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) { struct amdgpu_gmc *gmc = &adev->gmc; - switch (adev->asic_type) { - case CHIP_VEGA10: - case CHIP_VEGA20: - case CHIP_ARCTURUS: - case CHIP_ALDEBARAN: - case CHIP_BEIGE_GOBY: - case CHIP_YELLOW_CARP: - case CHIP_RENOIR: + switch (adev->ip_versions[GC_HWIP][0]) { + case IP_VERSION(9, 0, 1): + case IP_VERSION(9, 4, 0): + case IP_VERSION(9, 4, 1): + case IP_VERSION(9, 4, 2): + case IP_VERSION(10, 3, 5): + case IP_VERSION(10, 3, 3): + case IP_VERSION(9, 3, 0): I think, sorting these entries might be useful. Should the names be added as comments for those not having them memorized? /* * noretry = 0 will cause kfd page fault tests fail * for some ASICs, so set default to 1 for these ASICs. @@ -576,7 +576,6 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) else gmc->noretry = amdgpu_noretry; break; - case CHIP_RAVEN: Why remove this? default: /* Raven currently has issues with noretry * regardless of what we decide for other Kind regards, Paul