Re: drm pull for v5.3-rc1

2019-07-15 Thread Linus Torvalds
On Mon, Jul 15, 2019 at 10:37 AM Linus Torvalds
 wrote:
>
> I'm not pulling this. Why did you merge it into your tree, when
> apparently you were aware of how questionable it is judging by the drm
> pull request.

Looking at some of the fallout, I also see that you then added that
"adjust apply_to_pfn_range interface for dropped token" patch that
seems to be for easier merging of this all.

But you remove the 'token' entirely in one place, and in another you
keep it and just say "whatever, it's unused, pass in NULL". WHAA?

As part of looking at this all, I also note that some of this is also
very non-kernely.

The whole thing with trying to implement a "closure" in C is simply
not how we do things in the kernel (although I've admittedly seen
signs of it in some drivers).

If this should be done at all (and that's questionable), at least do
it in the canonical kernel way: pass in a separate "actor" function
pointer and an argument block, don't try to mix function pointers and
argument data and call it a "closure".

We try to keep data and functions separate. It's not even for security
concerns (although those have caused some splits in the past - avoid
putting function pointers in structures that you then can't mark
read-only!), it's a more generic issue of just keeping arguments as
arguments - even if you then make a structure of them in order to not
make the calling convention very complicated.

(Yes, we do have the pattern of sometimes mixing function pointers
with "describing data", ie the "struct file_operations" structure
isn't _just_ actual function pointers, it also contains the module
owner, for example. But those aren't about mixing function pointers
with their arguments, it's about basically "describing" an object
interface with more than just the operation pointers).

So some of this code is stuff that I would have let go if it was in
some individual driver ("Closures? C doesn't have closures! But
whatever - that driver writer came from some place that taught lamda
calculus before techning C").

But in the core mm code, I want reviews. And I want the code to follow
normal kernel conventions.

   Linus


Re: DRM pull for v5.3-rc1

2019-07-15 Thread Linus Torvalds
[ Ugh, I have three different threads about the drm pull because of
the subject / html confusion. So now I'm replying in separate threads
and I'm hoping the people involved have better threading than gmail
does ;/ ]

On Mon, Jul 15, 2019 at 5:29 AM Jason Gunthorpe  wrote:
>
> The 'hmm' tree is something I ran to try and help workflow issues like
> this, as it could be merged to DRM as a topic branch - maybe consider
> this flow in future?
>
> Linus, do you have any advice on how best to handle sharing mm
> patches?

I don't have a lot of advice except for "very very carefully".

I think the hmm tree worked really well this merge window, at least
from my standpoint.

But it is of course possible that my happiness about the hmm tree is a
complete fluke and came about because pretty much all the patches were
removing oddities and cleaning things up, and they weren't adding new
odd things (or if they were, you hid it better ;^).

Basically, people should know that there are some subsystems that I
end up being *much* more worried about than others. If I see a pull
request that modifies core VM of VFS code, I tend to go into "careful
mode", in ways that I don't do when some maintainer sends me a pull
request that obviously only changes some subtree that the maintainer
owns.

When driver maintainers send me patches that touch their drivers, I
look at the diffstat.

But when driver maintainers send me patches that change mm/, I look at
individual commit messages and the actual diff itself, and if I see
contentious stuff, I simply won't pull.

It's why I left the hmm pull request (which came in early - thank you)
until yesterday, simply because just from the diffstat I could tell
that "ok, this merge isn't just me merging and doing sanity checks,
this is me having to set aside time to do reviews". So just from the
diffstat, I avoided even looking at it while I still had a mailbox
full of other pull requests.

So I do like seeing core mm changes come in through a separate branch.
That's partly because that way it's easier to review without having
the parts I care about be mixed up with other parts (it's one reason I
asked the security layer pulls to be split up, to pick another area
entirely as an example). But it's also partly because if you have a
separate branch for the stuff that you know is contentious, that
doesn't then hold up the parts that _aren't_.

For example, right now I'm not pulling _any_ of the drm updates,
simply because there's a part of it that I can't stomach.  It would
have been so much nicer if the contentious things that need a lot of
care separate, and I could have at least pulled the other stuff.

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: drm pull for v5.3-rc1

2019-07-15 Thread Linus Torvalds
On Mon, Jul 15, 2019 at 11:29 AM Dave Airlie  wrote:
>
> Not that I want to defend that code, but the mm patch that conflicts
> already shows that removing the token is fine as nobody needs or
> requires it. So the fixup patch in my tree was just a bridge to that patch,
> which reduces conflicts. Rip the token out of the new API, pass it as NULL
> to the old API until the mm patch is merged against it which drops the
> token from the old API.

Well, to me the "old" API looks like a new one too, since it's that
"struct page_range_apply" thing.

But I can appreciate that it makes for minimal patch to avoid
conflicts with other patches. It just doesn't look very sensible
stand-alone afaik.

I might be missing something.

But that last patch is more of a detail - it wouldn't even exist if it
wasn't for the earlier mm patches, and those are the ones that make me
go more than "Whaa?" so it's not like this is really all that big of
an issue. More of just a note I made while looking through the mm
parts.

 Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: DRM pull for v5.3-rc1

2019-07-15 Thread Linus Torvalds
On Mon, Jul 15, 2019 at 11:16 AM Linus Torvalds
 wrote:
>
> On Mon, Jul 15, 2019 at 5:29 AM Jason Gunthorpe  wrote:
> >
> > The 'hmm' tree is something I ran to try and help workflow issues like
> > this, as it could be merged to DRM as a topic branch - maybe consider
> > this flow in future?
> >
> > Linus, do you have any advice on how best to handle sharing mm
> > patches?
>
> I don't have a lot of advice except for "very very carefully".
>
> I think the hmm tree worked really well this merge window, at least
> from my standpoint.

Side note: I suspect that having a separate branch maintained by a
separate person actually does help the "very carefully" part.

I think the hmm branch ended up getting more "incidental review"
simply because of how it was done. So even if the original reason for
the separate branch was to resolve some quilt/git integration issues,
I would not be at all surprised if just the extra indirection through
another person ended up making both the sending and receiving side
think more about each patch and think more about the abstraction.

The hmm branch didn't actually seem to have any of the core VM people
reviewing it either, but it did have reviewers across companies for
all the patches, and I do think that that makes a difference.

It's _soo_ easy for a patch series to be developed inside one company
by a couple of people who are probably in the same group, and have the
exact same objectives, to be a lot more biased (and likely biased not
towards the mm goals, but the goals of the code _outside_ the mm).

This is just a long-winded way to say "I do think that the separate
and external branch with multiple interested parties" can have some
inherent advantages, when you actually have multiple people looking at
it with care and intent.

(And here the fact that you have multiple subsystems looking at the
code is very much part of what makes it a good model - if it was just
an external branch for one single user - the vmware gfx driver - you
wouldn't get the same kind of advantages. So it's not the "external
branch" part itself, it's the "multiple users who care" part that
likely causes people to think more about the end result)

Again - maybe I'm rationalizing, and the hmm branch just randomly
happened to work well this time. I do like having multiple people from
different groups look at things, though.

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: DRM pull for v5.3-rc1

2019-07-15 Thread Linus Torvalds
On Mon, Jul 15, 2019 at 12:17 PM Jason Gunthorpe  wrote:
>
> About the only thing I could concretely suggest for working with -mm
> is if there was some way the -mm quilt patches could participate in
> 'git merge' resolution at your level.

Andrew did make noises about having multiple git branches. We'll see.

Linus


Re: drm pull for v5.3-rc1

2019-07-15 Thread Linus Torvalds
On Mon, Jul 15, 2019 at 12:36 PM Thomas Hellström (VMware)
 wrote:
>
> - I've never had any kernel code more reviewed than this.

Hmm. It may have been reviewed, but that wasn't visible in the commits
themselves, so when I look at the pull request, I don't see that.

> - The combined callback / argument struct: It was strongly inspired by
> the struct mm_walk (mm.h), the page walk code being quite similar in
> functionality.

The mm_walk struct is indeed a bit similar, and is in fact a bit
problematic exactly because it mixes function pointers with non-const
data.

I wish it had been a 'const struct mm_walk *" that only passed in the
stuff that describes what to do on the walk itself.  Or separated into
two different pointers - one for the "this is what to do for the walk"
and one for "this is the walking data".

In fact, I think tight now that is actually _almost_ the case and we
could make them const, except for "walk->vma" which is updated
dynamically as we walk.  Oh well.

And for all I know, some of the walkers may be modifying their
"private" field too, since that's left to the walkers.

So yes, that one also has some problems, I agree.

   Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: drm pull for v5.3-rc1

2019-07-15 Thread Linus Torvalds
On Mon, Jul 15, 2019 at 1:07 PM Linus Torvalds
 wrote:
>
> The mm_walk struct is indeed a bit similar, and is in fact a bit
> problematic exactly because it mixes function pointers with non-const
> data.

This made me look at how nasty that would be to fix.

Not too bad.

The attached patch does add more lines than it removes, but in most
cases it's actually a clear improvement.

It results in:

 - smaller stackframes and less runtime initialization: the bulk of
the 'mm_walk' structure was the ops pointers, and if we split them out
and make them const, we can just initialize them statically, and the
stack footprint now becomes just a single word.

 - the function pointers are now nicely in a const data section

in addition to the whole "don't mix variable data with constants, and
don't put function pointers on the stack" thing.

Of course, I haven't _tested_ the end result, but since it compiles it
must be perfect, right? Not that I tested all of the build either,
since several of the mm_walk users were for other architectures.

I'm not sure this is really worth it, but I'm throwing the patch out
there in case somebody wants to look.

Andrew, comments? I don't think we have anybody who is in charge of
mm_walk outside of you...

  Linus
 arch/openrisc/kernel/dma.c  | 10 --
 arch/powerpc/mm/book3s64/subpage_prot.c |  5 ++-
 arch/s390/mm/gmap.c | 25 ++
 fs/proc/task_mmu.c  | 40 +++---
 include/linux/mm.h  | 23 +
 mm/hmm.c| 34 ++-
 mm/madvise.c| 10 --
 mm/memcontrol.c | 11 --
 mm/mempolicy.c  |  5 ++-
 mm/migrate.c| 20 +--
 mm/mincore.c|  5 ++-
 mm/mprotect.c   |  5 ++-
 mm/pagewalk.c   | 60 +++--
 13 files changed, 165 insertions(+), 88 deletions(-)

diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index b41a79fcdbd9..1a69a66fe257 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -80,8 +80,11 @@ arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 {
 	unsigned long va;
 	void *page;
-	struct mm_walk walk = {
+	static const struct mm_walk_ops ops = {
 		.pte_entry = page_set_nocache,
+	};
+	struct mm_walk walk = {
+		.ops = &ops,
 		.mm = &init_mm
 	};
 
@@ -111,8 +114,11 @@ arch_dma_free(struct device *dev, size_t size, void *vaddr,
 		dma_addr_t dma_handle, unsigned long attrs)
 {
 	unsigned long va = (unsigned long)vaddr;
-	struct mm_walk walk = {
+	static const struct mm_walk_ops ops = {
 		.pte_entry = page_clear_nocache,
+	};
+	struct mm_walk walk = {
+		.ops = &ops,
 		.mm = &init_mm
 	};
 
diff --git a/arch/powerpc/mm/book3s64/subpage_prot.c b/arch/powerpc/mm/book3s64/subpage_prot.c
index 9ba07e55c489..7876b316138b 100644
--- a/arch/powerpc/mm/book3s64/subpage_prot.c
+++ b/arch/powerpc/mm/book3s64/subpage_prot.c
@@ -143,9 +143,12 @@ static void subpage_mark_vma_nohuge(struct mm_struct *mm, unsigned long addr,
 unsigned long len)
 {
 	struct vm_area_struct *vma;
+	static const struct mm_walk_ops ops = {
+		.pmd_entry = subpage_walk_pmd_entry,
+	};
 	struct mm_walk subpage_proto_walk = {
+		.ops = &ops,
 		.mm = mm,
-		.pmd_entry = subpage_walk_pmd_entry,
 	};
 
 	/*
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 1e668b95e0c6..9e0feeb469c2 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2523,9 +2523,14 @@ static int __zap_zero_pages(pmd_t *pmd, unsigned long start,
 
 static inline void zap_zero_pages(struct mm_struct *mm)
 {
-	struct mm_walk walk = { .pmd_entry = __zap_zero_pages };
+	static const struct mm_walk_ops ops = {
+		.pmd_entry = __zap_zero_pages,
+	};
+	struct mm_walk walk = {
+		.ops = &ops,
+		.mm = mm,
+	};
 
-	walk.mm = mm;
 	walk_page_range(0, TASK_SIZE, &walk);
 }
 
@@ -2591,11 +2596,15 @@ static int __s390_enable_skey_hugetlb(pte_t *pte, unsigned long addr,
 
 int s390_enable_skey(void)
 {
-	struct mm_walk walk = {
+	static const struct mm_walk_ops ops = {
 		.hugetlb_entry = __s390_enable_skey_hugetlb,
 		.pte_entry = __s390_enable_skey_pte,
 	};
 	struct mm_struct *mm = current->mm;
+	struct mm_walk walk = {
+		.ops = &ops,
+		.mm = mm,
+	};
 	struct vm_area_struct *vma;
 	int rc = 0;
 
@@ -2614,7 +2623,6 @@ int s390_enable_skey(void)
 	}
 	mm->def_flags &= ~VM_MERGEABLE;
 
-	walk.mm = mm;
 	walk_page_range(0, TASK_SIZE, &walk);
 
 out_up:
@@ -2635,10 +2643,15 @@ static int __s390_reset_cmma(pte_t *pte, unsigned long addr,
 
 void s390_reset_cmma(struct mm_struct *mm)
 {
-	struct mm_walk walk = { .pte_entry = __s390_reset_cmma };
+	static const struct mm_walk_ops ops = {
+

Re: [git pull] drm pull for 5.3-rc1

2019-07-15 Thread Linus Torvalds
On Mon, Jul 15, 2019 at 11:38 AM Dave Airlie  wrote:
>
> The reason I was waiting for the HMM tree to land, is a single silent
> merge conflict needs to be resolved after merging this as below.

There's more than that there.

For example, the DRM_AMDGPU_USERPTR config has a

depends on ARCH_HAS_HMM
select HMM_MIRROR

and that won't work any more. The hmm tree changed the requirements to be

depends on HMM_MIRROR

instead.

Now, arguably the hmm tree change in this respect is an annoyance -
the old model was much more user-friendly where the drivers that
wanted HMM would just select it when it was available.

See commit 43535b0aefab ("mm: remove the HMM config option").

I've done the merge, but my tests are still on-going. And after I've
finished those, I'll compare against your suggested merge to see if I
missed anything in turn..

 Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PULL] drm-next fixes for -rc1

2019-07-19 Thread Linus Torvalds
On Fri, Jul 19, 2019 at 8:43 AM Daniel Vetter  wrote:
>
> drm fixes for -rc1:
>
> nouveau:
> - bugfixes + TU116 enabling (minor iteration):w

Asking the important question:

 - WTH is that ":w" thing?

I have a theory that it's just a "I'm confused by 'vi'" marker.  Very
understandable.

But I'm also entertaining the possibility that it's a new "whistling
guy" emoticon. Or maybe a "hungry baby bird face" emoticon.

Admittedly not a _great_ new addition to the emoticons I've seen, but
hey, I'm not judging. Much.

I left it in the merge message for posterity, regardless.

Linus


Re: [PATCH] fbdev: Ditch fb_edid_add_monspecs

2019-07-21 Thread Linus Torvalds
On Sun, Jul 21, 2019 at 1:20 PM Daniel Vetter  wrote:
>
> It's dead code ever since

Lovely. Ack.

   Linus


Re: [git pull] drm for 5.19-rc1

2022-06-07 Thread Linus Torvalds
On Tue, Jun 7, 2022 at 3:23 AM Geert Uytterhoeven  wrote:
>
> These header files are heavy users of large constants lacking the "U"
> suffix e.g.:
>
> #define NB_ADAPTER_ID__SUBSYSTEM_ID_MASK 0xL

As Andreas says, this is not undefined behavior.

A hexadecimal integer constant will always get a type that fits the
actual value. So on a 32-bit architecture, because 0x doesn't
fit in 'long', it will automatically become 'unsigned long'.

Now, a C compiler might still warn about such implicit type
conversions, but I'd be a bit surprised if any version of gcc actually
would do that, because this behavior for hex constants is *very*
traditional, and very common.

It's also true that the type of the constant - but not the value -
will be different on 32-bit and 64-bit architectures (ie on 64-bit, it
will be plain "long" and never extended to "unsigned long", because
the hex value obviously fits just fine).

I don't see any normal situation where that really matters, since any
normal use will have the same result.

The case you point to at

  
https://lore.kernel.org/r/cak8p3a0qrihbr_2fq7uz5w2jmljv7czfrrarcmmjohvndj3...@mail.gmail.com

is very different, because the constant "1" is always just a plain
signed "int". So when you do "(1 << 31)", that is now a signed integer
with the top bit set, and so it will have an actual negative value,
and that can cause various problems (when right-shifted, or when
compared to other values).

But hexadecimal constants can be signed types, but they never have
negative values.

 Linus


Re: Report in ata_scsi_port_error_handler()

2022-02-16 Thread Linus Torvalds
On Tue, Feb 15, 2022 at 10:37 PM Damien Le Moal
 wrote:
>
> On 2/16/22 13:16, Byungchul Park wrote:
> > [2.051040] ===
> > [2.051406] DEPT: Circular dependency has been detected.
> > [2.051730] 5.17.0-rc1-00014-gcf3441bb2012 #2 Tainted: GW
> > [2.051991] ---
> > [2.051991] summary
> > [2.051991] ---
> > [2.051991] *** DEADLOCK ***
> > [2.051991]
> > [2.051991] context A
> > [2.051991] [S] (unknown)(&(&ap->eh_wait_q)->dmap:0)
> > [2.051991] [W] __raw_spin_lock_irq(&host->lock:0)
> > [2.051991] [E] event(&(&ap->eh_wait_q)->dmap:0)
> > [2.051991]
> > [2.051991] context B
> > [2.051991] [S] __raw_spin_lock_irqsave(&host->lock:0)
> > [2.051991] [W] wait(&(&ap->eh_wait_q)->dmap:0)
> > [2.051991] [E] spin_unlock(&host->lock:0)
>
> Sleeping with a spinlock held would be triggering warnings already, so
> these reports seem bogus to me.

Yeah, Matthew pointed out the same thing for another use-case, where
it looks like DEPT is looking at the state at the wrong point (not at
the scheduling point, but at prepare_to_sleep()).

This ata_port_wait() is the exact same pattern, ie we have

spin_lock_irqsave(ap->lock, flags);

while (ap->pflags & (ATA_PFLAG_EH_PENDING | ATA_PFLAG_EH_IN_PROGRESS)) {
prepare_to_wait(&ap->eh_wait_q, &wait, TASK_UNINTERRUPTIBLE);
spin_unlock_irqrestore(ap->lock, flags);
schedule();

and DEPT has incorrectly taken it to mean that 'ap->lock' is held
during the wait, when it is actually released before actually waiting.

For the spin-locks, this is all very obvious (because they'd have been
caught long ago by much simpler debug code), but the same
prepare_to_wait -> wait pattern can most definitely happen with
sleeping locks too, so they are all slightly suspect.

And yes, the detailed reports are hard to read because the locations
are given as "ata_port_wait_eh+0x52/0xc0". Running them through
scripts/decode_stacktrace.sh to turn them into filename and line
numbers - and also sort out inlining - would help a lot.

Byungchul, could you fix those two issues? Some of your reports may
well be entirely valid, but the hard-to-read hex offsets and the
knowledge that at least some of them are confused about how
prepare_to_wait -> wait actually works makes the motivation to look at
the details much less..

   Linus


Re: [PATCH RFC v6 00/21] DEPT(Dependency Tracker)

2022-05-04 Thread Linus Torvalds
On Wed, May 4, 2022 at 1:19 AM Byungchul Park  wrote:
>
> Hi Linus and folks,
>
> I've been developing a tool for detecting deadlock possibilities by
> tracking wait/event rather than lock(?) acquisition order to try to
> cover all synchonization machanisms.

So what is the actual status of reports these days?

Last time I looked at some reports, it gave a lot of false positives
due to mis-understanding prepare_to_sleep().

For this all to make sense, it would need to not have false positives
(or at least a very small number of them together with a way to sanely
get rid of them), and also have a track record of finding things that
lockdep doesn't.

Maybe such reports have been sent out with the current situation, and
I haven't seen them.

 Linus


Re: [git pull] drm for 5.20/6.0

2022-08-03 Thread Linus Torvalds
On Tue, Aug 2, 2022 at 10:38 PM Dave Airlie  wrote:
>
> This is a conflicty one. The late revert in 5.19 of the amdgpu buddy
> allocator causes major conflict fallout. The buddy allocator code in
> this one works, so the resolutions are usually just to take stuff from
> this. It might actually be cleaner if you revert
> 925b6e59138cefa47275c67891c65d48d3266d57 (Revert "drm/amdgpu: add drm
> buddy support to amdgpu") first in your tree then merge this.

Ugh, what a pain. The other conflicts are also due to just randomly
duplicated commits, with *usually* your drm tree having the superset
(so "just take yours" is the easy resolution), but not always (ie the
Intel firmware-69 mess was apparently not dealt with in the
development tree).

Nasty.

I think I have it resolved, am still doing a full build test, and will
then compare against what your suggested merge is.

  Linus


Re: [git pull] drm for 5.20/6.0

2022-08-03 Thread Linus Torvalds
On Wed, Aug 3, 2022 at 7:46 PM Linus Torvalds
 wrote:
>
> I think I have it resolved, am still doing a full build test, and will
> then compare against what your suggested merge is.

Hmm.

I end up with *almost* the same thing.

Except I ended up with a

select DRM_BUDDY

for the DRM_AMDGPU config entry, and you don't have that.

I *think* my version is correct, in that clearly the amdgpu driver now
uses that buddy logic (just doing a random "grep drm_buddy_block" to
see).

But this was messy enough to resolve that I think people should
double-check my end, and maybe I just got confused at some point in
the process.

And while I seem to have gotten the same result as you did on the i915
firmware side too, again, I'd like people to re-verify.

   Linus


Re: [git pull] drm for 5.20/6.0

2022-08-03 Thread Linus Torvalds
On Wed, Aug 3, 2022 at 8:37 PM Dave Airlie  wrote:
>
> Actually I did miss that so that looks good.

.. I wish it did, but I just actually test-booted my desktop with the
result, and it crashes the X server.  This seems to be the splat in
Xorg.0.log:

  (II) Initializing extension DRI2
  (II) AMDGPU(0): Setting screen physical size to 2032 x 571
  (EE)
  (EE) Backtrace:
  (EE) 0: /usr/libexec/Xorg (OsLookupColor+0x13d) [0x55b1dc61258d]
  (EE) 1: /lib64/libc.so.6 (__sigaction+0x50) [0x7f7972a3ea70]
  (EE) 2: /usr/lib64/xorg/modules/drivers/amdgpu_drv.so
(AMDGPUCreateWindow_oneshot+0x101) [0x7f797207ddd1]
  (EE) 3: /usr/libexec/Xorg (compIsAlternateVisual+0xdc4) [0x55b1dc545fa4]
  (EE) 4: /usr/libexec/Xorg (InitRootWindow+0x17) [0x55b1dc4e0047]
  (EE) 5: /usr/libexec/Xorg (miPutImage+0xd4c) [0x55b1dc49e60b]
  (EE) 6: /lib64/libc.so.6 (__libc_start_call_main+0x80) [0x7f7972a29550]
  (EE) 7: /lib64/libc.so.6 (__libc_start_main+0x89) [0x7f7972a29609]
  (EE) 8: /usr/libexec/Xorg (_start+0x25) [0x55b1dc49f2c5]
  (EE)
  (EE) Segmentation fault at address 0x4
  (EE)
Fatal server error:
  (EE) Caught signal 11 (Segmentation fault). Server aborting

so something is going horribly wrong. No kernel oops, though.

It works on my intel laptop, so it's amdgpu somewhere.

I guess I will start bisecting. Oy vey.

 Linus


Re: [git pull] drm for 5.20/6.0

2022-08-03 Thread Linus Torvalds
On Wed, Aug 3, 2022 at 8:53 PM Dave Airlie  wrote:
>
> > It works on my intel laptop, so it's amdgpu somewhere.
>
> I'll spin my ryzen up to see if I can reproduce, and test against the
> drm-next pre-merge tree as well.

So it's not my merge - I've had a bad result in the middle of the DRM
history too.

On a positive note, my arm64 machine works fine, but that's just using
fbdev so ...

But another datapoint to say that it's amdgpu-specific. Not that that
was really in doubt.

Linus


Re: [git pull] drm for 5.20/6.0

2022-08-03 Thread Linus Torvalds
On Wed, Aug 3, 2022 at 9:24 PM Dave Airlie  wrote:
>
> I've reproduced it, I'll send you a revert pile when I confirm it is
> the buddy allocator.

I've bisected it to 86bd6706c404..074293dd9f61 and don't see "buddy"
in any of those commits.

I'll do a few more. It's close enough already that it should be just
four more reboots to pinpoint exactly which commit breaks.

  Linus


Re: [git pull] drm for 5.20/6.0

2022-08-03 Thread Linus Torvalds
On Wed, Aug 3, 2022 at 9:27 PM Linus Torvalds
 wrote:
>
> I'll do a few more. It's close enough already that it should be just
> four more reboots to pinpoint exactly which commit breaks.

commit 5d945cbcd4b16a29d6470a80dfb19738f9a4319f is the first bad commit.

I think it's supposed to make no semantic changes, but it clearly does.

What a pain to figure out what's wrong in there, and I assume it
doesn't revert cleanly either.

Bringing in the guilty parties. See

  
https://lore.kernel.org/all/CAHk-=wj+yzaunxiewhfcrkbdlsqkizdr1q3yjlaqpo6avq2...@mail.gmail.com/

for the beginning of this thread.

Linus


Re: mainline build failure due to 6fdd2077ec03 ("drm/amd/amdgpu: add memory training support for PSP_V13")

2022-08-04 Thread Linus Torvalds
On Thu, Aug 4, 2022 at 12:35 AM Sudip Mukherjee (Codethink)
 wrote:
>
> I will be happy to test any patch or provide any extra log if needed.

It sounds like that file just needs to get a

#include 

there, and for some reason architectures other than alpha and mips end
up getting it accidentally through other headers.

Mind testing just adding that header file, and perhaps even sending a
patch if (when) that works for you?

Linus


Re: [PATCH] drm/amd/display: restore plane with no modifiers code.

2022-08-04 Thread Linus Torvalds
On Wed, Aug 3, 2022 at 10:50 PM Dave Airlie  wrote:
>
> With this applied, I get gdm back.

I can confirm that it makes thing work again for me too. Applied.

 Linus


Re: mainline build failure for x86_64 allmodconfig with clang

2022-08-04 Thread Linus Torvalds
On Thu, Aug 4, 2022 at 11:37 AM Sudip Mukherjee (Codethink)
 wrote:
>
> git bisect points to 3876a8b5e241 ("drm/amd/display: Enable building new 
> display engine with KCOV enabled").

Ahh. So that was presumably why it was disabled before - because it
presumably does disgusting things that make KCOV generate even bigger
stack frames than it already has.

Those functions do seem to have fairly big stack footprints already (I
didn't try to look into why, I assume it's partly due to aggressive
inlining, and probably some automatic structures on stack). But gcc
doesn't seem to make it all that much worse with KCOV (and my clang
build doesn't enable KCOV).

So it's presumably some KCOV-vs-clang thing. Nathan?

  Linus


Re: mainline build failure for x86_64 allmodconfig with clang

2022-08-04 Thread Linus Torvalds
On Thu, Aug 4, 2022 at 1:43 PM Nathan Chancellor  wrote:
>
> I do note that commit 1b54a0121dba ("drm/amd/display: Reduce stack size
> in the mode support function") did have a workaround for GCC. It appears
> clang will still inline mode_support_configuration(). If I mark it as
> 'noinline', the warning disappears in that file.

That sounds like probably the best option for now. Gcc does not inline
that function (at least for allmodconfig builds in my testing), so if
that makes clang match what gcc does, it seems a reasonable thing to
do.

Linus


Re: mainline build failure for x86_64 allmodconfig with clang

2022-08-07 Thread Linus Torvalds
On Sun, Aug 7, 2022 at 10:36 AM David Laight  wrote:
>
> Or just shoot the software engineer who thinks 100 arguments
> is sane. :-)

I suspect the issue is that it's not primarily a software engineer who
wrote that code.

Hardware people writing code are about as scary as software engineers
with a soldering iron.

   Linus


drm pull request (was Re: )

2022-05-06 Thread Linus Torvalds
On Thu, May 5, 2022 at 9:07 PM Dave Airlie  wrote:
>
> pretty quiet week, one fbdev, msm, kconfig, and 2 amdgpu fixes, about
> what I'd expect for rc6.

You're not getting the automated pr-tracker-bot response, because your
subject line was missing...

Just a "how did that happen" together with a "here's the manual
response instead".

  Linus


Re: Adding CI results to the kernel tree was Re: [RFC v2] drm/msm: Add initial ci/ subdirectory

2022-05-11 Thread Linus Torvalds
On Tue, May 10, 2022 at 10:07 PM Dave Airlie  wrote:
>
> > And use it to store expectations about what the drm/msm driver is
> > supposed to pass in the IGT test suite.
>
> I wanted to loop in Linus/Greg to see if there are any issues raised
> by adding CI results file to the tree in their minds, or if any other
> subsystem has done this already, and it's all fine.
>
> I think this is a good thing after our Mesa experience, but Mesa has a
> lot tighter integration here, so I want to get some more opinions
> outside the group.

Honestly, my immediate reaction is that I think it might be ok, but

 (a) are these things going to absolutely balloon over time?

 (b) should these not be separated out?

Those two issues kind of interact.

If it's a small and targeted test-suite, by all means keep it in the
kernel, but why not make it part of "tools/testing/selftests"

But if people expect this to balloon and we end up having megabytes of
test output, then I really think it should be a separate git tree.

A diffstat like this:

>  7 files changed, 791 insertions(+)

is not a problem at all. But I get the feeling that this is just the
tip of the iceberg, and people will want to not just have the result
files, but start adding actual *input* files that may be largely
automated stuff and may be tens of megabytes in size.

Because the result files on their own aren't really self-contained,
and then people will want to keep them in sync with the test-files
themselves, and start adding those, and now it *really* is likely very
unwieldy.

Or if that doesn't happen, and the actual input test files stay in a
separate CI repo, and then you end up having random coherency issues
with that CI repo, and it all gets to be either horribly messy, or the
result files in the kernel end up really stale.

So honestly, I personally don't see a good end result here.  This
particular small patch? *This* one looks fine to me, except I really
think tools/testing/selftests/gpu would be a much more logical place
for it.

But I don't see a way forward that is sane.

Can somebody argue otherwise?

Linus


Re: Adding CI results to the kernel tree was Re: [RFC v2] drm/msm: Add initial ci/ subdirectory

2022-05-11 Thread Linus Torvalds
On Wed, May 11, 2022 at 11:40 AM Rob Clark  wrote:
>
> It is missing in this revision of the RFC, but the intention is to
> have the gitlab-ci.yml point to a specific commit SHA in the
> gfx-ci/drm-ci[1] tree, to solve the problem of keeping the results in
> sync with the expectations.  Ie. a kernel commit would control moving
> to a new version of i-g-t (and eventually deqp and/or piglit), and at
> the same time make any necessary updates in the expectations files.

Wouldn't it then be better to just have the expectation files in the
ci tree too?

The kernel tree might have just the expected *failures* listed, if
there are any. Presumably the ci tree has to have the expected results
anyway, so what's the advantage of listing non-failures?

  Linus


Re: Adding CI results to the kernel tree was Re: [RFC v2] drm/msm: Add initial ci/ subdirectory

2022-05-11 Thread Linus Torvalds
On Wed, May 11, 2022 at 12:08 PM Linus Torvalds
 wrote:
>
> The kernel tree might have just the expected *failures* listed, if
> there are any. Presumably the ci tree has to have the expected results
> anyway, so what's the advantage of listing non-failures?

.. put another way: I think a list of "we are aware that these
currently fail" is quite reasonable for a development tree, maybe even
with a comment in the commit that created them about why they
currently fail.

That also ends up being very nice if you fix a problem, and the fix
commit might then remove the failure for the list, and that all makes
perfect sense.

But having just the raw output of "these are the expected CI results"
that is being done and specified by some other tree entirely - that
seems pointless and just noise to me. There's no actual reason to have
that kind of noise - and update that kind of noise - that I really
see.

Linus


Re: [git pull] drm for 5.19-rc1

2022-05-25 Thread Linus Torvalds
On Tue, May 24, 2022 at 11:07 PM Dave Airlie  wrote:
>
> AMD has started some new GPU support [...]

Oh Christ. Which means another set of auto-generated monster headers. Lovely.

  Linus


Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")

2022-05-27 Thread Linus Torvalds
On Fri, May 27, 2022 at 2:07 AM Sudip Mukherjee
 wrote:
>
> declared with attribute error: BUILD_BUG_ON failed: sizeof(*edid) != 
> EDID_LENGTH
>
> And, reverting it on top of mainline branch has fixed the build failure.

Hmm. That BUILD_BUG_ON() looks entirely correct, and if that sizeof()
doesn't work, then the code doesn't work.

I'm not seeing what could go wrong in there, with all the structures I
see being marked as __packed__.

I wonder if the union in 'struct detailed_timing' also wants that
"__attribute__((packed))" but I also wonder what it is that would make
this fail on spear3xx but not elsewhere.

Very strange. It would be interesting to know where that sizeof goes
wrong, but it would seem to be something very peculiar to your build
environment (either that config, or your compiler).

 Linus


Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")

2022-05-27 Thread Linus Torvalds
On Fri, May 27, 2022 at 4:41 PM Sudip Mukherjee
 wrote:
>
> I just tested with various values, sizeof(*edid) is 144 bytes at that place.

Hmm. What compiler do you have? Because it seems very broken.

You don't actually have to try with various sizes, you could have just
done something like

 int size_of_edid(const struct edid *edid)
 {
return sizeof(*edid);
 }

and then "make drivers/gpu/drm/drm_edid.s" to generate assembly and
see what it looks like (obviously removing the BUG_ON() in order to
build).

That obviously generates code like

movl$128, %eax
ret

for me, and looking at the definition of that type I really can't see
how it would ever generate anything else. But it's apparently not even
close for you.

I suspect some of the structs inside of that 'struct edid' end up
getting aligned, despite the '__attribute__((packed))'. For example,
'struct est_timings' is supposed to be just 3 bytes, and it's at an
odd offset too (byte offset 35 in the 'struct edid' if I did the math
correctly).

But it obviously doesn't happen for me or for most other people, so
it's something in your setup. Unusual compiler?

  Linus


Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")

2022-05-28 Thread Linus Torvalds
On Sat, May 28, 2022 at 5:13 AM Sudip Mukherjee
 wrote:
>
> just tried this with
> make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- drivers/gpu/drm/drm_edid.s
>
> size_of_edid:
> mov r0, #144@,
> ldmfd   sp, {fp, sp, pc}@

So digging a bit deeper - since I have am arm compiler after all - I
note that 'sizeof(detailed_timings)' is 88.

Which is completely wrong. It should be 72 bytes (an array of 4
structures, each 18 bytes in size).

I have not dug deeper, but that is clearly the issue.

Now, why that only happens on that spear3xx_defconfig, I have no idea.

 Linus


Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")

2022-05-28 Thread Linus Torvalds
On Sat, May 28, 2022 at 10:40 AM Linus Torvalds
 wrote:
>
> So digging a bit deeper - since I have am arm compiler after all - I
> note that 'sizeof(detailed_timings)' is 88.

Hmm.

sizeof() both

detailed_timings[0].data.other_data.data.range.formula.gtf2

and

detailed_timings[0].data.other_data.data.range.formula.cvt

is 7.

But the *union* of those things is

detailed_timings[0].data.other_data.data.range.formula

and its size is 8 (despite having an alignment of just 1).

The attached patch would seem to fix it for me.

Not very much tested, and I have no idea what it is that triggers this
only on spear3xx_defconfig.

Some ARM ABI issue that is triggered by some very particular ARM
compiler flag enabled by that config?

Adding some ARM (and SPEAR, and SoC) people in case they have any idea.

This smells like a compiler bug triggered by "there's a 16-bit member
field in that gtf2 structure, and despite it being packed and aligned
to 1, we somehow still align the size to 2".

I dunno. But marking those unions packed too doesn't seem wrong, and
does seem to fix it.

  Linus
 include/drm/drm_edid.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index c3204a58fb09..b2756753370b 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -121,7 +121,7 @@ struct detailed_data_monitor_range {
 			u8 supported_scalings;
 			u8 preferred_refresh;
 		} __attribute__((packed)) cvt;
-	} formula;
+	} __attribute__((packed)) formula;
 } __attribute__((packed));
 
 struct detailed_data_wpindex {
@@ -154,7 +154,7 @@ struct detailed_non_pixel {
 		struct detailed_data_wpindex color;
 		struct std_timing timings[6];
 		struct cvt_timing cvt[4];
-	} data;
+	} __attribute__((packed)) data;
 } __attribute__((packed));
 
 #define EDID_DETAIL_EST_TIMINGS 0xf7
@@ -172,7 +172,7 @@ struct detailed_timing {
 	union {
 		struct detailed_pixel_timing pixel_data;
 		struct detailed_non_pixel other_data;
-	} data;
+	} __attribute__((packed)) data;
 } __attribute__((packed));
 
 #define DRM_EDID_INPUT_SERRATION_VSYNC (1 << 0)


Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")

2022-05-28 Thread Linus Torvalds
On Sat, May 28, 2022 at 11:59 AM Arnd Bergmann  wrote:
>
> It's CONFIG_ARM_AEABI, which is normally set everywhere. Without this
> option, you the kernel is built for the old 'OABI' that forces all non-packed
> struct members to be at least 16-bit aligned.

Looks like forced word (32 bit) alignment to me.

I wonder how many other structures that messes up, but I committed the
EDID fix for now.

This has presumably been broken for a long time, but maybe the
affected targets don't typically use EDID and kernel modesetting, and
only use some fixed display setup instead.

Those structure definitions go back a _loong_ time (from a quick 'git
blame' I see November 2008).

But despite that, I did not mark my fix 'cc:stable' because I don't
know if any of those machines affected by this bad arm ABI issue could
possibly care.

At least my tree hopefully now builds on them, with the BUILD_BUG_ON()
that uncovered this.

   Linus


Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")

2022-05-31 Thread Linus Torvalds
On Tue, May 31, 2022 at 1:04 AM Arnd Bergmann  wrote:
>
> As an experiment: what kind of results would we get when looking
> for packed structures and unions that contain any of these:

Yeah, any atomics or locks should always be aligned, and won't even
work (or might be *very* slow) on multiple architectures. Even x86 -
which does very well on unaligned data - reacts badly to sufficiently
unaligned atomics (ie cacheline crossing).

I don't think we have that. Not only because it would already cause
breakage, but simply because the kinds of structures that people pack
aren't generally the kind that contain these kinds of things.

That said, you might have a struct that is packed, but that
intentionally aligns parts of itself, so it *could* be valid.

But it would probably not be a bad idea to check that packed
structures/unions don't have atomic types or locks in them. I _think_
we're all good, but who knows..

Linus


Re: mainline build failure due to f1e4c916f97f ("drm/edid: add EDID block count and size helpers")

2022-06-01 Thread Linus Torvalds
On Wed, Jun 1, 2022 at 3:28 PM Keisuke Nishimura
 wrote:
>
>
> I found 13 definitions of packed structure that contains:
>  > - spinlock_t
>  > - atomic_t
>  > - dma_addr_t
>  > - phys_addr_t
>  > - size_t
>  > - struct mutex
>  > - struct device
>
> - raw_spinlock_t

Ok, so I don't think dma_addr_t/phys_addr_t/size_t are problematic,
they are just regular integers.

And 'struct device' is problematic only as it then contains any of the
atomic types (which it does)

> security/tomoyo/common.h: atomic_t in tomoyo_shared_acl_head
> drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h: spinlock_t in 
> key_map
> arch/s390/include/asm/kvm_host.h: atomic_t in kvm_s390_sie_block

So these do look problematic.

I'm not actually clear on why tomoyo_shared_acl_head would be packed.
That makes no sense to me.

Same goes for key_map, it's not clear what the reason for that
__packed is, and it's clearly bogus. It might work, almost by mistake,
but it's wrong to try to pack that spinlock_t.

The s390 kvm use actually looks fine: the structure is packed, but
it's also aligned, and the spin-lock is at the beginning, so the
"packing" part is about the other members, not the first one.

The two that look wrong look like they will probably work anyway
(they'll presumably be effectively word-aligned, and that's sufficient
for spinlocks in practice).

But let's cc the tomoyo and chelsio people.

 Linus


Annoying AMDGPU boot-time warning due to simplefb / amdgpu resource clash

2022-06-26 Thread Linus Torvalds
So this has been going on for a while, and it's quite annoying.

At bootup, my main desktop (Threadripper 3970X with radeon graphics)
now complains about

  resource sanity check: requesting [mem 0xd000-0xdfff], which
spans more than BOOTFB [mem 0xd000-0xd02f]

and then gives me a nasty callchain that is basically the amdgpu probe
sequence ending in amdgpu_bo_init() doing the
arch_io_reserve_memtype_wc() which is then what complains.

That "BOOTFB" resource is from sysfb_simplefb.c, and I think what
started triggering this is commit c96898342c38 ("drivers/firmware:
Don't mark as busy the simple-framebuffer IO resource").

Because it turns out that that removed the IORESOURCE_BUSY, which in
turn is what makes the resource conflict code complain about it now,
because

/*
 * if a resource is "BUSY", it's not a hardware resource
 * but a driver mapping of such a resource; we don't want
 * to warn for those; some drivers legitimately map only
 * partial hardware resources. (example: vesafb)
 */

so the issue is that now the resource code - correctly - says "hey,
you have *two* conflicting driver mappings".

And that commit claims it did it because "which can lead to drivers
requesting the same memory resource to fail", but - once again - the
link in the commit that might actually tell more is just one of those
useless patch submission links again.

So who knows why that commit was actually done, but it's causing annoyance.

If simplefb is actually still using that frame buffer, it's a problem.
If it isn't, then maybe that resource should have been released?

I really think that commit c96898342c38 is buggy. It talks about "let
drivers to request it as busy instead", but then it registers a
resource that isn't actually a proper real resource. It's just a
random incomplete chunk of the actual real thing, so it will still
interfere with resource allocation, and in fact now interferes even
with that "set memtype" because of this valid warning.

 Linus


Re: Annoying AMDGPU boot-time warning due to simplefb / amdgpu resource clash

2022-06-27 Thread Linus Torvalds
On Mon, Jun 27, 2022 at 1:02 AM Javier Martinez Canillas
 wrote:
>
> The flag was dropped because it was causing drivers that requested their
> memory resource with pci_request_region() to fail with -EBUSY (e.g: the
> vmwgfx driver):
>
> https://www.spinics.net/lists/dri-devel/msg329672.html

See, *that* link would have been useful in the commit.

Rather than the useless link it has.

Anyway, removing the busy bit just made things worse.

> > If simplefb is actually still using that frame buffer, it's a problem.
> > If it isn't, then maybe that resource should have been released?
>
> It's supposed to be released once amdgpu asks for conflicting framebuffers
> to be removed calling drm_aperture_remove_conflicting_pci_framebuffers().

That most definitely doesn't happen. This is on a running system:

  [torvalds@ryzen linux]$ cat /proc/iomem | grep BOOTFB
- : BOOTFB

so I suspect that the BUSY bit was never the problem - even for
vmwgfx). The problem was that simplefb doesn't remove its resource.

Guys, the *reason* for resource management is to catch people that
trample over each other's resources.

You literally basically disabled the code that checked for it by
removing the BUSY flag, and just continued to have conflicting
resources.

That isn't a "fix", that is literally "we are ignoring and breaking
the whole reason that the resource tree exists, but we'll still use it
for no good reason".

Yeah, yeah, most modern drivers ignore the IO resource tree, because
they end up working on another resource level entirely: they work on
not the IO resources, but on the "driver level" instead, and just
attach to PCI devices.

So these days, few enough drivers even care about the IO resource
tree, and it's mostly used for (a) legacy devices (think ISA) and (b)
the actual bus resource handling (so the PCI code itself uses it to
sort out resource use and avoid conflicts, but PCI drivers themselves
generally then don't care, because the bus has "taken care of it".

So that's why the amdgpu driver itself doesn't care about resource
allocations, and we only get a warning for that memory type case, not
for any deeper resource case.

And apparently the vmwgfx driver still uses that legacy "let's claim
all PCI resources in the resource tree" instead of just claiming the
device itself. Which is why it hit this whole BOOTFB resource thing
even harder.

But the real bug is that BOOTFB seems to claim this resource even
after it is done with it and other drivers want to take over.

Not the BUSY bit.

 Linus


Re: [git pull] drm fixes for 5.18-rc2

2022-04-07 Thread Linus Torvalds
On Thu, Apr 7, 2022 at 2:20 PM Dave Airlie  wrote:
>
> I think this should fix the amdgpu splat you have been seeing since rc1.

Not the machine I'm currently traveling with, but I'll double-check
when I'm back home.

Thanks,
Linus


Re: [git pull] drm fixes for 5.18-rc3

2022-04-14 Thread Linus Torvalds
On Thu, Apr 14, 2022 at 2:33 PM Dave Airlie  wrote:
>
>   git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2022-04-15

.. and since I'm  now back home, I can confirm that my boot-time splat
I saw before is all gone.

It presumably went away with the previous pull request already, but I
hadn't remembered to check the issue until this pull reminded me about
the issue.

Thanks,
Linus


Re: [PATCH v13 5/9] drm/i915: Check for integer truncation on scatterlist creation

2022-09-28 Thread Linus Torvalds
On Wed, Sep 28, 2022 at 1:15 AM Gwan-gyeong Mun
 wrote:
>
> +   if (check_assign(obj->base.size >> PAGE_SHIFT, &npages))
> +   return -E2BIG;

I have to say, I find that new "check_assign()" macro use to be disgusting.

It's one thing to check for overflows.

It's another thing entirely to just assign something to a local variable.

This disgusting "let's check and assign" needs to die. It makes the
code a completely unreadable mess. The "user" wersion is even worse.

If you worry about overflow, then use a mix of

 (a) use a sufficiently large type to begin with

 (b) check for value range separately

and in this particular case, I also suspect that the whole range check
should have been somewhere else entirely - at the original creation of
that "obj" structure, not at one random end-point where it is used.

In other words, THIS WHOLE PATCH is just end-points checking the size
requirements of that "base.size" thing much too late, when it should
have been checked originally for some "maximum acceptable base size"
instead.

And that "maximum acceptable base size" should *not* be about "this is
the size of the variables we use". It should be a sanity check of
"this value is sane and fits in sane use cases".

Because "let's plug security checks" is most definitely not about
picking random assignments and saying "let's check this one". It's
about trying to catch things earlier than that.

Kees, you need to reign in the craziness in overflow.h.

 Linus


Re: [git pull] drm fixes for 5.19-rc7

2022-07-16 Thread Linus Torvalds
On Fri, Jul 15, 2022 at 2:09 PM Nathan Chancellor  wrote:
>
> On Fri, Jul 15, 2022 at 01:36:17PM +1000, Dave Airlie wrote:
> > Matthew Auld (1):
> >   drm/i915/ttm: fix sg_table construction
>
> This patch breaks i386_defconfig with both GCC and clang:
>
>   ld: drivers/gpu/drm/i915/i915_scatterlist.o: in function 
> `i915_rsgt_from_mm_node':
>   i915_scatterlist.c:(.text+0x1a7): undefined reference to `__udivdi3'

Yeah, we definitely don't want arbitrary 64x64 divides in the kernel,
and the fact that we don't include libgcc.a once again caught this
horrid issue.

The offending code is

if (sg_alloc_table(st, DIV_ROUND_UP(node->size, segment_pages),
   GFP_KERNEL)) {

and I have to say that all of those games with "u64 page_alignment"
that commit aff1e0b09b54 ("drm/i915/ttm: fix sg_table construction")
introduces are absolutely disgusting.

And they are just *pointlessly* disgusting.

Why is that "page_alignment" a "u64"?

And why is it a "size", instead of being a "number of bits"?

The code literally does things like

const u64 max_segment = round_down(UINT_MAX, page_alignment);

which means that

 (a) page_alignment must be a power-of-two for this to work
(round_down() only works in powers of two)

 (b) the result obviously must fit in an "unsigned int", since it's
rounding down a UINT_MAX!

So (a) makes it doubtful that "page_alignment" should have been a
value (as opposed to mask), and (b) then questions why was that made
an "u64" value when it cannot have a u64 range?

And if max_segments cannot have a 64-bit range, then segment_pages here:

u64 segment_pages = max_segment >> PAGE_SHIFT;

sure cannot.

Fixing those then uncovers other things:

len = min(block_size, max_segment - sg->length);

now complains about mixing types ("max_segment - sg->length" being
u32), because 'block_size' is 64, bit, and that does seem to make some
amount of sense:

block_size = node->size << PAGE_SHIFT;

with the 'node->size' being from drm_mm_node, and that size is a
'u64'. That I *could* see being more than 32 bits on a 64-bit
architecture. Ok.

But then that means that 'len' cannot be a 64-bit value either, and it
should probably have been

u32 len;
..
len = min_t(u64, block_size, max_segment - sg->length);

and that would just have been a lot nicer on 32-bit x86, avoiding a
lot of pointlessly 64-bit things.

That said, even those type simplifications do not fix the fundamental
issue. That "DIV_ROUND_UP()" still ends up being a 64-bit divide,
although now it's at least a "64-by-32" bit divide.

Which needs to be handled by "do_div()" rather than the generic
DIV_ROUND_UP() helper, because sadly, at least gcc still generates a
full __udivdi3() even for the 64-by-32 divides.

Can Intel GPU people please take a look?

 Linus


Re: [git pull] drm fixes for 5.19-rc7

2022-07-16 Thread Linus Torvalds
On Sat, Jul 16, 2022 at 2:35 PM Linus Torvalds
 wrote:
>
> That said, even those type simplifications do not fix the fundamental
> issue. That "DIV_ROUND_UP()" still ends up being a 64-bit divide,
> although now it's at least a "64-by-32" bit divide.

Hmm. The "DIV_ROUND_UP()" issue could be solved by just making the
rule be that the max_segment size is always a power of two.

Then you don't need the (expensive!) DIV_ROUND_UP(), and can just use
the regular "round_up()" that works on powers-of-two.

And the simplest way to do that is to just make "max_segments" be 2GB.

The whole "round_down(UINT_MAX, page_alignment)" seems entirely
pointless. Do you really want segments that are some odd number just
under the 4GB mark, and force expensive divides?

For consistency, I used the same value in
i915_rsgt_from_buddy_resource(). I have no idea if that makes sense.

Anyway, the attached patch is COMPLETELY UNTESTED. But it at least
seems to compile. Maybe.

  Linus
 drivers/gpu/drm/i915/i915_scatterlist.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scatterlist.c b/drivers/gpu/drm/i915/i915_scatterlist.c
index f63b50b71e10..830dcd833d4e 100644
--- a/drivers/gpu/drm/i915/i915_scatterlist.c
+++ b/drivers/gpu/drm/i915/i915_scatterlist.c
@@ -81,7 +81,9 @@ struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
 	  u64 region_start,
 	  u64 page_alignment)
 {
-	const u64 max_segment = round_down(UINT_MAX, page_alignment);
+	// Make sure max_segment (and thus segment_pages) is
+	// a power of two that fits in 32 bits.
+	const u64 max_segment = 1ul << 31;
 	u64 segment_pages = max_segment >> PAGE_SHIFT;
 	u64 block_size, offset, prev_end;
 	struct i915_refct_sgt *rsgt;
@@ -96,7 +98,7 @@ struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
 
 	i915_refct_sgt_init(rsgt, node->size << PAGE_SHIFT);
 	st = &rsgt->table;
-	if (sg_alloc_table(st, DIV_ROUND_UP(node->size, segment_pages),
+	if (sg_alloc_table(st, round_up(node->size, segment_pages),
 			   GFP_KERNEL)) {
 		i915_refct_sgt_put(rsgt);
 		return ERR_PTR(-ENOMEM);
@@ -159,7 +161,7 @@ struct i915_refct_sgt *i915_rsgt_from_buddy_resource(struct ttm_resource *res,
 {
 	struct i915_ttm_buddy_resource *bman_res = to_ttm_buddy_resource(res);
 	const u64 size = res->num_pages << PAGE_SHIFT;
-	const u64 max_segment = round_down(UINT_MAX, page_alignment);
+	const u64 max_segment = 1u << 31;
 	struct drm_buddy *mm = bman_res->mm;
 	struct list_head *blocks = &bman_res->blocks;
 	struct drm_buddy_block *block;


Re: [GIT PULL] Please pull hmm changes

2019-12-18 Thread Linus Torvalds
On Wed, Dec 18, 2019 at 6:59 AM Jason Gunthorpe  wrote:
>
> Do you think calling it 'mmn_subscriptions' is clear?

Why do you want that "mmn"?

Guys, the "mmn" part is clear from the _context_. The function name is

When the function name is something like "mmu_interval_read_begin()",
and the filename is "mm/mmu_notifier.c", you do NOT NEED silly
prefixes like "mmn" for local variables.

They add NOTHING.

And they make the code an illegible mess.

Yes, global function names need to be unique, and if they aren't
really core, they want some prefix that explains the context, because
global functions are called from _outside_ the context that explains
them.

But if it's a "struct mmu_interval_notifier" pointer, and it's inside
a file that is all about these pointers, it shouldn't be called
"mmn_xyz".  That's not a name. That's line noise.

So call it a "notifier". Maybe even an "interval_notifier" if you
don't mind the typing. Name it by something _descriptive_. And if you
want.

And "subscriptions" is a lovely name. What does the "mmn" buy you?

Just to clarify: the names I really hated were the local variable
names (and the argument names) that were all entirely within the
context of mm/mmu_notifier.c. Calling something "mmn_mm" is a random
jumble of letters that looks more like you're humming than you're
speaking.

Don't mumble. Speak _clearly_.

The other side of "short names" is that some non-local conventions
exist because they are _so_ global. So if it's just a mm pointer, call
it "mm". We do have some very core concepts in the kernel that
permeate _everything_, and those core things we tend to have very
short names for. So whenever you're working with VM code, you'll see
lots of small names like "mm", "vma", "pte" etc. They aren't exactly
clear, but they are _globally_ something you read and learn when you
work on the Linux VM code.

That's very diofferent from "mmn" - the "mmn" thing isn't some global
shorthand, it is just a local abomination.

So "notifier_mm" makes sense - it's the mm for a notifier. But
"mmn_notifier" does not, because "mmn" only makes sense in a local
context, and in that local context it's not any new information at
all.

See the difference? Two shorthands, but one makes sense and adds
information, while the other is just unnecessary and pointless and
doesn't add anything at all.

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [GIT PULL] Please pull hmm changes

2019-12-18 Thread Linus Torvalds
On Wed, Dec 18, 2019 at 10:37 AM Jason Gunthorpe  wrote:
>
> I think this is what you are looking for?

I think that with these names, I would have had an easier time reading
the original patch that made me go "Eww", yes.

Of course, now that it's just a rename patch, it's not like the patch
is all that legible, but yeah, I think the naming is saner.

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v18 00/19] kunit: introduce KUnit, the Linux kernel unit testing framework

2019-10-04 Thread Linus Torvalds
On Fri, Oct 4, 2019 at 2:39 PM Theodore Y. Ts'o  wrote:
>
> This question is primarily directed at Shuah and Linus
>
> What's the current status of the kunit series now that Brendan has
> moved it out of the top-level kunit directory as Linus has requested?

We seemed to decide to just wait for 5.5, but there is nothing that
looks to block that. And I encouraged Shuah to find more kunit cases
for when it _does_ get merged.

So if the kunit branch is stable, and people want to start using it
for their unit tests, then I think that would be a good idea, and then
during the 5.5 merge window we'll not just get the infrastructure,
we'll get a few more users too and not just examples.

 Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v18 00/19] kunit: introduce KUnit, the Linux kernel unit testing framework

2019-10-06 Thread Linus Torvalds
On Sun, Oct 6, 2019 at 9:55 AM Theodore Y. Ts'o  wrote:
>
> Well, one thing we *can* do is if (a) if we can create a kselftest
> branch which we know is stable and won't change, and (b) we can get
> assurances that Linus *will* accept that branch during the next merge
> window, those subsystems which want to use kself test can simply pull
> it into their tree.

Yes.

At the same time, I don't think it needs to be even that fancy. Even
if it's not a stable branch that gets shared between different
developers, it would be good to just have people do a "let's try this"
throw-away branch to use the kunit functionality and verify that
"yeah, this is fairly convenient for ext4".

It doesn't have to be merged in that form, but just confirmation that
the infrastructure is helpful before it gets merged would be good.

   Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [v4] vgacon: Fix a UAF in vgacon_invert_region

2020-03-06 Thread Linus Torvalds
On Fri, Mar 6, 2020 at 4:38 AM Daniel Vetter  wrote:
>
> Linus, since this missed the -fixes pull from Dave maybe double check I'm
> not grossly wrong here and apply directly?

Hmm. I don't have the original email, mind just sending it to me (with
the proper added sign-off chain)?

It does strike me that there's nothing that seems to check for
overflow in the "(width << 1) * height" calculation. Hmm?

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [v4] vgacon: Fix a UAF in vgacon_invert_region

2020-03-06 Thread Linus Torvalds
On Fri, Mar 6, 2020 at 7:12 AM Daniel Vetter  wrote:
>
> I'll stuff it into a pull and throw that your way, that's simplest.

Thanks.

> btw we did add dri-devel to lore a while back, so should be there:

Indeed. I tried (incompetently) to look up your message ID, but I
didn't put the dri-devel part and saw the 404, and assumed it wasn't
there.

My bad.

> > It does strike me that there's nothing that seems to check for
> > overflow in the "(width << 1) * height" calculation. Hmm?
>
> Indeed I failed to hunt for that :-/ But I think we're good, in
> vc_do_resize() we have
>
> if (cols > VC_RESIZE_MAXCOL || lines > VC_RESIZE_MAXROW)
> return -EINVAL;

Perfect. I just looked at the quoted patch itself.

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Bug 206175] Fedora >= 5.4 kernels instantly freeze on boot without producing any display output

2020-01-14 Thread Linus Torvalds
Dave, Alex,
 there's an odd bugreport on bugzilla, where Artem is seeing an odd
early-boot failure.

That one almost certainly has nothing to do with you guys, but see the
later odd (and apparently unrelated) report about some AMD graphics
firmware issue and a black screen.

 Linus

On Tue, Jan 14, 2020 at 1:17 PM  wrote:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=206175
>
> --- Comment #9 from Matt Yates (m...@fast-mail.org) ---
> My BIOS vendor is "Insyde Corp.".  There is a TPM module.  When I disabled it,
> it caused my EFI boot entry to disappear, so I couldn't test it.
>
> However, I think we may have two separate problems.  I switched back from
> Fedora to Debian Testing, and the Debian installer upgraded the kernel from 
> 5.3
> to 5.4 series prior to the first boot.  The 5.4 kernel booted up on first 
> boot.
>  I could see boot messages scrolling, but the screen went to a black while
> trying to load lightdm because I did not have the "firmware-amd-graphics"
> package installed required for graphics.  After installing the amd graphics
> package, the 5.4 kernel freezes as before (right at the start of the boot
> process).  The 5.3 kernel boots as normal, and graphics work.
>
> The "firmware-amd-graphics" package (version 20190717-2) was the only thing I
> changed, so I guess the problem must be some sort of conflict with the amd
> graphics firmware and the 5.4 kernel.
>
> --
> You are receiving this mail because:
> You are on the CC list for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v3 2/7] uaccess: Tell user_access_begin() if it's for a write or not

2020-01-23 Thread Linus Torvalds
On Thu, Jan 23, 2020 at 4:59 AM Christophe Leroy
 wrote:
>
> On 32 bits powerPC (book3s/32), only write accesses to user are
> protected and there is no point spending time on unlocking for reads.

Honestly, I'm starting to think that 32-bit ppc just needs to look
more like everybody else, than make these changes.

We used to have a read/write argument to the old "verify_area()" and
"access_ok()" model, and it was a mistake. It was due to odd i386 user
access issues. We got rid of it. I'm not convinced this is any better
- it looks very similar and for odd ppc access issues.

But if we really do want to do this, then:

> Add an argument to user_access_begin() to tell when it's for write and
> return an opaque key that will be used by user_access_end() to know
> what was done by user_access_begin().

You should make it more opaque than "unsigned long".

Also, it shouldn't be a "is this a write". What if it's a read _and_ a
write? Only a write? Only a read?

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v3 2/7] uaccess: Tell user_access_begin() if it's for a write or not

2020-01-23 Thread Linus Torvalds
On Thu, Jan 23, 2020 at 11:47 AM christophe leroy
 wrote:
>
> I'm going to leave it aside, at least for the time being, and do it as a
> second step later after evaluating the real performance impact. I'll
> respin tomorrow in that way.

Ok, good.

>From a "narrow the access window type" standpoint it does seem to be a
good idea to specify what kind of user accesses will be done, so I
don't hate the idea, it's more that I'm not convinced it matters
enough.

On x86, we have made the rule that user_access_begin/end() can contain
_very_ few operations, and objtool really does enforce that. With
objtool and KASAN, you really end up with very small ranges of
user_access_begin/end().

And since we actually verify it statically on x86-64, I would say that
the added benefit of narrowing by access type is fairly small. We're
not going to have complicated code in that user access region, at
least in generic code.

> > Also, it shouldn't be a "is this a write". What if it's a read _and_ a
> > write? Only a write? Only a read?
>
> Indeed that was more: does it includes a write. It's either RO or RW

I would expect that most actual users would be RO or WO, so it's a bit
odd to have those choices.

Of course, often writing ends up requiring read permissions anyway if
the architecture has problems with alignment handling or similar, but
still... The real RW case does exist conceptually (we have
"copy_in_user()", after all), but still feels like it shouldn't be
seen as the only _interface_ choice.

IOW, an architecture may decide to turn WO into RW because of
architecture limitations (or, like x86 and arm, ignore the whole
RO/RW/WO _entirely_ because there's just a single "allow user space
accesses" flag), but on an interface layer if we add this flag, I
really think it should be an explicit "read or write or both".

So thus my "let's try to avoid doing it in the first place, but if we
_do_ do this, then do it right" plea.

 Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm for 5.6-rc1

2020-01-30 Thread Linus Torvalds
On Wed, Jan 29, 2020 at 9:58 PM Dave Airlie  wrote:
>
> It has two known conflicts, one in i915_gem_gtt, where you should juat
> take what's in the pull (it looks messier than it is),

That doesn't seem right. If I do that, I lose the added GEM_BUG_ON()'s.

I think the proper merge resolution does this:

diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index f10b2c41571c..f4fec7eb4064 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -131,6 +131,7 @@ static void gen6_ppgtt_insert_entries(struct
i915_address_space *vm,

vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
do {
+   GEM_BUG_ON(iter.sg->length < I915_GTT_PAGE_SIZE);
vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);

iter.dma += I915_GTT_PAGE_SIZE;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 077b8f7cf6cb..4d1de2d97d5c 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -379,6 +379,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
do {
+   GEM_BUG_ON(iter->sg->length < I915_GTT_PAGE_SIZE);
vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;

iter->dma += I915_GTT_PAGE_SIZE;
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 79096722ce16..531d501be01f 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -787,7 +787,7 @@ static int ggtt_probe_common(struct i915_ggtt
*ggtt, u64 size)
 * readback check when writing GTT PTE entries.
 */
if (IS_GEN9_LP(i915) || INTEL_GEN(i915) >= 10)
-   ggtt->gsm = ioremap_nocache(phys_addr, size);
+   ggtt->gsm = ioremap(phys_addr, size);
else
ggtt->gsm = ioremap_wc(phys_addr, size);
if (!ggtt->gsm) {

since those ppgtt_insert_entries functions had moved to their
gen-specific files.

No?

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [git pull] drm for 5.6-rc1

2020-01-30 Thread Linus Torvalds
On Thu, Jan 30, 2020 at 8:13 AM Linus Torvalds
 wrote:
>
> That doesn't seem right. If I do that, I lose the added GEM_BUG_ON()'s.

Just for your ref: see commit ecc4d2a52df6 ("drm/i915/userptr: fix
size calculation") for the source of those debug statements, and then
2c86e55d2ab5 ("drm/i915/gtt: split up i915_gem_gtt") on the other side
that just moved the functions..

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: drm fixes for 5.3-rc9

2019-09-14 Thread Linus Torvalds
On Thu, Sep 12, 2019 at 8:56 AM Dave Airlie  wrote:
>
> Hey Linus,
>
> From the maintainer summit, just some last minute fixes for final,
> details in the tag.

So because my mailbox was more unruly than normal (because of same
maintainer summit travel), I almost missed this email entirely.

Why? Because you don't have the normal "git pull" anywhere in the
email, so it doesn't trigger my search for important emails.

There's a "git" in the email body, but there's not a "pull" anywhere.
Could you add either a "please pull" or something to the email body -
or to make things _really_ obvious, add the "[GIT PULL]" prefix to the
subject line? Or anything, really, to whatever script or workflow you
use to generate these?

 Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: drm fixes for 5.3-rc9

2019-09-15 Thread Linus Torvalds
On Sun, Sep 15, 2019 at 8:12 AM Dave Airlie  wrote:
>
> I've been manually writing the subject lines, seems I need to fix my brain.

Note that my "find git pull requests" logic doesn't need it in the
subject line at all, so if you just change whatever script you use to
generate the email body to have an additional "pull" in there
somewhere, that's perfectly workable too.

> The reason I do that is I generate on one machine the body, and send
> it via the gmail webui on whatever machine I'm using. This helps
> avoids google tagging my emails as spam for generating them using
> someone elses smtp servers. I should probably setup properly sending
> gmail to avoid that.

Don't worry too much about it too much.

I _do_ try to always read all my email, it's just that I find things
faster that match that pattern.

And I don't think I've actually lost one of your pull requests (knock
wood), they might just end up delayed a bit.

That said - having "[GIT PULL]" in the subject line is how the
pr-tracker-bot finds the emails too, so if you do the subject line
thing, you'll not only trigger my search term, you'll also get the
nice notifications from the bot when I've pushed out my merge.

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [git pull] drm tree for 5.4-rc1

2019-09-19 Thread Linus Torvalds
On Thu, Sep 19, 2019 at 12:09 PM Dave Airlie  wrote:
>
> There are a few merge conflicts across the board, we have a shared
> rerere cache which meant I hadn't noticed them until I avoided the
> cache.
> https://cgit.freedesktop.org/drm/drm/log/?h=drm-5.4-merge
> contains what we've done, none of them are too crazy.

Hmm. My merge isn't identical to that. It's close though. Different
order for one #define which might be just from you and me merging
different directions.

But I also ended up removing the .gem_prime_export initialization to
drm_gem_prime_export, because it's the default if none exists. That's
the left-over from

3baeeb21983a ("drm/mtk: Drop drm_gem_prime_export/import")

after the import stayed around because it got turned into an actually
non-default one.

I think that both of our merges are right - equivalent but just
slightly different.

But the reason I'm pointing this out is that I also get the feeling
that if it needs that dev->dev_private difference from the default
function in prime_import(), wouldn't it need the same for prime_export
too?

I don't know the code, and I don't know the hardware, but just from a
"pattern matching" angle I just wanted to check whether maybe there's
need for a mtk_drm_gem_prime_export() wrapper that does that same
thing with

struct mtk_drm_private *private = dev->dev_private;

.. use private->dev  instead of dev->dev ..

So I'm just asking that somebody that knows that drm/mtk code should
double-check that oddity.

Thanks,
  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: Ack to merge through DRM? WAS Re: [PATCH v2 1/5] mm: Add write-protect and clean utilities for address space ranges

2019-09-26 Thread Linus Torvalds
On Thu, Sep 26, 2019 at 5:03 AM Thomas Hellström (VMware)
 wrote:
>
> I wonder if I can get an ack from an mm maintainer to merge this through
> DRM along with the vmwgfx patches? Andrew? Matthew?

It would have helped to actually point to the patch itself, instead of
just quoting the commit message.

Looks like this:

 https://lore.kernel.org/lkml/20190926115548.44000-2-thomas...@shipmail.org/

but why is the code in question not just using the regular page
walkers. The commit log shows no explanation of what's so special
about this?

Is the only reason the locking magic? Because if that's the reason,
then afaik we already have a function for that: it's
__walk_page_range().

Yes, it's static right now, but that's imho not a reason to duplicate
all the walking (badly).

Is there some other magic reason that isn't documented?

  Linus


Re: Ack to merge through DRM? WAS Re: [PATCH v2 1/5] mm: Add write-protect and clean utilities for address space ranges

2019-09-26 Thread Linus Torvalds
On Thu, Sep 26, 2019 at 1:09 PM Thomas Hellström (VMware)
 wrote:
>
> That said, if people are OK with me modifying the assert in
> pud_trans_huge_lock() and make __walk_page_range non-static, it should
> probably be possible to make it work, yes.

I don't think you need to modify that assert at all.

That thing only exists when there's a "pud_entry" op in the walker,
and then you absolutely need to have that mmap_lock.

As far as I can tell, you fundamentally only ever work on a pte level
in your address space walker already and actually have a WARN_ON() on
the pud_huge thing, so no pud entry can possibly apply.

So no, the assert in pud_trans_huge_lock() does not seem to be a
reason not to just use the existing page table walkers.

And once you get rid of the walking, what is left? Just the "iterate
over the inode mappings" part. Which could just be done in
mm/pagewalk.c, and then you don't even need to remove the static.

So making it be just another walking in pagewalk.c would seem to be
the simplest model.

Call it "walk_page_mapping()". And talk extensively about how the
locking differs a lot from the usual "walk_page_vma()" things.

The then actual "apply" functions (what a horrid name) could be in the
users. They shouldn't be mixed in with the walking functions anyway.
They are callbacks, not walkers.

 Linus


Re: Ack to merge through DRM? WAS Re: [PATCH v2 1/5] mm: Add write-protect and clean utilities for address space ranges

2019-09-26 Thread Linus Torvalds
On Thu, Sep 26, 2019 at 1:55 PM Thomas Hellström (VMware)
 wrote:
>
> Well, we're working on supporting huge puds and pmds in the graphics
> VMAs, although in the write-notify cases we're looking at here, we would
> probably want to split them down to PTE level.

Well, that's what the existing walker code does if you don't have that
"pud_entry()" callback.

That said, I assume you would *not* want to do that if the huge
pud/pmd is already clean and read-only, but just continue.

So you may want to have a special pud_entry() that handles that case.
Eventually. Maybe. Although honestly, if you're doing dirty tracking,
I doubt it makes much sense to use largepages.

> Looking at zap_pud_range() which when called from unmap_mapping_pages()
> uses identical locking (no mmap_sem), it seems we should be able to get
> away with i_mmap_lock(), making sure the whole page table doesn't
> disappear under us. So it's not clear to me why the mmap_sem is strictly
> needed here. Better to sort those restrictions out now rather than when
> huge entries start appearing.

zap_pud_range()actually does have that

   VM_BUG_ON_VMA(!rwsem_is_locked(&tlb->mm->mmap_sem), vma);

exactly for the case where it might have to split the pud entry.

Zapping the whole thing it does do without the assert.

I'm not going to swear the mmap_sem is absolutely required, since a
shared vma should be stable due to the i_mmap_lock, but splitting the
hugepage really is a fairly big deal.

It can't happen if you zap the *whole* mapping, but it can happen if
you have a start/end range. Like you do.

Also, in general it's probably not a great idea to look at
zap_page_range() (and copy_page_range()) for ideas.

They are kind of special, since they tend to be used for fundamental
whole-address-space operations (ie fork/exit) and so as a result they
get to do special things that a normal page walker generally shouldn't
do.

It's why they've never gotten translated to use the generic walker code.

   Linus


Re: Ack to merge through DRM? WAS Re: [PATCH v2 1/5] mm: Add write-protect and clean utilities for address space ranges

2019-09-27 Thread Linus Torvalds
On Fri, Sep 27, 2019 at 5:17 AM Kirill A. Shutemov  wrote:
>
> > Call it "walk_page_mapping()". And talk extensively about how the
> > locking differs a lot from the usual "walk_page_vma()" things.
>
> Walking mappings of a page is what rmap does. This code thas to be
> integrated there.

Well, that's very questionable.

The rmap code mainly does the "page -> virtual" mapping.  One page at a time.

The page walker code does the "virtual -> pte" mapping. Always a whole
range at a time.

The new code wants a combination of both.

It very much is about walking ranges - as in mm/pagewalk.c. It's just
that it walks potentially multiple ranges, based on where the address
space is mapped.

I think it has way more commonalities with the page walking code than
it has with the rmap code. But yes, there is some of that "look up
mappings based on address space" in there too, but it's the least part
of it

And as Thomas pointed out, it also has commonalities with
unmap_mapping_pages() in mm/memory.c. In many ways that part is the
closest.

I'd say that from a code sharing standpoint, mm/rmap.c is absolutely
the wrong place. It's the furthest away from what Thomas wants to do.

The mm/pagewalk.c code has the most actual code that could be shared,
and the addition would be smallest there.

And conceptually the closest analogue in terms of what it _does_ is
unmap_mapping_range() in mm/memory.c, but I see no room for sharing
actual code there unless we completely change how we do
zap_page_range() and add a lot of configurability there (which we
don't want, because page table teardown at exit is really a pretty
critical operation - I commonly see copy_page_range() and
zap_page_range() on profiles if you have things like script-heavyu
traditional UNIX loads).

So I think conceptually, mm/memory.c and unmap_mapping_range() is
closest but I don't think it's practical to share code.

And between mm/pagewalk.c and mm/rmap.c, I think the page walking has
way more of actual practical code sharing, and is also conceptually
closer because most of the code is about walking a range, not looking
up the mapping of one page.

   Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: Ack to merge through DRM? WAS Re: [PATCH v2 1/5] mm: Add write-protect and clean utilities for address space ranges

2019-09-30 Thread Linus Torvalds
On Mon, Sep 30, 2019 at 6:04 AM Kirill A. Shutemov  wrote:
>
> Have you seen page_vma_mapped_walk()? I made it specifically for rmap code
> to cover cases when a THP is mapped with PTEs. To me it's not a big
> stretch to make it cover multiple pages too.

I agree that is closer, but it does make for calling that big complex
function for every iteration step.

Of course, you are right that the callback approach is problematic
too, now that we have retpoline issues, making those very expensive.
But at least that hopefully gets fixed some day and gets to be a rare
problem.

Matter ot taste, I guess.

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [git pull] drm for 5.5-rc1

2019-11-27 Thread Linus Torvalds
On Wed, Nov 27, 2019 at 4:59 PM Dave Airlie  wrote:
>
> my sample merge is here:
> https://cgit.freedesktop.org/~airlied/linux/log/?h=drm-next-5.5-merged

Hmm. I think you missed a couple: you left a duplicate
intel_update_rawclk() around (it got moved into
intel_power_domains_init_hw() by commit 2f216a850715 ("drm/i915:
update rawclk also on resume"), and you left the "select
REFCOUNT_FULL" that no longer exists.

And apparently nobody bothered to tell me about the semantic conflict
with the media tree due to the changed calling convention of
cec_notifier_cec_adap_unregister(). Didn't that show up in linux-next?

Anyway, merged and pushed out,

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [git pull] mm + drm vmwgfx coherent

2019-11-30 Thread Linus Torvalds
On Thu, Nov 28, 2019 at 5:15 PM Dave Airlie  wrote:
>
> This is just a separated pull for the mm pagewalking + drm/vmwgfx work
> Thomas did and you were involved in, I've left it separate in case you
> don't feel as comfortable with it as the other stuff.

Thanks, pulled (and the delay wasn't because of me being nervous about
the code, it was just because of turkey and a day of rest afterwards).

And I appreciate the separation - not because I wasn't comfortable
with the final code, but simply because it's a rather different thing
than the usual drm code. Having that as a separate pull and not mixed
up with the regular driver updates is just how I prefer it.

Thanks,

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [GIT PULL] Please pull hmm changes

2019-11-30 Thread Linus Torvalds
On Mon, Nov 25, 2019 at 12:42 PM Jason Gunthorpe  wrote:
>
> You will probably be most interested in the patch "mm/mmu_notifier: add an
> interval tree notifier".

I'm trying to read that patch, and I'm completely unable to by the
absolutely *horrid* variable names.

There are zero excuses for names like "mmn_mm". WTF?

I'll try to figure the code out, but my initial reaction was "yeah,
not in my VM".

   Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [GIT PULL] Please pull hmm changes

2019-11-30 Thread Linus Torvalds
On Sat, Nov 30, 2019 at 10:03 AM Linus Torvalds
 wrote:
>
> I'll try to figure the code out, but my initial reaction was "yeah,
> not in my VM".

Why is it ok to sometimes do

WRITE_ONCE(mni->invalidate_seq, cur_seq);

(to pair with the unlocked READ_ONCE), and sometimes then do

mni->invalidate_seq = mmn_mm->invalidate_seq;

My initial guess was that latter is only done at initialization time,
but at least in one case it's done *after* the mni has been added to
the mmn_mm (oh, how I despise those names - I can only repeat: WTF?).

See __mmu_interval_notifier_insert() in the
mmn_mm->active_invalidate_ranges case.

I'm guessing that it doesn't matter, because when inserting the
notifier the sequence number is presumably not used until after the
insertion (and any use though mmn_mm is protected by the
mmn_mm->lock), but it still looks odd to me.

   Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [GIT PULL] Please pull hmm changes

2019-11-30 Thread Linus Torvalds
On Mon, Nov 25, 2019 at 12:42 PM Jason Gunthorpe  wrote:
>
> Here is this batch of hmm updates, I think we are nearing the end of this
> project for now, although I suspect there will be some more patches related to
> hmm_range_fault() in the next cycle.

I've ended up pulling this, but I'm not entirely happy with the code.
You've already seen the comments on it in the earlier replies.

Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: drm pull for v5.3-rc1

2019-08-06 Thread Linus Torvalds
On Tue, Aug 6, 2019 at 12:38 AM Christoph Hellwig  wrote:
>
> Seems like no one took this up.  Below is a version which I think is
> slightly better by also moving the mm_walk structure initialization
> into the helpers, with an outcome of just a handful of added lines.

Ack. Agreed, I think that's a nicer interface.

In fact, I do note that a lot of the users don't actually use the
"void *private" argument at all - they just want the walker - and just
pass in a NULL private pointer. So we have things like this:

> +   if (walk_page_range(&init_mm, va, va + size, &set_nocache_walk_ops,
> +   NULL)) {

and in a perfect world we'd have arguments with default values so that
we could skip those entirely for when people just don't need it.

I'm not a huge fan of C++ because of a lot of the complexity (and some
really bad decisions), but many of the _syntactic_ things in C++ would
be nice to use. This one doesn't seem to be one that the gcc people
have picked up as an extension ;(

Yes, yes, we could do it with a macro, I guess.

   #define walk_page_range(mm, start,end, ops, ...) \
   __walk_page_range(mm, start, end, (NULL , ## __VA_ARGS__))

but I'm not sure it's worthwhile.

  Linus
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: drm pull for v5.3-rc1

2019-08-07 Thread Linus Torvalds
On Tue, Aug 6, 2019 at 11:40 PM Christoph Hellwig  wrote:
>
> I'm not an all that huge fan of super magic macro loops.  But in this
> case I don't see how it could even work, as we get special callbacks
> for huge pages and holes, and people are trying to add a few more ops
> as well.

Yeah, in this case we definitely don't want to make some magic loop walker.

Loops are certainly simpler than callbacks for most cases (and often
faster because you don't have indirect calls which now are getting
quite expensive), but the walker code really does end up having tons
of different cases that you'd have to handle with magic complex
conditionals or switch statements instead.

So the "walk over range using this set of callbacks" is generally the
right interface. If there is some particular case that might be very
simple and the callback model is expensive due to indirect calls for
each page, then such a case should probably use the normal page
walking loops (that we *used* to have everywhere - the "walk_range()"
interface is the "new" model for all the random odd special cases).

Linus


Re: [git pull] drm fixes for 5.13-rc6

2021-06-11 Thread Linus Torvalds
On Thu, Jun 10, 2021 at 8:41 PM Dave Airlie  wrote:
>
>   git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2021-06-11

I think anongit.freedesktop.org is sick. Can you ask somebody to give
it some tender loving? It's just disconnecting immediately..

 Linus


Re: [git pull] drm fixes for 5.13-rc6

2021-06-11 Thread Linus Torvalds
On Fri, Jun 11, 2021 at 10:07 AM Linus Torvalds
 wrote:
>
> I think anongit.freedesktop.org is sick. Can you ask somebody to give
> it some tender loving? It's just disconnecting immediately..

Either somebody gave the site a hug, or it decided to just get better
on its own. It's working now,

  Linus


Re: drm next for 6.3-rc1

2023-02-24 Thread Linus Torvalds
On Fri, Feb 24, 2023 at 5:30 PM Dave Airlie  wrote:
>
> Any issues with this? I get nervous around 48hrs :-)

It was merged on Wednesday evening. See commit a5c95ca18a98.

If you were waiting for a pr-tracker-bot reply, I think you need to
put "{GIT PULL]" in the subject line for the automation to trigger on
it.

 Linus


Re: [PATCH RFC v7 00/23] DEPT(Dependency Tracker)

2023-01-16 Thread Linus Torvalds
[ Back from travel, so trying to make sense of this series.. ]

On Sun, Jan 8, 2023 at 7:33 PM Byungchul Park  wrote:
>
> I've been developing a tool for detecting deadlock possibilities by
> tracking wait/event rather than lock(?) acquisition order to try to
> cover all synchonization machanisms. It's done on v6.2-rc2.

Ugh. I hate how this adds random patterns like

if (timeout == MAX_SCHEDULE_TIMEOUT)
sdt_might_sleep_strong(NULL);
else
sdt_might_sleep_strong_timeout(NULL);
   ...
sdt_might_sleep_finish();

to various places, it seems so very odd and unmaintainable.

I also recall this giving a fair amount of false positives, are they all fixed?

Anyway, I'd really like the lockdep people to comment and be involved.
We did have a fairly recent case of "lockdep doesn't track page lock
dependencies because it fundamentally cannot" issue, so DEPT might fix
those kinds of missing dependency analysis. See

https://lore.kernel.org/lkml/60d41f05f139a...@google.com/

for some context to that one, but at teh same time I would *really*
want the lockdep people more involved and acking this work.

Maybe I missed the email where you reported on things DEPT has found
(and on the lack of false positives)?

   Linus


Re: [git pull] drm for 6.1-rc1

2022-10-05 Thread Linus Torvalds
On Tue, Oct 4, 2022 at 8:42 PM Dave Airlie  wrote:
>
> This is very conflict heavy, mostly the correct answer is picking
> the version from drm-next.

Ugh, yes, that was a bit annoying.

I get the same end result as you did, but I do wonder if the drm
people should try to keep some kind of separate "fixes" branches for
things that go both into the development tree and then get sent to me
for fixes pulls?

Hopefully this "lots of pointless noise" was a one-off, but it might
be due to how you guys work..

  Linus


Re: [git pull] drm for 6.1-rc1

2022-10-06 Thread Linus Torvalds
On Tue, Oct 4, 2022 at 8:42 PM Dave Airlie  wrote:
>
> Lots of stuff all over, some new AMD IP support and gang
> submit support [..]

Hmm.

I have now had my main desktop lock up twice after pulling this.
Nothing in the dmesg after a reboot, and nothing in particular that
seems to trigger it, so I have a hard time even guessing what's up,
but the drm changes are the primary suspect.

I will try to see if I can get any information out of the machine, but
with the symptom being just a dead machine ...

This is the same (old) Radeon device:

   49:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
[AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7)

with dual 4k monitors, running on my good old Threadripper setup.

Again, there's no explicit reason to blame the drm pull, except that
it started after that merge (that machine ran the kernel with the
networking pull for a day with no problems, and while there were other
pull requests in between them, they seem to be fairly unrelated to the
hardware I have).

But the lockup is so sporadic (twice in the last day) that I really
can't bisect it, so I'm afraid I have very very little info.

Any suggestions?

  Linus


Re: mainline build failure due to 5d8c3e836fc2 ("drm/amd/display: fix array-bounds error in dc_stream_remove_writeback()")

2022-10-06 Thread Linus Torvalds
On Thu, Oct 6, 2022 at 1:51 AM Sudip Mukherjee (Codethink)
 wrote:
>
> This is only seen with gcc-11, gcc-12 builds are ok.

Hmm. This seems to be some odd gcc issue.

I *think* that what is going on is that the test

j = 0 ; j < MAX_DWB_PIPES

makes gcc decide that "hey, j is in the range [0,MAX_DWB_PIPES[", and
then since MAX_DWB_PIPES is 1, that simplifies to "j must be zero".
Good range analysis so far.

But for 'i', we start off with that lower bound of 0, but the upper
bound is not fixed (the test for "i" is: "i < stream->num_wb_info").

And then that "if (i != j)", so now gcc decides that it can simplify
that to "if (i != 0)", and then simplifies *that* to "oh, the lower
bound of 'i' in that code is actually 1.

So then it decides that "stream->writeback_info[i]" must be out of bounds.

Of course, the *reality* is that stream->num_wb_info should be <=
MAX_DWB_PIPES, and as such (with the current MAX_DWB_PIPES value of 1)
it's not that 'i' can be 1, it's that the code in question cannot be
reached at all.

What confuses me is that error message ("array subscript [0, 0] is
outside array bounds of 'struct dc_writeback_info[1]') which seems to
be aware that the value is actually 0.

So this seems to be some gcc-11 range analysis bug, but I don't know
what the fix is. I suspect some random code change would magically
just make gcc realize it's ok after all, but since it all depends on
random gcc confusion, I don't know what the random code change would
be.

The fix *MAY* be to just add a '&& i < MAX_DWB_PIPES' to that loop
too, and then gcc will see that both i and j are always 0, and that
the code is unreachable and not warn about it. Hmm? Can you test that?

And the reason gcc-12 builds are ok probably isn't that gcc-12 gets
this right, it's simply that gcc-12 gets so many *opther* things wrong
that we already disabled -Warray-bounds with gcc-12 entirely.

If somebody cannot come up with a fix, I suspect the solution is "gcc
array bounds analysis is terminally buggy" and we just need to disable
it for gcc-11 too.

Kees, any idea? Who else might be interested in fixing a -Warray-bounds issue?

 Linus


Re: [git pull] drm for 6.1-rc1

2022-10-06 Thread Linus Torvalds
On Thu, Oct 6, 2022 at 12:30 PM Dave Airlie  wrote:
>
> netconsole?

I've never been really successful with that in the past, and haven't
used it for decades. I guess I could try if nothing else works.

   Linus


Re: [git pull] drm for 6.1-rc1

2022-10-06 Thread Linus Torvalds
On Thu, Oct 6, 2022 at 12:28 PM Alex Deucher  wrote:
>
> Maybe you are seeing this which is an issue with GPU TLB flushes which
> is kind of sporadic:
> https://gitlab.freedesktop.org/drm/amd/-/issues/2113

Well, that seems to be 5.19, and while timing changes (or whatever
other software updates) could have made it start trigger, this machine
has been pretty solid otgerwise.

> Are you seeing any GPU page faults in your kernel log?

Nothing even remotely like that "no-retry page fault" in that issue
report. Of course, if it happens just before the whole thing locks
up...

   Linus


Re: [git pull] drm for 6.1-rc1

2022-10-06 Thread Linus Torvalds
On Thu, Oct 6, 2022 at 1:25 PM Dave Airlie  wrote:
>
>
> [ 1234.778760] BUG: kernel NULL pointer dereference, address: 0088
> [ 1234.778813] RIP: 0010:drm_sched_job_done.isra.0+0xc/0x140 [gpu_sched]

As far as I can tell, that's the line

struct drm_gpu_scheduler *sched = s_fence->sched;

where 's_fence' is NULL. The code is

   0: 0f 1f 44 00 00nopl   0x0(%rax,%rax,1)
   5: 41 54push   %r12
   7: 55push   %rbp
   8: 53push   %rbx
   9: 48 89 fb  mov%rdi,%rbx
   c:* 48 8b af 88 00 00 00 mov0x88(%rdi),%rbp <-- trapping instruction
  13: f0 ff 8d f0 00 00 00 lock decl 0xf0(%rbp)
  1a: 48 8b 85 80 01 00 00 mov0x180(%rbp),%rax

and that next 'lock decl' instruction would have been the

atomic_dec(&sched->hw_rq_count);

at the top of drm_sched_job_done().

Now, as to *why* you'd have a NULL s_fence, it would seem that
drm_sched_job_cleanup() was called with an active job. Looking at that
code, it does

if (kref_read(&job->s_fence->finished.refcount)) {
/* drm_sched_job_arm() has been called */
dma_fence_put(&job->s_fence->finished);
...

but then it does

job->s_fence = NULL;

anyway, despite the job still being active. The logic of that kind of
"fake refcount" escapes me. The above looks fundamentally racy, not to
say pointless and wrong (a refcount is a _count_, not a flag, so there
could be multiple references to it, what says that you can just
decrement one of them and say "I'm done").

Now, _why_ any of that happens, I have no idea. I'm just looking at
the immediate "that pointer is NULL" thing, and reacting to what looks
like a completely bogus refcount pattern.

But that odd refcount pattern isn't new, so it's presumably some user
on the amd gpu side that changed.

The problem hasn't happened again for me, but that's not saying a lot,
since it was very random to begin with.

 Linus


Re: mainline build failure due to 5d8c3e836fc2 ("drm/amd/display: fix array-bounds error in dc_stream_remove_writeback()")

2022-10-06 Thread Linus Torvalds
On Thu, Oct 6, 2022 at 1:50 PM Sudip Mukherjee
 wrote:
>
> > And it looks like Sudip's proposed fix for this particular code is
> > additionally fixing unsigned vs signed as well. I think -Warray-bounds
> > did its job (though, with quite a confusing index range in the report).
>
> Not my. Linus's. I just tested. :)

I suspect Kees meant Stephen's other patch that Hamza pointed at, and
that is perhaps the cleaner version.

That said, I hate how this forces us to write random code changes just
to make a compiler just randomly _happen_ to not complain about it.

   Linus


Re: [git pull] drm fixes for 6.1-rc1

2022-10-13 Thread Linus Torvalds
On Thu, Oct 13, 2022 at 5:29 PM Dave Airlie  wrote:
>
> Round of fixes for the merge window stuff, bunch of amdgpu and i915
> changes, this should have the gcc11 warning fix, amongst other
> changes.

Some of those amd changes aren't "fixes". They are some major code changes.

We're still in the merge window, so I'm letting it slide, but calling
then "fixes" really stretches things. They are fixes exactly the same
way completely new development can "fix" things.

  Linus


Re: [PATCH] drm/amd/display: Fix build breakage with CONFIG_DEBUG_FS=n

2022-10-14 Thread Linus Torvalds
On Fri, Oct 14, 2022 at 8:22 AM Nathan Chancellor  wrote:
>
> After commit 8799c0be89eb ("drm/amd/display: Fix vblank refcount in vrr
> transition"), a build with CONFIG_DEBUG_FS=n is broken due to a
> misplaced brace, along the lines of:

Thanks, applied.

  Linus


Re: [PATCH mm-unstable v1 20/20] mm: rename FOLL_FORCE to FOLL_PTRACE

2022-11-16 Thread Linus Torvalds
On Wed, Nov 16, 2022 at 2:30 AM David Hildenbrand  wrote:
>
> Let's make it clearer that functionality provided by FOLL_FORCE is
> really only for ptrace access.

I'm not super-happy about this one.

I do understand the "let's rename the bit so that no new user shows up".

And it's true that the main traditional use is ptrace.

But from the patch itself it becomes obvious that no, it's not *just*
ptrace. At least not yet.

It's used for get_arg_page(), which uses it to basically look up (and
install) pages in the newly created VM.

Now, I'm not entirely sure why it even uses FOLL_FORCE, - I think it
might be historical, because the target should always be the new stack
vma.

Following the history of it is a big of a mess, because there's a
number of renamings and re-organizations, but it seems to go back to
2007 and commit b6a2fea39318 ("mm: variable length argument support").

Before that commit, we kept our own array of "this is the set of pages
that I will install in the new VM". That commit basically just inserts
the pages directly into the VM instead, getting rid of the array size
limitation.

So at a minimum, I think that FOLL_FORCE would need to be removed
before any renaming to FOLL_PTRACE, because that's not some kind of
small random case.

It *might* be as simple as just removing it, but maybe there's some
reason for having it that I don't immediately see.

There _are_ also small random cases too, like get_cmdline(). Maybe
that counts as ptrace, but the execve() case most definitely does not.

Linus


Re: [PATCH mm-unstable v1 20/20] mm: rename FOLL_FORCE to FOLL_PTRACE

2022-11-17 Thread Linus Torvalds
On Thu, Nov 17, 2022 at 2:58 PM Kees Cook  wrote:
>
> Oh, er, why does get_arg_page() even need FOLL_FORCE? This is writing the
> new stack contents to the nascent brpm->vma, which was newly allocated
> with VM_STACK_FLAGS, which an arch can override, but they all appear to 
> include
> VM_WRITE | VM_MAYWRITE.

Yeah, it does seem entirely superfluous.

It's been there since the very beginning (although in that original
commit b6a2fea39318 it was there as a '1' to the 'force' argument to
get_user_pages()).

I *think* it can be just removed. But as long as it exists, it should
most definitely not be renamed to FOLL_PTRACE.

There's a slight worry that it currently hides some other setup issue
that makes it matter, since it's been that way so long, but I can't
see what it is.

 Linus


Re: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage

2022-11-22 Thread Linus Torvalds
On Tue, Nov 22, 2022 at 4:25 AM Hans Verkuil  wrote:
>
> I tracked the use of 'force' all the way back to the first git commit
> (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the
> reason is lost in the mists of time.

Well, not entirely.

For archaeology reasons, I went back to the old BK history, which
exists as a git conversion in

https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/

and there you can actually find it.

Not with a lot of explanations, though - it's commit b7649ef789
("[PATCH] videobuf update"):

This updates the video-buf.c module (helper module for video buffer
management).  Some memory management fixes, also some adaptions to the
final v4l2 api.

but it went from

 err = get_user_pages(current,current->mm,
-data, dma->nr_pages,
-rw == READ, 0, /* don't force */
+data & PAGE_MASK, dma->nr_pages,
+rw == READ, 1, /* force */
 dma->pages, NULL);

in that commit.

So it goes back to October 2002.

> Looking at this old LWN article https://lwn.net/Articles/28548/ suggests
> that it might be related to calling get_user_pages for write buffers

The timing roughly matches.

> I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still
> allow drivers to read from the buffer?

The issue with some of the driver hacks has been that

 - they only want to read, and the buffer may be read-only

 - they then used FOLL_WRITE despite that, because they want to break
COW (due to the issue that David is now fixing with his series)

 - but that means that the VM layer says "nope, you can't write to
this read-only user mapping"

 - ... and then they use FOLL_FORCE to say "yes, I can".

iOW, the FOLL_FORCE may be entirely due to an (incorrect, but
historically needed) FOLL_WRITE.

 Linus


Re: [RFC][PATCH v3 00/33] timers: Use timer_shutdown*() before freeing timers

2022-11-04 Thread Linus Torvalds
On Thu, Nov 3, 2022 at 10:48 PM Steven Rostedt  wrote:
>
> Ideally, I would have the first patch go into this rc cycle, which is mostly
> non functional as it will allow the other patches to come in via the 
> respective
> subsystems in the next merge window.

Ack.

I also wonder if we could do the completely trivially correct
conversions immediately.

I'm talking about the scripted ones where it's currently a
"del_timer_sync()", and the very next action is freeing whatever data
structure the timer is in (possibly with something like free_irq() in
between - my point is that there's an unconditional free that is very
clear and unambiguous), so that there is absolutely no question about
whether they should use "timer_shutdown_sync()" or not.

IOW, things like patches 03, 17 and 31, and at least parts others in
this series.

This series clearly has several much more complex cases that need
actual real code review, and I think it would help to have the
completely unambiguous cases out of the way, just to get rid of noise.

So I'd take that first patch, and a scripted set of "this cannot
change any semantics" patches early.

Linus


Re: [RFC][PATCH v3 00/33] timers: Use timer_shutdown*() before freeing timers

2022-11-04 Thread Linus Torvalds
On Fri, Nov 4, 2022 at 12:42 PM Steven Rostedt  wrote:
>
> Linus, should I also add any patches that has already been acked by the
> respective maintainer?

No, I'd prefer to keep only the ones that are 100% unambiguously not
changing any semantics.

  Linus


Re: [PATCH v4a 00/38] timers: Use timer_shutdown*() before freeing timers

2022-11-05 Thread Linus Torvalds
On Fri, Nov 4, 2022 at 11:01 PM Steven Rostedt  wrote:
>
> Patch 1 fixes an issue with sunrpc/xprt where it incorrectly uses
> del_singleshot_timer_sync() for something that is not a oneshot timer. As this
> will be converted to shutdown, this needs to be fixed first.

So this is the kind of thing that I would *not* want to get eartly.

I really would want to get just the infrastructure in to let people
start doing conversions.

And then the "mindlessly obvious patches that are done by scripting
and can not possibly matter".

The kinds that do not *need* review, because they are mechanical, and
that just cause pointless noise for the rest of the patches that *do*
want review.

Not this kind of thing that is so subtle that you have to explain it.
That's not a "scripted patch for no semantic change".

So leave the del_singleshot_timer_sync() cases alone, they are
irrelevant for the new infrastructure and for the "mindless scripted
conversion" patches.

> Patches 2-4 changes existing timer_shutdown() functions used locally in ARM 
> and
> some drivers to better namespace names.

Ok, these are relevant.

> Patch 5 implements the new timer_shutdown() and timer_shutdown_sync() 
> functions
> that disable re-arming the timer after they are called.

This is obviously what I'd want early so that people can start doign
this in their trees.

> Patches 6-28 change all the locations where there's a kfree(), kfree_rcu(),
> kmem_cache_free() and one call_rcu() call where the RCU function frees the
> timer (the workqueue patch) in the same function as the del_timer{,_sync}() is
> called on that timer, and there's no extra exit path between the del_timer and
> freeing of the timer.

So honestly, I was literally hoping for a "this is the coccinelle
script" kind of patch.

Now there seems to be a number of patches here that are actualyl
really hard to see that they are "obviously correct" and I can't tell
if they are actually scripted or not.

They don't *look* scripted, but I can't really tell.  I looked at the
patches with ten lines of context, and I didn't see the immediately
following kfree() even in that expanded patch context, so it's fairly
far away.

Others in the series were *definitely* not scripted, doing clearly
manual cleanups:

-if (dch->timer.function) {
-del_timer(&dch->timer);
-dch->timer.function = NULL;
-}
+timer_shutdown(&dch->timer);

so no, this does *not* make me feel "ok, this is all trivial".

IOW, I'd really want *just* the infrastructure and *just* the provably
trivial stuff. If it wasn't some scripted really obvious thing that
cannot possibly change anything and that wasn't then edited manually
for some reason, I really don't want it early.

IOW, any early conversions I'd take are literally about removing pure
mindless noise. Not about doing conversions.

And I wouldn't mind it as a single conversion patch that has the
coccinelle script as the explanation.

Really just THAT kind of "100% mindless conversion".

   Linus


Re: [PATCH v4a 00/38] timers: Use timer_shutdown*() before freeing timers

2022-11-05 Thread Linus Torvalds
On Sat, Nov 5, 2022 at 11:04 AM Steven Rostedt  wrote:
>
> Here's the changes I made after running the script

Please. No.

What part of "I don't want extra crud" was I unclear on?

I'm not interested in converting everything. That's clearly a 6.,2
issue, possibly even longer considering how complicated the networking
side has been.

I'm not AT ALL interested in "oh, I then added my own small cleanups
on top to random files because I happened to notice them".

Repeat after me: "If the script didn't catch them, they weren't
trivially obvious".

And it does seem that right now the script itself is a bit too
generous, which is why it didn't notice that sometimes there wasn't a
kfree after all because of a goto around it. So clearly that "..."
doesn't really work, I think it accepts "_any_ path leads to the
second situation" rather than "_all_ paths lead to the second
situation".

But yeah, my coccinelle-foo is very weak too, and maybe there's no
pattern for "no flow control".

I would also like the coccinelle script to notice the "timer is used
afterwards", so that it does *not* modify that case that does

del_timer(&dch->timer);
dch->timer.function = NULL;

since now the timer is modified in between the del_timer() and the kfree.

Again, that timer modification is then made pointless by changing the
del_timer() to a "timer_shutdown()", but at that point it is no longer
a "so obvious non-semantic change that it should be scripted". At that
point it's a manual thing.

So I think the "..." in your script should be "no flow control, and no
access to the timer", but do not know how to do that in coccinelle.

Julia?

And this thread has way too many participants, I suspect some email
systems will just mark it as spam as a result. Which is partly *why* I
would like to get rid of noisy changes that really don't matter - but
I would like it to be truly mindlessly obvious that there are *zero*
questions about it, and absolutely no manual intervention because the
patch is so strict that it's just unquestionably correct.

  Linus


Re: [PATCH v4a 00/38] timers: Use timer_shutdown*() before freeing timers

2022-11-05 Thread Linus Torvalds
On Sat, Nov 5, 2022 at 2:03 PM Jason A. Donenfeld  wrote:
>
> Something that might help here is changing the `...` into
> `... when exists` or into `... when != ptr` or similar.

I actually tried that.

You don't want "when exists", you'd want "when forall", but that seems
to be the default.

And trying "when != ptr->timer" actually does the right thing in that
it gets rid of the case where the timer is modified outside of the
del_timer() case, *but* it also causes odd other changes to the
output.

Look at what it generates for that

   drivers/media/usb/pvrusb2/pvrusb2-hdw.c

file, which finds a lot of triggers with the "when !=  ptr->timer",
but only does one without it.

So I gave up, just because I clearly don't understand the rules.

(Comparing output is also fun because the ordering of the patches is
random, so consecutive runs with the same rule will give different
patches. I assume that it's just because it's done in parallel, but it
doesn't help the "try to see what changes when you change the script"
;)

 Linus


Re: [PATCH RFC 00/19] mm/gup: remove FOLL_FORCE usage from drivers (reliable R/O long-term pinning)

2022-11-07 Thread Linus Torvalds
On Mon, Nov 7, 2022 at 8:18 AM David Hildenbrand  wrote:
>
> So instead, make R/O long-term pinning work as expected, by breaking COW
> in a COW mapping early, such that we can remove any FOLL_FORCE usage from
> drivers.

Nothing makes me unhappy from a quick scan through these patches.

And I'd really love to just have this long saga ended, and FOLL_FORCE
finally relegated to purely ptrace accesses.

So an enthusiastic Ack from me.

   Linus


Re: [git pull] drm for 6.2-rc1

2022-12-13 Thread Linus Torvalds
On Mon, Dec 12, 2022 at 6:56 PM Dave Airlie  wrote:
>
> There are a bunch of conflicts, one in amdgpu is a bit nasty, I've
> cc'ed Christian/Alex to make sure they know to check whatever
> resolution you find. The one I have is what we have in drm-tip tree.

Hmm. My merge resolution is slightly different from yours.

You seem to have basically dropped commit b09d6acba1d9 ("drm/amdgpu:
handle gang submit before VMID").

Now, there are other fence changes in the drm tree that may mean that
that commit *should* be dropped, so it's entirely possible that my
resolution which kept that ordering change might be wrong and your
resolution that just took the drm tip code is the right one.

Christian? Alex? Can you please double-check what I just pushed out?

Linus


Re: [git pull] drm for 6.2-rc1

2022-12-14 Thread Linus Torvalds
On Wed, Dec 14, 2022 at 12:05 AM Christian König
 wrote:
>
> Anyway we need to re-apply b09d6acba1d9 which should be trivial.

Note that my resolution did exactly that (*), it's just that when I
double-checked against Dave's suggested merge that I noticed I'd done
things differently than he did.

(*) Well, when I say "did exactly that" I don't actually know some of
the other fencing changes that happened, so there may be a reason why
something further should still be done.  So I can only point to my
merge commit a594533df0f6 and ask people to verify.

It does all at least work for me. Knock wood.

  Linus


Disabling -Warray-bounds for gcc-13 too

2023-04-23 Thread Linus Torvalds
Kees,
  I made the mistake of upgrading my M2 Macbook Air to Fedora-38, and
in the process I got gcc-13 which is not WERROR-clean because we only
limited the 'array-bounds' warning to gcc-11 and gcc-12. But gcc-13
has all the same issues.

And I want to be able to do my arm64 builds with WERROR on still...

I guess it never made much sense to hope it was going to go away
without having a confirmation, so I just changed it to be gcc-11+.

A lot of the warnings seem just crazy, with gcc just not getting the
bounds right, and then being upset about us going backwards with
'container_of()' etc. Ok, so the kernel is special. We do odd things.
I get it, gcc ends up being confused.

But before I disabled it, I did take a look at a couple of warnings
that didn't look like the sea of crazy.

And one of them is from you.

In particular, commit 4076ea2419cf ("drm/nouveau/disp: Fix
nvif_outp_acquire_dp() argument size") cannot possibly be right, It
changes

 nvif_outp_acquire_dp(struct nvif_outp *outp, u8 dpcd[16],

to

 nvif_outp_acquire_dp(struct nvif_outp *outp, u8 dpcd[DP_RECEIVER_CAP_SIZE],

and then does

memcpy(args.dp.dpcd, dpcd, sizeof(args.dp.dpcd));

where that 'args.dp.dpcd' is a 16-byte array, and DP_RECEIVER_CAP_SIZE is 15.

So yeah, it's copying 16 bytes from an argument that claims to be 15
bytes in size.

I think that commit was wrong, and the problem is that the 'dpcd'
array is something 15 and sometimes 16. For example, we have

  struct nouveau_encoder {
...
union {
struct {
...
u8 dpcd[DP_RECEIVER_CAP_SIZE];
} dp;
};

so there it's indeed 15 bytes, but then we have

union nvif_outp_acquire_args {
struct nvif_outp_acquire_v0 {
...
union {
...
struct {
...
__u8 dpcd[16];
} dp;

where it's 16.

I think it's all entirely harmless from a code generation standpoint,
because the 15-byte field will be padded out to 16 bytes in the
structure that contains it, but it's most definitely buggy.

So that warning does find real cases of wrong code. But when those
real cases are hidden by hundreds of lines of unfixable false
positives, we don't have much choice.

But could the Nouveau driver *please* pick a size for the dhcp[] array
and stick with it?

The other driver where the warnings didn't look entirely crazy was the
ath/carl9170 wireless driver, but I didn't look closer at that one.

 Linus


Re: mainline build failure due to 322458c2bb1a ("drm/ttm: Reduce the number of used allocation orders for TTM pages")

2023-04-26 Thread Linus Torvalds
On Wed, Apr 26, 2023 at 10:44 AM Sudip Mukherjee (Codethink)
 wrote:
>
> drivers/gpu/drm/ttm/ttm_pool.c:73:29: error: variably modified 
> 'global_write_combined' at file scope
>73 | static struct ttm_pool_type global_write_combined[TTM_DIM_ORDER];
>   | ^

Ugh.

This is because we have

  #define TTM_DIM_ORDER (__TTM_DIM_ORDER <= MAX_ORDER ?
__TTM_DIM_ORDER : MAX_ORDER)

which looks perfectly fine as a constant ("pick the smaller of
MAX_ORDER and __TTM_DIM_ORDER").

But:

  #define __TTM_DIM_ORDER (TTM_MAX_ORDER + 1)
  #define TTM_MAX_ORDER (PMD_SHIFT - PAGE_SHIFT)

which still _looks_ fine, but  on 64-bit powerpc, we then have

   #define PTE_INDEX_SIZE  __pte_index_size

so that __TTM_DIM_ORDER constant isn't actually a constant at all.

I would suggest that the DRM code just make those fixed-size arrays
use "MAX_ORDER", because even though it might be a bit bigger than
necessary, that's still not a very big constant (ie typically it's
"11", but it could be up to 64).

It's a bit sad how that macro that _looks_ like a constant (and is one
pretty much everywhere else) isn't actually constant on powerpc, but
looking around it looks fairly unavoidable.

Maybe the DRM code could try to avoid using things like PMD_SHIFT entirely?

Linus


Re: [PATCH 0/2] docs & checkpatch: allow Closes tags with links

2023-03-16 Thread Linus Torvalds
On Thu, Mar 16, 2023 at 4:43 AM Matthieu Baerts
 wrote:
>
> @Linus: in short, we would like to continue using the "Closes:" tag (or
> similar, see below) with a URL in commit messages. They are useful to
> have public bug trackers doing automated actions like closing a specific
> ticket. Any objection from your side?

As long as it's a public link, I guess that just documents what the
drm people have been doing.

I'm not convinced "Closes" is actually any better than just "Link:",
though. I would very much hope and expect that the actual closing of
any bug report is actually done separately and verified, rather than
some kind of automated "well, the commit says it closes it, so.."

So honestly, I feel like "Link:" is just a better thing, and I worry
that "Closes:" is then going to be used for random internal crap.
We've very much seen people wanting to do that - having their own
private bug trackers, and then using the commit message to refer to
them, which I am *violently* against. If it's only useful to some
closed community, it shouldn't be in the public commits.

And while the current GPU people seem to use "Closes:" the right way
(and maybe some other groups do too - but it does seem to be mostly a
freedesktop thing), I really think it is amenable to mis-use in ways
"Link:" is not.

The point of "Link:" is explicitly two-fold:

 - it makes it quite obvious that you expect an actual valid web-link,
not some internal garbage

 - random people always want random extensions, and "Link:" is
_designed_ to counter-act that creeping "let's add a random new tag"
disease. It's very explicitly "any relevant link".

and I really question the value of adding new types of tags,
particularly ones that seem almost designed to be mis-used.

So I'm not violently against it, and 99% of the existing uses seem
fine. But I do note that some of the early "Closes:" tags in the
kernel were very much complete garbage, and exactly the kind of thing
that I absolutely detest.

What does

Closes: 10437

mean? That's crazy talk. (And yes, in that case it was a
kernel.bugzilla.org number, which is perfectly fine, but I'm using it
as a very real example of how "Closes:" ends up being very naturally
to mis-use).

End result: I don't hate our current "Closes:" uses. But I'm very wary of it.

I'm not at all convinced that it really adds a lot of value over
"Link:", and I am, _very_ aware of how easily it can be then taken to
be a "let's use our own bug tracker cookies here".

So I will neither endorse nor condemn it, but if I see people using it
wrong, I will absolutely put my foot down.

Linus


Re: [Intel-gfx] [BUG 6.3-rc1] Bad lock in ttm_bo_delayed_delete()

2023-03-17 Thread Linus Torvalds
On Wed, Mar 15, 2023 at 5:22 PM Steven Rostedt  wrote:
>
> I hope that this gets in by -rc3, as I want to start basing my next branch
> on that tag.

My tree should have it now as commit c00133a9e87e ("drm/ttm: drop
extra ttm_bo_put in ttm_bo_cleanup_refs").

Linus


Re: Linux 6.3-rc3

2023-03-20 Thread Linus Torvalds
On Mon, Mar 20, 2023 at 11:05 AM Nathan Chancellor  wrote:
>
> On the clang front, I am still seeing the following warning turned error
> for arm64 allmodconfig at least:
>
>   drivers/gpu/host1x/dev.c:520:6: error: variable 'syncpt_irq' is 
> uninitialized when used here [-Werror,-Wuninitialized]
>   if (syncpt_irq < 0)
>   ^~

Hmm. I do my arm64 allmodconfig builds with gcc, and I'm surprised
that gcc doesn't warn about this.

That syncpt_irq thing isn't written to anywhere, so that's pretty egregious.

We use -Wno-maybe-uninitialized because gcc gets it so wrong, but
that's different from the "-Wuninitialized" thing (without the
"maybe").

I've seen gcc mess this up when there is one single assignment,
because then the SSA format makes it *so* easy to just use that
assignment out-of-order (or unconditionally), but this case looks
unusually clear-cut.

So the fact that gcc doesn't warn about it is outright odd.

> If that does not come to you through other means before -rc4, could you
> just apply it directly so that I can stop applying it to our CI? :)

Bah. I took it now, there's no excuse for that thing.

Do we have any gcc people around that could explain why gcc failed so
miserably at this trivial case?

   Linus


Re: Linux 6.3-rc3

2023-03-20 Thread Linus Torvalds
On Mon, Mar 20, 2023 at 11:26 AM Linus Torvalds
 wrote:
>
> Hmm. I do my arm64 allmodconfig builds with gcc, and I'm surprised
> that gcc doesn't warn about this.

Side note: I'm also wondering why that TEGRA_HOST1X config has that
ARM dependency in

depends on ARCH_TEGRA || (ARM && COMPILE_TEST)

because it seems to build just fine at least on x86-64 if I change it to be just

depends on ARCH_TEGRA || COMPILE_TEST

ie there seems to be nothing ARM-specific in there.

Limiting it to just the tegra platform by default makes 100% sense,
but that "only do compile-testing on ARM" seems a bit bogus.

That limit goes back to forever (commit 6f44c2b5280f: "gpu: host1x:
Increase compile test coverage" back in Nov 2013), so maybe things
didn't use to work as well back in the dark ages?

None of this explains why gcc didn't catch it, but at least allowing
the build on x86-64 would likely have made it easier for people to see
clang catching this.

   Linus


Re: Linux 6.3-rc3

2023-03-20 Thread Linus Torvalds
On Mon, Mar 20, 2023 at 11:56 AM Nathan Chancellor  wrote:
>
> I did see a patch fly by to fix that:
>
> https://lore.kernel.org/20230316082035.567520-3-christian.koe...@amd.com/
>
> It seems like the DRM_TEGRA half of it is broken though:
>
> https://lore.kernel.org/202303170635.a2rsq1wu-...@intel.com/

Hmm. x86-64 has 'vmap()' too.

So I think that DRM_TEGRA breakage is likely just due to a missing
header file include that then (by luck and mistake) gets included on
arm.

You need  for 'vmap()'.

There might be something else going on, I didn't look deeply at it.

   Linus


<    1   2   3   4   5   6   7   8   >