On Wed, Sep 09, 2020 at 03:03:24PM -0300, Jason Gunthorpe wrote: > On Wed, Sep 09, 2020 at 07:25:34PM +0200, Gerald Schaefer wrote: > > I actually had to draw myself a picture to get some hold of > > this, or rather a walk-through with a certain pud-crossing > > range in a folded 3-level scenario. Not sure if I would have > > understood my explanation above w/o that, but I hope you can > > make some sense out of it. Or draw yourself a picture :-) > > What I don't understand is how does anything work with S390 today? > > If the fix is only to change pxx_addr_end() then than generic code > like mm/pagewalk.c will iterate over a *different list* of page table > entries. > > It's choice of entries to look at is entirely driven by pxx_addr_end(). > > Which suggest to me that mm/pagewalk.c also doesn't work properly > today on S390 and this issue is not really about stack variables? > > Fundamentally if pXX_offset() and pXX_addr_end() must be consistent > together, if pXX_offset() is folded then pXX_addr_end() must cause a > single iteration of that level.
Your observation is correct. Another way to describe the problem is existing pXd_addr_end helpers could be applied to mismatching levels on s390 (e.g p4d_addr_end applied to pud or pgd_addr_end applied to p4d). As you noticed, all *_pXd_range iterators could be called with address ranges that exceed single pXd table. However, when it happens with pointers to real page tables (passed to *_pXd_range iterators) we still operate on valid tables, which just (lucky for us) happened to be folded. Thus we still reference correct table entries. It is only gup_fast case that exposes the issue. It hits because pointers to stack copies are passed to gup_pXd_range iterators, not pointers to real page tables itself. As Gerald mentioned, it is very difficult to explain in a clear way. Hopefully, one could make sense ot of it. > Jason