> -----Original Message-----
> From: Jonathan Cameron <jonathan.came...@huawei.com>
> Sent: Saturday, April 6, 2024 12:46 AM
> To: Jonathan Cameron via <qemu-devel@nongnu.org>
> Cc: Jonathan Cameron <jonathan.came...@huawei.com>; Yao, Xingtao/姚 幸涛
> <yaoxt.f...@fujitsu.com>; fan...@samsung.com; Cao, Quanquan/曹 全全
> <ca...@fujitsu.com>
> Subject: Re: [PATCH] mem/cxl_type3: fix hpa to dpa logic
> 
> On Mon, 1 Apr 2024 17:00:50 +0100
> Jonathan Cameron via <qemu-devel@nongnu.org> wrote:
> 
> > On Thu, 28 Mar 2024 06:24:24 +0000
> > "Xingtao Yao (Fujitsu)" <yaoxt.f...@fujitsu.com> wrote:
> >
> > > Jonathan
> > >
> > > thanks for your reply!
> > >
> > > > -----Original Message-----
> > > > From: Jonathan Cameron <jonathan.came...@huawei.com>
> > > > Sent: Wednesday, March 27, 2024 9:28 PM
> > > > To: Yao, Xingtao/姚 幸涛 <yaoxt.f...@fujitsu.com>
> > > > Cc: fan...@samsung.com; qemu-devel@nongnu.org; Cao, Quanquan/曹 全
> 全
> > > > <ca...@fujitsu.com>
> > > > Subject: Re: [PATCH] mem/cxl_type3: fix hpa to dpa logic
> > > >
> > > > On Tue, 26 Mar 2024 21:46:53 -0400
> > > > Yao Xingtao <yaoxt.f...@fujitsu.com> wrote:
> > > >
> > > > > In 3, 6, 12 interleave ways, we could not access cxl memory properly,
> > > > > and when the process is running on it, a 'segmentation fault' error 
> > > > > will
> > > > > occur.
> > > > >
> > > > > According to the CXL specification '8.2.4.20.13 Decoder Protection',
> > > > > there are two branches to convert HPA to DPA:
> > > > > b1: Decoder[m].IW < 8 (for 1, 2, 4, 8, 16 interleave ways)
> > > > > b2: Decoder[m].IW >= 8 (for 3, 6, 12 interleave ways)
> > > > >
> > > > > but only b1 has been implemented.
> > > > >
> > > > > To solve this issue, we should implement b2:
> > > > >   DPAOffset[51:IG+8]=HPAOffset[51:IG+IW] / 3
> > > > >   DPAOffset[IG+7:0]=HPAOffset[IG+7:0]
> > > > >   DPA=DPAOffset + Decoder[n].DPABase
> > > > >
> > > > > Links:
> > > >
> https://lore.kernel.org/linux-cxl/3e84b919-7631-d1db-3e1d-33000f3f3868@fujits
> > > > u.com/
> > > > > Signed-off-by: Yao Xingtao <yaoxt.f...@fujitsu.com>
> > > >
> > > > Not implementing this was intentional (shouldn't seg fault obviously) 
> > > > but
> > > > I thought we were not advertising EP support for 3, 6, 12?  The HDM 
> > > > Decoder
> > > > configuration checking is currently terrible so we don't prevent
> > > > the bits being set (adding device side sanity checks for those decoders
> > > > has been on the todo list for a long time).  There are a lot of ways of
> > > > programming those that will blow up.
> > > >
> > > > Can you confirm that the emulation reports they are supported.
> > > >
> https://elixir.bootlin.com/qemu/v9.0.0-rc1/source/hw/cxl/cxl-component-utils.c
> > > > #L246
> > > > implies it shouldn't and so any software using them is broken.
> > > yes, the feature is not supported by QEMU, but I can still create a
> 6-interleave-ways region on kernel layer.
> > >
> > > I checked the source code of kernel, and found that the kernel did not 
> > > check
> this bit when committing decoder.
> > > we may add some check on kernel side.
> >
> > ouch.  We definitely want that check!  The decoder commit will fail
> > anyway (which QEMU doesn't yet because we don't do all the sanity checks
> > we should). However failing on commit is nasty as the reason should have
> > been detected earlier.
> >
> > >
> > > >
> > > > The non power of 2 decodes always made me nervous as the maths is more
> > > > complex and any changes to that decode will need careful checking.
> > > > For the power of 2 cases it was a bunch of writes to edge conditions etc
> > > > and checking the right data landed in the backing stores.
> > > after applying this modification, I tested some command by using these
> memory, like 'ls', 'top'..
> > > and they can be executed normally, maybe there are some other problems I
> haven't met yet.
> >
> > I usually run a bunch of manual tests with devmem2 to ensure the edge cases 
> > are
> handled
> > correctly, but I've not really seen any errors that didn't also show up in 
> > running
> > stressors (e.g. stressng) or just memhog on the memory.
> 
> 
> Hi Yao,
> 
> If you have time, please spin a v2 that also sets the relevant
> flag to say the QEMU emulation supports this interleave.
OK, I will.
> 
> Whilst we test the kernel fixes, we can just drop that patch but longer term 
> I'm
> find with having this support in general in the QEMU emulation - so I won't 
> queue
> it up as a fix, but instead as a feature.
> 
> Thanks,
> 
> Jonathan
> 
> >
> > Jonathan
> >
> > >
> > > >
> > > > Joanthan
> > > >
> > > >
> > > > > ---
> > > > >  hw/mem/cxl_type3.c | 15 +++++++++++----
> > > > >  1 file changed, 11 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> > > > > index b0a7e9f11b..2c1218fb12 100644
> > > > > --- a/hw/mem/cxl_type3.c
> > > > > +++ b/hw/mem/cxl_type3.c
> > > > > @@ -805,10 +805,17 @@ static bool cxl_type3_dpa(CXLType3Dev *ct3d,
> hwaddr
> > > > host_addr, uint64_t *dpa)
> > > > >              continue;
> > > > >          }
> > > > >
> > > > > -        *dpa = dpa_base +
> > > > > -            ((MAKE_64BIT_MASK(0, 8 + ig) & hpa_offset) |
> > > > > -             ((MAKE_64BIT_MASK(8 + ig + iw, 64 - 8 - ig - iw) &
> hpa_offset)
> > > > > -              >> iw));
> > > > > +        if (iw < 8) {
> > > > > +            *dpa = dpa_base +
> > > > > +                ((MAKE_64BIT_MASK(0, 8 + ig) & hpa_offset) |
> > > > > +                 ((MAKE_64BIT_MASK(8 + ig + iw, 64 - 8 - ig - iw) &
> > > > hpa_offset)
> > > > > +                  >> iw));
> > > > > +        } else {
> > > > > +            *dpa = dpa_base +
> > > > > +                ((MAKE_64BIT_MASK(0, 8 + ig) & hpa_offset) |
> > > > > +                 ((((MAKE_64BIT_MASK(ig + iw, 64 - ig - iw) &
> hpa_offset)
> > > > > +                   >> (ig + iw)) / 3) << (ig + 8)));
> > > > > +        }
> > > > >
> > > > >          return true;
> > > > >      }
> > >
> >
> >


Reply via email to