Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks

2024-05-03 Thread Shiyang Ruan via




在 2024/5/1 5:00, Alison Schofield 写道:

On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:

The length of Physical Address in General Media Event Record/DRAM Event
Record is 64-bit, so the field mask should be defined as such length.
Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
mask off the upper-32-bits of DPA addresses. The cxl_poison event is
unaffected.

If userspace was doing its own DPA-to-HPA translation this could lead to
incorrect page retirement decisions, but there is no known consumer
(like rasdaemon) of this event today.

Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
Cc: 
Cc: Dan Williams 
Cc: Davidlohr Bueso 
Cc: Jonathan Cameron 
Cc: Ira Weiny 
Signed-off-by: Shiyang Ruan 


Hi Ruan,

This fixup is important for the Event DPA->HPA translation work, so I
grabbed it, updated it with most* of the review comments, and posted
with that set. I expect you saw that in your mailbox.


Yes, I saw that.  Nice fix!



DaveJ queued it in a topic branch for 6.10 here:
https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.10/dpa-to-hpa



That's good!


*I did not create a common mask for events and poison because I wanted to
limit the changes. If you'd like to make that change it would be welcomed.


Ok~


--
Thanks,
Ruan.



-- Alison


---
  drivers/cxl/core/trace.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index e5f13260fc52..cdfce932d5b1 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
   * DRAM Event Record
   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
   */
-#define CXL_DPA_FLAGS_MASK 0x3F
+#define CXL_DPA_FLAGS_MASK 0x3FULL
  #define CXL_DPA_MASK  (~CXL_DPA_FLAGS_MASK)
  
  #define CXL_DPA_VOLATILE			BIT(0)

--
2.34.1





Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks

2024-04-30 Thread Alison Schofield
On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
> 
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
> 
> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: 
> Cc: Dan Williams 
> Cc: Davidlohr Bueso 
> Cc: Jonathan Cameron 
> Cc: Ira Weiny 
> Signed-off-by: Shiyang Ruan 

Hi Ruan,

This fixup is important for the Event DPA->HPA translation work, so I
grabbed it, updated it with most* of the review comments, and posted
with that set. I expect you saw that in your mailbox.

DaveJ queued it in a topic branch for 6.10 here:
https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.10/dpa-to-hpa

*I did not create a common mask for events and poison because I wanted to
limit the changes. If you'd like to make that change it would be welcomed.

-- Alison

> ---
>  drivers/cxl/core/trace.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>   * DRAM Event Record
>   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>   */
> -#define CXL_DPA_FLAGS_MASK   0x3F
> +#define CXL_DPA_FLAGS_MASK   0x3FULL
>  #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
>  
>  #define CXL_DPA_VOLATILE BIT(0)
> -- 
> 2.34.1
> 



Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks

2024-04-25 Thread Ira Weiny
Shiyang Ruan wrote:
> 
> 
> 在 2024/4/24 5:04, Ira Weiny 写道:
> > Alison Schofield wrote:
> >> On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
> > 
> > [snip]
> > 
> >>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> >>> index e5f13260fc52..cdfce932d5b1 100644
> >>> --- a/drivers/cxl/core/trace.h
> >>> +++ b/drivers/cxl/core/trace.h
> >>> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
> >>>* DRAM Event Record
> >>>* CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> >>>*/
> >>> -#define CXL_DPA_FLAGS_MASK   0x3F
> >>> +#define CXL_DPA_FLAGS_MASK   0x3FULL
> >>>   #define CXL_DPA_MASK(~CXL_DPA_FLAGS_MASK)
> >>>   
> >>>   #define CXL_DPA_VOLATILEBIT(0)
> >>
> >> This works but I'm thinking this is the time to convene on one
> >> CXL_EVENT_DPA_MASK for both all CXL events, rather than having
> >> cxl_poison event be different.
> >>
> >> I prefer how poison defines it:
> >>
> >> cxlmem.h:#define CXL_POISON_START_MASK  GENMASK_ULL(63, 6)
> >>
> >> Can we rename that CXL_EVENT_DPA_MASK and use for all events?
> 
> cxlmem.h:CXL_POISON_START_MASK is defined for MBOX commands (poison 
> record, the lower 3 bits indicate "Error Source"), but CXL_DPA_MASK here 
> is for events.  They have different meaning though their values are 
> same.  So, I don't think we should consolidate them.

By definition the DPA in all these payloads can't use the lower 6 bits.
We are defining a mask to get the DPA.  This has nothing to do with what
may be stored in the other 6 bits.

Defining a common DPA mask is correct per both sections of the spec.

> 
> > 
> > Ah!  Great catch.  I dont' know why I only masked off the 2 used bits.
> 
> Per spec, the lowest 2 bits of CXL event's DPA are defined for "Volatile 
> or not" and "not repairable".  So there is no mistake here.

I appreciate your not calling out my code as a bug.  :-D

However, bits [5:2] are also Reserved.  So it is incorrect to mask off
only the lower 2 bits.  Even though the reserved bits must be 0 per the
spec, it is still better to properly mask all reserved bits because they
are by definition not part of the DPA.

Could you create a common macro for the next version of the patch?

Thanks,
Ira

> 
> > That was short sighted of me.
> > 
> > Yes we should consolidate these.
> > Ira
> 
> --
> Thanks,
> Ruan.
> 





Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks

2024-04-25 Thread Shiyang Ruan via




在 2024/4/24 5:04, Ira Weiny 写道:

Alison Schofield wrote:

On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:


[snip]


diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index e5f13260fc52..cdfce932d5b1 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
   * DRAM Event Record
   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
   */
-#define CXL_DPA_FLAGS_MASK 0x3F
+#define CXL_DPA_FLAGS_MASK 0x3FULL
  #define CXL_DPA_MASK  (~CXL_DPA_FLAGS_MASK)
  
  #define CXL_DPA_VOLATILE			BIT(0)


This works but I'm thinking this is the time to convene on one
CXL_EVENT_DPA_MASK for both all CXL events, rather than having
cxl_poison event be different.

I prefer how poison defines it:

cxlmem.h:#define CXL_POISON_START_MASK  GENMASK_ULL(63, 6)

Can we rename that CXL_EVENT_DPA_MASK and use for all events?


cxlmem.h:CXL_POISON_START_MASK is defined for MBOX commands (poison 
record, the lower 3 bits indicate "Error Source"), but CXL_DPA_MASK here 
is for events.  They have different meaning though their values are 
same.  So, I don't think we should consolidate them.




Ah!  Great catch.  I dont' know why I only masked off the 2 used bits.


Per spec, the lowest 2 bits of CXL event's DPA are defined for "Volatile 
or not" and "not repairable".  So there is no mistake here.



That was short sighted of me.

Yes we should consolidate these.
Ira


--
Thanks,
Ruan.




Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks

2024-04-23 Thread Ira Weiny
Alison Schofield wrote:
> On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:

[snip]

> > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> > index e5f13260fc52..cdfce932d5b1 100644
> > --- a/drivers/cxl/core/trace.h
> > +++ b/drivers/cxl/core/trace.h
> > @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
> >   * DRAM Event Record
> >   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> >   */
> > -#define CXL_DPA_FLAGS_MASK 0x3F
> > +#define CXL_DPA_FLAGS_MASK 0x3FULL
> >  #define CXL_DPA_MASK   (~CXL_DPA_FLAGS_MASK)
> >  
> >  #define CXL_DPA_VOLATILE   BIT(0)
> 
> This works but I'm thinking this is the time to convene on one 
> CXL_EVENT_DPA_MASK for both all CXL events, rather than having
> cxl_poison event be different.
> 
> I prefer how poison defines it:
> 
> cxlmem.h:#define CXL_POISON_START_MASK  GENMASK_ULL(63, 6)
> 
> Can we rename that CXL_EVENT_DPA_MASK and use for all events?

Ah!  Great catch.  I dont' know why I only masked off the 2 used bits.
That was short sighted of me.

Yes we should consolidate these.
Ira



Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks

2024-04-23 Thread Alison Schofield
On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
> 
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
> 

So, an invalid DPA is emitted in the trace event log and that could
lead to 'incorrect page retirement decisions...'

> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: 
> Cc: Dan Williams 
> Cc: Davidlohr Bueso 
> Cc: Jonathan Cameron 
> Cc: Ira Weiny 
> Signed-off-by: Shiyang Ruan 
> ---
>  drivers/cxl/core/trace.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>   * DRAM Event Record
>   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>   */
> -#define CXL_DPA_FLAGS_MASK   0x3F
> +#define CXL_DPA_FLAGS_MASK   0x3FULL
>  #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
>  
>  #define CXL_DPA_VOLATILE BIT(0)

This works but I'm thinking this is the time to convene on one 
CXL_EVENT_DPA_MASK for both all CXL events, rather than having
cxl_poison event be different.

I prefer how poison defines it:

cxlmem.h:#define CXL_POISON_START_MASK  GENMASK_ULL(63, 6)

Can we rename that CXL_EVENT_DPA_MASK and use for all events?

--Alison

> -- 
> 2.34.1
> 



Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks

2024-04-23 Thread Dan Williams
Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
> 
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
> 
> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: 
> Cc: Dan Williams 
> Cc: Davidlohr Bueso 
> Cc: Jonathan Cameron 
> Cc: Ira Weiny 
> Signed-off-by: Shiyang Ruan 
> ---
>  drivers/cxl/core/trace.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>   * DRAM Event Record
>   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>   */
> -#define CXL_DPA_FLAGS_MASK   0x3F
> +#define CXL_DPA_FLAGS_MASK   0x3FULL
>  #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
>  
>  #define CXL_DPA_VOLATILE BIT(0)

Looks good,

Reviewed-by: Dan Williams 



Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks

2024-04-23 Thread Ira Weiny
Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
> 
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
> 
> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: 
> Cc: Dan Williams 
> Cc: Davidlohr Bueso 
> Cc: Jonathan Cameron 
> Cc: Ira Weiny 

Apologies I thought I saw this go in before.  But perhaps it was a
different mask.

Reviewed-by: Ira Weiny 

> Signed-off-by: Shiyang Ruan 
> ---
>  drivers/cxl/core/trace.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>   * DRAM Event Record
>   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>   */
> -#define CXL_DPA_FLAGS_MASK   0x3F
> +#define CXL_DPA_FLAGS_MASK   0x3FULL
>  #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
>  
>  #define CXL_DPA_VOLATILE BIT(0)
> -- 
> 2.34.1
> 





[PATCH v3 1/2] cxl/core: correct length of DPA field masks

2024-04-17 Thread Shiyang Ruan via
The length of Physical Address in General Media Event Record/DRAM Event
Record is 64-bit, so the field mask should be defined as such length.
Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
mask off the upper-32-bits of DPA addresses. The cxl_poison event is
unaffected.

If userspace was doing its own DPA-to-HPA translation this could lead to
incorrect page retirement decisions, but there is no known consumer
(like rasdaemon) of this event today.

Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
Cc: 
Cc: Dan Williams 
Cc: Davidlohr Bueso 
Cc: Jonathan Cameron 
Cc: Ira Weiny 
Signed-off-by: Shiyang Ruan 
---
 drivers/cxl/core/trace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index e5f13260fc52..cdfce932d5b1 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
  * DRAM Event Record
  * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
  */
-#define CXL_DPA_FLAGS_MASK 0x3F
+#define CXL_DPA_FLAGS_MASK 0x3FULL
 #define CXL_DPA_MASK   (~CXL_DPA_FLAGS_MASK)
 
 #define CXL_DPA_VOLATILE   BIT(0)
-- 
2.34.1