Re: 6.10/bisected/regression - commits bc87d666c05 and 6d4279cb99ac cause appearing green flashing bar on top of screen on Radeon 6900XT and 120Hz

2024-06-07 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing the other amd drm maintainers]

On 05.06.24 14:04, Mikhail Gavrilov wrote:
> On Sun, May 26, 2024 at 7:06 PM Mikhail Gavrilov
>  wrote:
>>
>> The day before yesterday I replaced the 7900XTX with a 6900XT to find out
>> in which kernel the warning message "DMA-API: amdgpu
>> :0f:00.0: cacheline tracking EEXIST, overlapping mappings aren't
>> supported" first appeared.
>> Kernel 6.3 and older won't boot on a computer with a Radeon 7900XTX.

Mikhail: are those details in any way relevant? If not, then in the future
best leave them out (or make things easier to follow); they make the bug
report confusing and make it sound like this is just a bug, when in fact
from your bisection it sounds like this is a regression.

Anyway, @amd maintainers: is there a reason why this report did not get
at least a single reply? Or was there some progress somewhere and I just
missed it? Or would it be better if Mikhail would report this to
https://gitlab.freedesktop.org/drm/amd/-/issues/ ?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

>> When I booted the system with 6900XT I saw a green flashing bar on top
>> of the screen when I typed commands in the gnome terminal which was
>> maximized on full screen.
>>
>> Demonstration: https://youtu.be/tTvwQ_5pRkk
>> For reproduction you need Radeon 6900XT GPU connected to 120Hz OLED TV by 
>> HDMI.
>>
>> I bisected the issue and the first commit which I found was 6d4279cb99ac.
>> commit 6d4279cb99ac4f51d10409501d29969f687ac8dc (HEAD)
>> Author: Rodrigo Siqueira 
>> Date:   Tue Mar 26 10:42:05 2024 -0600
>>
>> drm/amd/display: Drop legacy code
>>
>> This commit removes code that are not used by display anymore.
>>
>> Acked-by: Hamza Mahfooz 
>> Signed-off-by: Rodrigo Siqueira 
>> Signed-off-by: Alex Deucher 
>>
>>  drivers/gpu/drm/amd/display/dc/inc/hw/stream_encoder.h         |  4 ----
>>  drivers/gpu/drm/amd/display/dc/inc/resource.h                  |  7 -------
>>  drivers/gpu/drm/amd/display/dc/optc/dcn20/dcn20_optc.c         | 10 ----------
>>  drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c | 33 +--------------------------------
>>  4 files changed, 1 insertion(+), 53 deletions(-)
>>
>> After bisecting I usually verify that I found the right commit by
>> building a kernel with the bad commit reverted.
>> But this time I still observed the issue when running a kernel built
>> without commit 6d4279cb99ac.
>> So I decided to look for a second bad commit.
>> The second bad commit turned out to be bc87d666c05.
>> commit bc87d666c05a13e6d4ae1ddce41fc43d2567b9a2 (HEAD)
>> Author: Rodrigo Siqueira 
>> Date:   Tue Mar 26 11:55:19 2024 -0600
>>
>> drm/amd/display: Add fallback configuration for set DRR in DCN10
>>
>> Set OTG/OPTC parameters to 0 if something goes wrong on DCN10.
>>
>> Acked-by: Hamza Mahfooz 
>> Signed-off-by: Rodrigo Siqueira 
>> Signed-off-by: Alex Deucher 
>>
>>  drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c | 15 ++++++++++++---
>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> After reverting both these commits on top of 54f71b0369c9 the issue is gone.
>>
>> I also attach the build config.
>>
>> My hardware specs: https://linux-hardware.org/?probe=f25a873c5e
>>
>> Rodrigo, or anyone else from the AMD team, could you please take a look?
>>
> 
> Has anyone looked at this?
> 


Re: 6.10/regression/bisected commit c4cb23111103 causes sleeping function called from invalid context at kernel/locking/mutex.c:585

2024-05-28 Thread Linux regression tracking (Thorsten Leemhuis)
On 22.05.24 23:18, Chris Bainbridge wrote:
> On Tue, May 21, 2024 at 02:39:06PM +0500, Mikhail Gavrilov wrote:
>> Yesterday on a fresh kernel snapshot
>> I spotted a new bug message with the following stack trace:
>> [4.307097] BUG: sleeping function called from invalid context at
>> kernel/locking/mutex.c:585
> I am also getting this error on every boot. Decoded stacktrace:

TWIMC & for the record: Boris also reported this; Vasant Hegde replied
and said a fix is in the works:

https://lore.kernel.org/all/898d356d-ec7d-41de-82d8-3ed4dc559...@amd.com/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot dup:
https://lore.kernel.org/all/cabxgcsn1z2gj99zsdhqwynptxbymrqhejdff8axxxoiz_0g...@mail.gmail.com/


[PATCH] drm/xe: remove unused struct 'xe_gt_desc'

2024-05-22 Thread linux
From: "Dr. David Alan Gilbert" 

'xe_gt_desc' is unused since
commit 1e6c20be6c83 ("drm/xe: Drop extra_gts[] declarations and
XE_GT_TYPE_REMOTE").

Remove it.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/xe/xe_pci.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index f326dbb1cecd..2ca210480bd1 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -40,12 +40,6 @@ struct xe_subplatform_desc {
const u16 *pciidlist;
 };
 
-struct xe_gt_desc {
-   enum xe_gt_type type;
-   u32 mmio_adj_limit;
-   u32 mmio_adj_offset;
-};
-
 struct xe_device_desc {
/* Should only ever be set for platforms without GMD_ID */
const struct xe_graphics_desc *graphics;
-- 
2.45.1



Re: [PATCH] drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2

2024-05-21 Thread Linux regression tracking (Thorsten Leemhuis)
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Hmm, from here it looks like the patch, even though it was reviewed more
than a week ago, is still not even in -next. Is there a reason?

I know, we are in the merge window. But at the same time this is a fix
(that already lingered on the lists for way too long before it was
reviewed) for a regression in a somewhat recent kernel, so, in Linus'
own words, it should be "expedited"[1].

Or are we again just missing the right person for the job in the CC?
Adding Dave and Sima just in case.

Ciao, Thorsten

[1]
https://lore.kernel.org/all/CAHk-=wis_qqy4odnynnki5b7qhosmxtoj1jxo5wmb6sruwq...@mail.gmail.com/

On 12.05.24 18:11, Limonciello, Mario wrote:
> On 5/10/2024 4:24 AM, Jani Nikula wrote:
>> On Fri, 10 May 2024, "Lin, Wayne"  wrote:
>>>> -Original Message-
>>>> From: Limonciello, Mario 
>>>> Sent: Friday, May 10, 2024 3:18 AM
>>>> To: Linux regressions mailing list ;
>>>> Wentland, Harry
>>>> ; Lin, Wayne 
>>>> Cc: ly...@redhat.com; imre.d...@intel.com; Leon Weiß
>>>> >>> bochum.de>; sta...@vger.kernel.org; dri-devel@lists.freedesktop.org;
>>>> amd-
>>>> g...@lists.freedesktop.org; intel-...@lists.freedesktop.org
>>>> Subject: Re: [PATCH] drm/mst: Fix NULL pointer dereference at
>>>> drm_dp_add_payload_part2
>>>>
>>>> On 5/9/2024 07:43, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>> On 18.04.24 21:43, Harry Wentland wrote:
>>>>>> On 2024-03-07 01:29, Wayne Lin wrote:
>>>>>>> [Why]
>>>>>>> Commit:
>>>>>>> - commit 5aa1dfcdf0a4 ("drm/mst: Refactor the flow for payload
>>>>>>> allocation/removement") accidently overwrite the commit
>>>>>>> - commit 54d217406afe ("drm: use mgr->dev in drm_dbg_kms in
>>>>>>> drm_dp_add_payload_part2") which cause regression.
>>>>>>>
>>>>>>> [How]
>>>>>>> Recover the original NULL fix and remove the unnecessary input
>>>>>>> parameter 'state' for drm_dp_add_payload_part2().
>>>>>>>
>>>>>>> Fixes: 5aa1dfcdf0a4 ("drm/mst: Refactor the flow for payload
>>>>>>> allocation/removement")
>>>>>>> Reported-by: Leon Weiß 
>>>>>>> Link:
>>>>>>> https://lore.kernel.org/r/38c253ea42072cc825dc969ac4e6b9b600371cc8.c
>>>>>>> a...@ruhr-uni-bochum.de/
>>>>>>> Cc: ly...@redhat.com
>>>>>>> Cc: imre.d...@intel.com
>>>>>>> Cc: sta...@vger.kernel.org
>>>>>>> Cc: regressi...@lists.linux.dev
>>>>>>> Signed-off-by: Wayne Lin 
>>>>>>
>>>>>> I haven't been deep in MST code in a while but this all looks pretty
>>>>>> straightforward and good.
>>>>>>
>>>>>> Reviewed-by: Harry Wentland 
>>>>>
>>>>> Hmmm, that was three weeks ago, but it seems since then nothing
>>>>> happened to fix the linked regression through this or some other
>>>>> patch. Is there a reason? The build failure report from the CI maybe?
>>>>
>>>> It touches files outside of amd but only has an ack from AMD.  I
>>>> think we
>>>> /probably/ want an ack from i915 and nouveau to take it through.
>>>
>>> Thanks, Mario!
>>>
>>> Hi Thorsten,
>>> Yeah, like what Mario said. Would also like to have ack from i915 and
>>> nouveau.
>>
>> It usually works better if you Cc the folks you want an ack from! ;)
>>
>> Acked-by: Jani Nikula 
>>
> 
> Thanks! Can someone with commit permissions take this to drm-misc?
> 
> 
> 


[PATCH v2] drm/bridge: analogix: remove unused struct 'bridge_init'

2024-05-20 Thread linux
From: "Dr. David Alan Gilbert" 

commit 6a1688ae8794 ("drm/bridge: ptn3460: Convert to I2C driver model")
has dropped all the users of the struct bridge_init from the
exynos_dp_core, while retaining unused structure definition.
Later on the driver was reworked and the definition migrated
to the analogix_dp driver. Remove unused struct bridge_init definition.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c 
b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
index df9370e0ff23..1e03f3525a92 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
@@ -36,11 +36,6 @@
 
 static const bool verify_fast_training;
 
-struct bridge_init {
-   struct i2c_client *client;
-   struct device_node *node;
-};
-
 static int analogix_dp_init_dp(struct analogix_dp_device *dp)
 {
int ret;
-- 
2.45.1



[PATCH 2/3] drm/amd/display: remove unused struct 'aux_payloads'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'aux_payloads' is unused since
commit eae5ffa9bd7b ("drm/amd/display: Switch ddc to new aux interface")
Remove it.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c 
b/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c
index c2d40979203e..d6d5bbf2108c 100644
--- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c
@@ -51,10 +51,6 @@ struct i2c_payloads {
struct vector payloads;
 };
 
-struct aux_payloads {
-   struct vector payloads;
-};
-
 static bool i2c_payloads_create(
struct dc_context *ctx,
struct i2c_payloads *payloads,
-- 
2.45.1



[PATCH 3/3] drm/amd/display: remove unused struct 'dc_reg_sequence'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'dc_reg_sequence' was added in
commit 44788bbc309b ("drm/amd/display: refactor reg_update")

but isn't actually used.

Remove it.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/amd/display/dc/dc_helper.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc_helper.c 
b/drivers/gpu/drm/amd/display/dc/dc_helper.c
index 8f9a67825615..b81419c95222 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_helper.c
+++ b/drivers/gpu/drm/amd/display/dc/dc_helper.c
@@ -91,11 +91,6 @@ struct dc_reg_value_masks {
uint32_t mask;
 };
 
-struct dc_reg_sequence {
-   uint32_t addr;
-   struct dc_reg_value_masks value_masks;
-};
-
 static inline void set_reg_field_value_masks(
struct dc_reg_value_masks *field_value_mask,
uint32_t value,
-- 
2.45.1



[PATCH 1/3] drm/amdgpu: remove unused struct 'hqd_registers'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'hqd_registers' used to be used in a member of the 'bonaire_mqd'
struct. 'bonaire_mqd' was removed by
commit 486d807cd9a9 ("drm/amdgpu: remove duplicate definition of
cik_mqd")
It's now unused.

Remove 'hqd_registers' as well.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 38 ---
 1 file changed, 38 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 541dbd70d8c7..f3544f02ffb9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -2757,44 +2757,6 @@ static int gfx_v7_0_mec_init(struct amdgpu_device *adev)
return 0;
 }
 
-struct hqd_registers {
-   u32 cp_mqd_base_addr;
-   u32 cp_mqd_base_addr_hi;
-   u32 cp_hqd_active;
-   u32 cp_hqd_vmid;
-   u32 cp_hqd_persistent_state;
-   u32 cp_hqd_pipe_priority;
-   u32 cp_hqd_queue_priority;
-   u32 cp_hqd_quantum;
-   u32 cp_hqd_pq_base;
-   u32 cp_hqd_pq_base_hi;
-   u32 cp_hqd_pq_rptr;
-   u32 cp_hqd_pq_rptr_report_addr;
-   u32 cp_hqd_pq_rptr_report_addr_hi;
-   u32 cp_hqd_pq_wptr_poll_addr;
-   u32 cp_hqd_pq_wptr_poll_addr_hi;
-   u32 cp_hqd_pq_doorbell_control;
-   u32 cp_hqd_pq_wptr;
-   u32 cp_hqd_pq_control;
-   u32 cp_hqd_ib_base_addr;
-   u32 cp_hqd_ib_base_addr_hi;
-   u32 cp_hqd_ib_rptr;
-   u32 cp_hqd_ib_control;
-   u32 cp_hqd_iq_timer;
-   u32 cp_hqd_iq_rptr;
-   u32 cp_hqd_dequeue_request;
-   u32 cp_hqd_dma_offload;
-   u32 cp_hqd_sema_cmd;
-   u32 cp_hqd_msg_type;
-   u32 cp_hqd_atomic0_preop_lo;
-   u32 cp_hqd_atomic0_preop_hi;
-   u32 cp_hqd_atomic1_preop_lo;
-   u32 cp_hqd_atomic1_preop_hi;
-   u32 cp_hqd_hq_scheduler0;
-   u32 cp_hqd_hq_scheduler1;
-   u32 cp_mqd_control;
-};
-
 static void gfx_v7_0_compute_pipe_init(struct amdgpu_device *adev,
   int mec, int pipe)
 {
-- 
2.45.1



[PATCH 0/3] A bunch of struct removals

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

A bunch of deadcode/struct removals in drm/amd

Signed-off-by: Dr. David Alan Gilbert 


Dr. David Alan Gilbert (3):
  drm/amdgpu: remove unused struct 'hqd_registers'
  drm/amd/display: remove unused struct 'aux_payloads'
  drm/amd/display: remove unused struct 'dc_reg_sequence'

 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 38 ---
 drivers/gpu/drm/amd/display/dc/dc_helper.c|  5 ---
 .../amd/display/dc/link/protocols/link_ddc.c  |  4 --
 3 files changed, 47 deletions(-)

-- 
2.45.1



[PATCH 3/6] drm/vmwgfx: remove unused struct 'vmw_stdu_dma'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'vmw_stdu_dma' is unused since
commit 39985eea5a6d ("drm/vmwgfx: Abstract placement selection")
Remove it.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
index 2041c4d48daa..50022e9e3519 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
@@ -85,11 +85,6 @@ struct vmw_stdu_update {
SVGA3dCmdUpdateGBScreenTarget body;
 };
 
-struct vmw_stdu_dma {
-   SVGA3dCmdHeader header;
-   SVGA3dCmdSurfaceDMA body;
-};
-
 struct vmw_stdu_surface_copy {
SVGA3dCmdHeader  header;
SVGA3dCmdSurfaceCopy body;
-- 
2.45.1



[PATCH 2/6] drm/nouveau: remove unused struct 'init_exec'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'init_exec' is unused since
commit cb75d97e9c77 ("drm/nouveau: implement devinit subdev, and new
init table parser")
Remove it.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/nouveau/nouveau_bios.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bios.c 
b/drivers/gpu/drm/nouveau/nouveau_bios.c
index 79cfab53f80e..8c3c1f1e01c5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bios.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bios.c
@@ -43,11 +43,6 @@
 #define BIOSLOG(sip, fmt, arg...) NV_DEBUG(sip->dev, fmt, ##arg)
 #define LOG_OLD_VALUE(x)
 
-struct init_exec {
-   bool execute;
-   bool repeat;
-};
-
 static bool nv_cksum(const uint8_t *data, unsigned int length)
 {
/*
-- 
2.45.1



[PATCH 1/6] drm/bridge: analogix: remove unused struct 'bridge_init'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'bridge_init' is unused, I think following:
commit 6a1688ae8794 ("drm/bridge: ptn3460: Convert to I2C driver model")
(which is where a git --follow finds it)
Remove it.

Build tested.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c 
b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
index df9370e0ff23..1e03f3525a92 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
@@ -36,11 +36,6 @@
 
 static const bool verify_fast_training;
 
-struct bridge_init {
-   struct i2c_client *client;
-   struct device_node *node;
-};
-
 static int analogix_dp_init_dp(struct analogix_dp_device *dp)
 {
int ret;
-- 
2.45.1



[PATCH] drm/komeda: remove unused struct 'gamma_curve_segment'

2024-05-16 Thread linux
From: "Dr. David Alan Gilbert" 

'gamma_curve_segment' looks like it has never been used.
Remove it.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/arm/display/komeda/komeda_color_mgmt.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_color_mgmt.c 
b/drivers/gpu/drm/arm/display/komeda/komeda_color_mgmt.c
index d8e449e6ebda..50cb8f7ee6b2 100644
--- a/drivers/gpu/drm/arm/display/komeda/komeda_color_mgmt.c
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_color_mgmt.c
@@ -72,11 +72,6 @@ struct gamma_curve_sector {
u32 segment_width;
 };
 
-struct gamma_curve_segment {
-   u32 start;
-   u32 end;
-};
-
 static struct gamma_curve_sector sector_tbl[] = {
{ 0,4,  4   },
{ 16,   4,  4   },
-- 
2.45.0



Re: [PATCH] drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2

2024-05-09 Thread Linux regression tracking (Thorsten Leemhuis)
On 18.04.24 21:43, Harry Wentland wrote:
> On 2024-03-07 01:29, Wayne Lin wrote:
>> [Why]
>> Commit:
>> - commit 5aa1dfcdf0a4 ("drm/mst: Refactor the flow for payload 
>> allocation/removement")
>> accidently overwrite the commit
>> - commit 54d217406afe ("drm: use mgr->dev in drm_dbg_kms in 
>> drm_dp_add_payload_part2")
>> which cause regression.
>>
>> [How]
>> Recover the original NULL fix and remove the unnecessary input parameter 
>> 'state' for
>> drm_dp_add_payload_part2().
>>
>> Fixes: 5aa1dfcdf0a4 ("drm/mst: Refactor the flow for payload 
>> allocation/removement")
>> Reported-by: Leon Weiß 
>> Link: 
>> https://lore.kernel.org/r/38c253ea42072cc825dc969ac4e6b9b600371cc8.ca...@ruhr-uni-bochum.de/
>> Cc: ly...@redhat.com
>> Cc: imre.d...@intel.com
>> Cc: sta...@vger.kernel.org
>> Cc: regressi...@lists.linux.dev
>> Signed-off-by: Wayne Lin 
> 
> I haven't been deep in MST code in a while but this all looks
> pretty straightforward and good.
> 
> Reviewed-by: Harry Wentland 

Hmmm, that was three weeks ago, but it seems since then nothing happened
to fix the linked regression through this or some other patch. Is there
a reason? The build failure report from the CI maybe?

Wayne Lin, do you know what's up?

Ciao, Thorsten

>> ---
>>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 2 +-
>>  drivers/gpu/drm/display/drm_dp_mst_topology.c | 4 +---
>>  drivers/gpu/drm/i915/display/intel_dp_mst.c   | 2 +-
>>  drivers/gpu/drm/nouveau/dispnv50/disp.c   | 2 +-
>>  include/drm/display/drm_dp_mst_helper.h   | 1 -
>>  5 files changed, 4 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
>> index c27063305a13..2c36f3d00ca2 100644
>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
>> @@ -363,7 +363,7 @@ void dm_helpers_dp_mst_send_payload_allocation(
>>  mst_state = to_drm_dp_mst_topology_state(mst_mgr->base.state);
>>  new_payload = drm_atomic_get_mst_payload_state(mst_state, 
>> aconnector->mst_output_port);
>>  
>> -ret = drm_dp_add_payload_part2(mst_mgr, mst_state->base.state, 
>> new_payload);
>> +ret = drm_dp_add_payload_part2(mst_mgr, new_payload);
>>  
>>  if (ret) {
>>  amdgpu_dm_set_mst_status(&aconnector->mst_status,
>> diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c 
>> b/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> index 03d528209426..95fd18f24e94 100644
>> --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> @@ -3421,7 +3421,6 @@ EXPORT_SYMBOL(drm_dp_remove_payload_part2);
>>  /**
>>   * drm_dp_add_payload_part2() - Execute payload update part 2
>>   * @mgr: Manager to use.
>> - * @state: The global atomic state
>>   * @payload: The payload to update
>>   *
>>   * If @payload was successfully assigned a starting time slot by 
>> drm_dp_add_payload_part1(), this
>> @@ -3430,14 +3429,13 @@ EXPORT_SYMBOL(drm_dp_remove_payload_part2);
>>   * Returns: 0 on success, negative error code on failure.
>>   */
>>  int drm_dp_add_payload_part2(struct drm_dp_mst_topology_mgr *mgr,
>> - struct drm_atomic_state *state,
>>   struct drm_dp_mst_atomic_payload *payload)
>>  {
>>  int ret = 0;
>>  
>>  /* Skip failed payloads */
>>  if (payload->payload_allocation_status != 
>> DRM_DP_MST_PAYLOAD_ALLOCATION_DFP) {
>> -drm_dbg_kms(state->dev, "Part 1 of payload creation for %s 
>> failed, skipping part 2\n",
>> +drm_dbg_kms(mgr->dev, "Part 1 of payload creation for %s 
>> failed, skipping part 2\n",
>>  payload->port->connector->name);
>>  return -EIO;
>>  }
>> diff --git a/drivers/gpu/drm/i915/display/intel_dp_mst.c 
>> b/drivers/gpu/drm/i915/display/intel_dp_mst.c
>> index 53aec023ce92..2fba66aec038 100644
>> --- a/drivers/gpu/drm/i915/display/intel_dp_mst.c
>> +++ b/drivers/gpu/drm/i915/display/intel_dp_mst.c
>> @@ -1160,7 +1160,7 @@ static void intel_mst_enable_dp(struct 
>> intel_atomic_state *state,
>>  if (first_mst_stream)
>>  intel_ddi_wait_for_fec_status(encoder, pipe_config, true);
>>  
>> -drm_dp_add_payload_part2(&intel_dp->mst_mgr, &state->base,
>> +drm_dp_add_payload_part2(&intel_dp->mst_mgr,
>>   drm_atomic_get_mst_payload_state(mst_state, 
>> connector->port));
>>  
>>  if (DISPLAY_VER(dev_priv) >= 12)
>> diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c 
>> b/drivers/gpu/drm/nouveau/dispnv50/disp.c
>> index 0c3d88ad0b0e..88728a0b2c25 100644
>> --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
>> +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
>> @@ -915,7 +915,7 @@ nv50_msto_cleanup(struct drm_atomic_state *state,
>>  msto->disabled = false;
>>  
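
The shape of the fix in the quoted diff can be illustrated with a small userspace sketch. All type and function names below (`dev_sketch`, `mgr_sketch`, `payload_sketch`, `add_payload_part2_sketch`) are hypothetical stand-ins, not the DRM API. The sketch assumes, as the patch title suggests, that the dropped `state` pointer could be NULL in some callers, while the manager pointer is always valid.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for the DRM structures; names are illustrative. */
struct dev_sketch { const char *name; };
struct mgr_sketch { struct dev_sketch *dev; };
struct payload_sketch { int part1_ok; const char *connector_name; };

/*
 * Sketch of the fixed signature: no separate "state" argument.
 * The device used for logging is reached through mgr, so the
 * function no longer dereferences a pointer that may be NULL.
 */
int add_payload_part2_sketch(struct mgr_sketch *mgr,
			     const struct payload_sketch *payload)
{
	/* Skip payloads whose part 1 failed, logging via mgr->dev. */
	if (!payload->part1_ok) {
		fprintf(stderr, "[%s] part 1 failed for %s, skipping part 2\n",
			mgr->dev->name, payload->connector_name);
		return -EIO;
	}
	return 0;
}
```

Dropping the parameter, rather than adding a NULL check for it, also means callers such as the i915 and nouveau ones in the diff have one less argument to get wrong.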

Re: [Regression] 6.9.0: WARNING: workqueue: WQ_MEM_RECLAIM ttm:ttm_bo_delayed_delete [ttm] is flushing !WQ_MEM_RECLAIM events:qxl_gc_work [qxl]

2024-05-08 Thread Linux regression tracking (Thorsten Leemhuis)
On 08.05.24 14:35, Anders Blomdell wrote:
> On 2024-05-07 07:04, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 06.05.24 16:30, David Wang wrote:
>>>> On 30.04.24 08:13, David Wang wrote:
>>
>>>> And confirmed that the warning is caused by
>>>> 07ed11afb68d94eadd4ffc082b97c2331307c5ea and reverting it can fix.
>>>
>>> The kernel warning still shows up in 6.9.0-rc7.
>>> (I think 4 high load processes on a 2-Core VM could easily trigger
>>> the kernel warning.)
>>
>> Thx for the report. Linus just reverted the commit 07ed11afb68 you
>> mentioned in your initial mail (I put that quote in again, see above):
>>
>> 3628e0383dd349 ("Reapply "drm/qxl: simplify qxl_fence_wait"")
>> https://git.kernel.org/torvalds/c/3628e0383dd349f02f882e612ab6184e4bb3dc10
>>
>> So this hopefully should be history now.
>>
> Since this affects the 6.8 series (6.8.7 and onwards), I made a CC to
> sta...@vger.kernel.org

Ohh, good idea, I thought Linus had added a stable tag, but that is not
the case. Adding Greg as well and making things explicit:

@Greg: you might want to add 3628e0383dd349 ("Reapply "drm/qxl: simplify
qxl_fence_wait"") to all branches that received 07ed11afb68d94 ("Revert
"drm/qxl: simplify qxl_fence_wait"") (which afaics went into v6.8.7,
v6.6.28, v6.1.87, and v5.15.156).

Ciao, Thorsten


Re: [Regression] 6.9.0: WARNING: workqueue: WQ_MEM_RECLAIM ttm:ttm_bo_delayed_delete [ttm] is flushing !WQ_MEM_RECLAIM events:qxl_gc_work [qxl]

2024-05-06 Thread Linux regression tracking (Thorsten Leemhuis)



On 06.05.24 16:30, David Wang wrote:
>> On 30.04.24 08:13, David Wang wrote:

>> And confirmed that the warning is caused by
>> 07ed11afb68d94eadd4ffc082b97c2331307c5ea and reverting it can fix.
>
> The kernel warning still shows up in 6.9.0-rc7.
> (I think 4 high load processes on a 2-Core VM could easily trigger the kernel 
> warning.)

Thx for the report. Linus just reverted the commit 07ed11afb68 you
mentioned in your initial mail (I put that quote in again, see above):

3628e0383dd349 ("Reapply "drm/qxl: simplify qxl_fence_wait"")
https://git.kernel.org/torvalds/c/3628e0383dd349f02f882e612ab6184e4bb3dc10

So this hopefully should be history now.

Ciao, Thorsten


Re: nouveau: r535.c:1266:3: error: label at end of compound statement default: with gcc-8

2024-04-29 Thread Linux regression tracking (Thorsten Leemhuis)



On 29.04.24 17:06, Naresh Kamboju wrote:
> The following build errors were noticed on the Linux next-20240429 tag on
> arm64, arm and riscv with gcc-8; gcc-13 builds pass.
> 
> Reported-by: Linux Kernel Functional Testing 
> 
> Commit id:
>  b58a0bc904ff nouveau: add command-line GSP-RM registry support
> 
> Builds:
> --
>   gcc-8-arm64-defconfig - Fail
>   gcc-8-arm-defconfig - Fail
>   gcc-8-riscv-defconfig - Fail
> 
> Build log:
> 
> drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c: In function 'build_registry':
> drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1266:3: error: label at
> end of compound statement
>default:
>^~~
> make[7]: *** [scripts/Makefile.build:244:
> drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.o] Error 1

TWIMC, there is another report about this in this thread (sadly some of
its posts did not make it to lore):

https://lore.kernel.org/all/162ef3c0-1d7b-4220-a21f-b0008657f...@redhat.com/

Ciao, Thorsten
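
For readers unfamiliar with this gcc-8 error: C standards before C23 require every label, including `default:`, to be followed by a statement, so a label placed directly before the closing brace of a `switch` is rejected by older compilers. The sketch below is not the r535.c code; `reg_type_sketch` and `entry_size_sketch` are hypothetical names used only to show a form that older compilers accept.

```c
/* Hypothetical registry entry types; not taken from r535.c. */
enum reg_type_sketch { REG_DWORD_SKETCH, REG_STRING_SKETCH, REG_OTHER_SKETCH };

int entry_size_sketch(enum reg_type_sketch type, int payload_len)
{
	int size = 0;

	switch (type) {
	case REG_DWORD_SKETCH:
		size = 4;
		break;
	case REG_STRING_SKETCH:
		size = payload_len + 1; /* room for the terminating NUL */
		break;
	default:
		/*
		 * A bare "default:" directly before the closing brace is
		 * what gcc-8 rejects. An explicit statement such as
		 * "break;" keeps the switch valid for older compilers.
		 */
		break;
	}
	return size;
}
```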

> metadata:
>   git_describe: next-20240429
>   git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
>   git_short_log: b0a2c79c6f35 ("Add linux-next specific files for 20240429")
>   arch: arm64, arm, riscv
>   toolchain: gcc-8
> 
> Steps to reproduce:
> 
> # tuxmake --runtime podman --target-arch arm64 --toolchain gcc-8
> --kconfig defconfig
> 
> Links:
>  - 
> https://storage.tuxsuite.com/public/linaro/lkft/builds/2flcoOuqVJfhTvX4AOYsWMd5hqe/
>  - 
> https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240429/testrun/23704376/suite/build/test/gcc-8-defconfig/history/
>  - 
> https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240429/testrun/23705756/suite/build/test/gcc-8-defconfig/details/
> 
> 
> --
> Linaro LKFT
> https://lkft.linaro.org
> 
> 


Re: [REGRESSION] external monitor+Dell dock in 6.8

2024-04-02 Thread Linux regression tracking (Thorsten Leemhuis)
[Adding a few folks and list while dropping the stable list, as this is
unrelated to it]

On 31.03.24 07:59, Andrei Gaponenko wrote:
> 
> I noticed a regression with the mainline kernel pre-compiled by EPEL.
> I have just tried linux-6.9-rc1.tar.gz from kernel.org, and it still
> misbehaves.
> 
> The default setup: a laptop is connected to a dock, Dell WD22TB4, via
> a USB-C cable.  The dock is connected to an external monitor via a
> Display Port cable.  With a "good" kernel everything works.  With a
> "broken" kernel, the external monitor is still correctly identified by
> the system, and is shown as enabled in plasma systemsettings. The
> system also behaves like the monitor is working, for example, one can
> move the mouse pointer off the laptop screen.  However the external
> monitor screen stays black, and it eventually goes to sleep.

Just a quick heads up to ensure people are aware of it:

Imre Deak, turns out this is caused by a patch of yours: 55eaef16417448
("drm/i915/dp_mst: Handle the Synaptics HBlank expansion quirk"). Andrei
Gaponenko meanwhile filed a ticket about it here:

https://gitlab.freedesktop.org/drm/intel/-/issues/10637

Ciao, Thorsten

> Everything worked with EPEL mainline kernels up to and including
> kernel-ml-6.7.9-1.el9.elrepo.x86_64
> 
> The breakage is observed in
> 
> kernel-ml-6.8.1-1.el9.elrepo.x86_64
> kernel-ml-6.8.2-1.el9.elrepo.x86_64
> linux-6.9-rc1.tar.gz from kernel.org (with olddefconfig)
> 
> Other tests: using an HDMI cable instead of the Display Port cable
> between the monitor and the dock does not change things, black screen
> with the newer kernels.
> 
> Using a small HDMI-to-USB-C adapter instead of the dock results in a
> working system, even with the newer kernels.  So the breakage appears
> to be specific to the Dell WD22TB4 dock.
> 
> Operating System: AlmaLinux 9.3 (Shamrock Pampas Cat)
> 
> uname -mi: x86_64 x86_64
> 
> Laptop: Dell Precision 5470/02RK6V
> 
> lsusb |grep dock
> Bus 003 Device 007: ID 413c:b06e Dell Computer Corp. Dell dock
> Bus 003 Device 008: ID 413c:b06f Dell Computer Corp. Dell dock
> Bus 003 Device 006: ID 0bda:5413 Realtek Semiconductor Corp. Dell dock
> Bus 003 Device 005: ID 0bda:5487 Realtek Semiconductor Corp. Dell dock
> Bus 002 Device 004: ID 0bda:0413 Realtek Semiconductor Corp. Dell dock
> Bus 002 Device 003: ID 0bda:0487 Realtek Semiconductor Corp. Dell dock
> 
> dmesg and kernel config are attached to 
> https://bugzilla.kernel.org/show_bug.cgi?id=218663
> 
> #regzbot introduced: v6.7.9..v6.8.1

P.S.:

#regzbot duplicate: https://bugzilla.kernel.org/show_bug.cgi?id=218663
#regzbot duplicate: https://gitlab.freedesktop.org/drm/intel/-/issues/10637
#regzbot title: drm/i915/dp_mst: external monitor on Dell dock broke


Re: [PATCH 1/1] drm/qxl: fixes qxl_fence_wait

2024-03-20 Thread Linux regression tracking (Thorsten Leemhuis)
On 08.03.24 02:08, Alex Constantino wrote:
> Fix OOM scenario by doing multiple notifications to the OOM handler through
> a busy wait logic.
> Changes from commit 5a838e5d5825 ("drm/qxl: simplify qxl_fence_wait") would
> result in a '[TTM] Buffer eviction failed' exception whenever it reached a
> timeout.
> 
> Fixes: 5a838e5d5825 ("drm/qxl: simplify qxl_fence_wait")
> Link: 
> https://lore.kernel.org/regressions/fb0fda6a-3750-4e1b-893f-97a3e402b...@leemhuis.info
> Reported-by: Timo Lindfors 
> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054514
> Signed-off-by: Alex Constantino 
> ---
>  drivers/gpu/drm/qxl/qxl_release.c | 20 ++--
>  1 file changed, 14 insertions(+), 6 deletions(-)

Hey Dave and Gerd as well as Thomas, Maarten and Maxime (the latter two
I just added to the CC), it seems to me this regression fix did not
make any progress since it was posted. Did I miss something, is it just
"we are busy with the merge window", or is there some other reason?
Just wondering, I just saw someone on a Fedora IRC channel complaining
about the regression, that's why I'm asking. Would be really good to
finally get this resolved...

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

> diff --git a/drivers/gpu/drm/qxl/qxl_release.c 
> b/drivers/gpu/drm/qxl/qxl_release.c
> index 368d26da0d6a..51c22e7f9647 100644
> --- a/drivers/gpu/drm/qxl/qxl_release.c
> +++ b/drivers/gpu/drm/qxl/qxl_release.c
> @@ -20,8 +20,6 @@
>   * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>   */
>  
> -#include 
> -
>  #include 
>  
>  #include "qxl_drv.h"
> @@ -59,14 +57,24 @@ static long qxl_fence_wait(struct dma_fence *fence, bool 
> intr,
>  {
>   struct qxl_device *qdev;
>   unsigned long cur, end = jiffies + timeout;
> + signed long iterations = 1;
> + signed long timeout_fraction = timeout;
>  
>   qdev = container_of(fence->lock, struct qxl_device, release_lock);
>  
> - if (!wait_event_timeout(qdev->release_event,
> + // using HZ as a factor since it is used in ttm_bo_wait_ctx too
> + if (timeout_fraction > HZ) {
> + iterations = timeout_fraction / HZ;
> + timeout_fraction = HZ;
> + }
> + for (int i = 0; i < iterations; i++) {
> + if (wait_event_timeout(
> + qdev->release_event,
>   (dma_fence_is_signaled(fence) ||
> -  (qxl_io_notify_oom(qdev), 0)),
> - timeout))
> - return 0;
> + (qxl_io_notify_oom(qdev), 0)),
> + timeout_fraction))
> + break;
> + }
>  
>   cur = jiffies;
>   if (time_after(cur, end))
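
The arithmetic in the quoted hunk can be tried out in userspace. What follows is only a sketch under stated assumptions, not the kernel code: `SKETCH_HZ`, `plan_wait()`, `run_plan()` and `signal_on_nth_call()` are hypothetical names, 250 is an arbitrary stand-in for HZ, and the callback merely imitates a wait that reports success once the fence has signalled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's HZ (ticks per second). */
#define SKETCH_HZ 250L

/* Result of splitting one long timeout into per-iteration slices. */
struct wait_plan {
	long iterations;	/* how many waits to issue */
	long slice;		/* timeout of each wait, in ticks */
};

/* Same arithmetic as the hunk: waits longer than HZ become HZ-sized slices. */
struct wait_plan plan_wait(long timeout)
{
	struct wait_plan p = { .iterations = 1, .slice = timeout };

	assert(timeout >= 0);
	if (timeout > SKETCH_HZ) {
		p.iterations = timeout / SKETCH_HZ;	/* remainder is dropped */
		p.slice = SKETCH_HZ;
	}
	return p;
}

/* Hypothetical callback: returns true once the fence signalled within `ticks`. */
typedef bool (*wait_fn)(long ticks, void *ctx);

/* Issues the waits one by one and stops early on success; returns waits used. */
long run_plan(struct wait_plan p, wait_fn wait_once, void *ctx)
{
	long used = 0;

	for (long i = 0; i < p.iterations; i++) {
		used++;
		if (wait_once(p.slice, ctx))
			break;
	}
	return used;
}

/* Example callback: pretends the fence signals on the N-th call (N in *ctx). */
bool signal_on_nth_call(long ticks, void *ctx)
{
	int *remaining = ctx;

	(void)ticks;
	return --(*remaining) <= 0;
}
```

With these assumptions, a timeout of 1000 ticks becomes four waits of 250 ticks each, so the condition (and with it the OOM notification) is re-evaluated at least once per slice. Because of the integer division, a timeout of 1100 ticks also yields four slices, i.e. the summed wait can be somewhat shorter than the requested timeout.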


Re: [PATCH] Fix divide-by-zero on DP unplug with nouveau

2024-03-11 Thread Linux regression tracking (Thorsten Leemhuis)
On 11.03.24 17:09, Imre Deak wrote:
> On Sat, Feb 10, 2024 at 09:24:59PM +, Chris Bainbridge wrote:
> Sorry for the delay.

Happens, thx for looking into this!

>> The following trace occurs when using nouveau and unplugging a DP MST
>> adaptor:
> [...] 
>> +if (bpp_x16 == 0)
>> +return 0;
> 
> Could you please move the check to the beginning of the function and add
> a debug message in case bpp_x16 is 0?
> 
> It looks odd that a driver calls this function with a 0 bpp_x16, and
> ideally it should be fixed in the driver. However as it's a regression
> and we don't have a better idea now:
> 
> Acked-by: Imre Deak 

Chris: as this went into 6.8, please consider adding a stable-tag to
ensure Greg picks this up.

Ciao, Thorsten
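
The review comment above (check for a zero bpp_x16 at the top of the function and emit a debug message) could look roughly like the following userspace sketch. It is not the real drm_dp_bw_overhead(): `overhead_ratio()` and its rounding-up division are assumptions made purely for illustration, and `fprintf()` stands in for a kernel debug print.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, NOT the real drm_dp_bw_overhead(): returns
 * numerator / bpp_x16 rounded up, but bails out before dividing
 * when the divisor is zero.
 */
uint64_t overhead_ratio(uint64_t numerator, uint32_t bpp_x16)
{
	/* Guard first, so nothing below can ever divide by zero. */
	if (bpp_x16 == 0) {
		fprintf(stderr, "overhead_ratio: bpp_x16 is 0, returning 0\n");
		return 0;
	}

	return (numerator + bpp_x16 - 1) / bpp_x16;
}

/* Small self-check showing the intended behaviour. */
void overhead_ratio_selfcheck(void)
{
	assert(overhead_ratio(100, 0) == 0);	/* guarded, no trap */
	assert(overhead_ratio(96, 16) == 6);	/* exact division */
	assert(overhead_ratio(100, 16) == 7);	/* rounded up */
}
```

Putting the guard first keeps every later division safe and leaves a trace that a caller passed an unexpected value, which is what the review asks for while the driver-side cause is still unknown.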



Re: [REGRESSION] Divide-by-zero on DisplayPort MST unplug with nouveau

2024-03-11 Thread Linux regression tracking (Thorsten Leemhuis)
On 07.03.24 18:58, Chris Bainbridge wrote:
> ----- Forwarded message from Chris Bainbridge -----
> 
> Date: Sat, 10 Feb 2024 21:24:59 +

Hmm, it looks like nobody is looking into this regression. Is there a
good reason?

Imre, or did you maybe just miss that Chris' regression seems to be
caused by a commit of yours? He initially proposed a fix (the forwarded
mail that is quoted here) more than a month ago already here:
https://lore.kernel.org/all/ZcfpqwnkSoiJxeT9@debian.local/

Chris recently filed a ticket, too:
https://gitlab.freedesktop.org/drm/misc/kernel/-/issues/36

Mostly silence there as well. :-/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S: Chris, sorry, I had missed that you initially proposed the fix a
month ago; if I had noticed this earlier I would have sent a mail like
this one sooner.
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

> From: Chris Bainbridge 
> To: dri-devel@lists.freedesktop.org
> Cc: ly...@redhat.com, ville.syrj...@linux.intel.com, 
> stanislav.lisovs...@intel.com,
>   mrip...@kernel.org, imre.d...@intel.com
> Subject: [PATCH] Fix divide-by-zero on DP unplug with nouveau
> 
> The following trace occurs when using nouveau and unplugging a DP MST
> adaptor:
>  divide error:  [#1] PREEMPT SMP PTI
>  CPU: 7 PID: 2962 Comm: Xorg Not tainted 6.8.0-rc3+ #744
>  Hardware name: Razer Blade/DANA_MB, BIOS 01.01 08/31/2018
>  RIP: 0010:drm_dp_bw_overhead+0xb4/0x110 [drm_display_helper]
>  Code: c6 b8 01 00 00 00 75 61 01 c6 41 0f af f3 41 0f af f1 c1 e1 04 48 63 
> c7 31 d2 89 ff 48 8b 5d f8 c9 48 0f af f1 48 8d 44 06 ff <48> f7 f7 31 d2 31 
> c9 31 f6 31 ff 45 31 c0 45 31 c9 45 31 d2 45 31
>  RSP: 0018:b2c5c211fa30 EFLAGS: 00010206
>  RAX:  RBX:  RCX: 00f59b00
>  RDX:  RSI:  RDI: 
>  RBP: b2c5c211fa48 R08: 0001 R09: 0020
>  R10: 0004 R11:  R12: 00023b4a
>  R13: 91d37d165800 R14: 91d36fac6d80 R15: 91d34a764010
>  FS:  7f4a1ca3fa80() GS:91d6edbc() knlGS:
>  CS:  0010 DS:  ES:  CR0: 80050033
>  CR2: 559491d49000 CR3: 00011d180002 CR4: 003706f0
>  Call Trace:
>   
>   ? show_regs+0x6d/0x80
>   ? die+0x37/0xa0
>   ? do_trap+0xd4/0xf0
>   ? do_error_trap+0x71/0xb0
>   ? drm_dp_bw_overhead+0xb4/0x110 [drm_display_helper]
>   ? exc_divide_error+0x3a/0x70
>   ? drm_dp_bw_overhead+0xb4/0x110 [drm_display_helper]
>   ? asm_exc_divide_error+0x1b/0x20
>   ? drm_dp_bw_overhead+0xb4/0x110 [drm_display_helper]
>   ? drm_dp_calc_pbn_mode+0x2e/0x70 [drm_display_helper]
>   nv50_msto_atomic_check+0xda/0x120 [nouveau]
>   drm_atomic_helper_check_modeset+0xa87/0xdf0 [drm_kms_helper]
>   drm_atomic_helper_check+0x19/0xa0 [drm_kms_helper]
>   nv50_disp_atomic_check+0x13f/0x2f0 [nouveau]
>   drm_atomic_check_only+0x668/0xb20 [drm]
>   ? drm_connector_list_iter_next+0x86/0xc0 [drm]
>   drm_atomic_commit+0x58/0xd0 [drm]
>   ? __pfx___drm_printfn_info+0x10/0x10 [drm]
>   drm_atomic_connector_commit_dpms+0xd7/0x100 [drm]
>   drm_mode_obj_set_property_ioctl+0x1c5/0x450 [drm]
>   ? __pfx_drm_connector_property_set_ioctl+0x10/0x10 [drm]
>   drm_connector_property_set_ioctl+0x3b/0x60 [drm]
>   drm_ioctl_kernel+0xb9/0x120 [drm]
>   drm_ioctl+0x2d0/0x550 [drm]
>   ? __pfx_drm_connector_property_set_ioctl+0x10/0x10 [drm]
>   nouveau_drm_ioctl+0x61/0xc0 [nouveau]
>   __x64_sys_ioctl+0xa0/0xf0
>   do_syscall_64+0x76/0x140
>   ? do_syscall_64+0x85/0x140
>   ? do_syscall_64+0x85/0x140
>   entry_SYSCALL_64_after_hwframe+0x6e/0x76
>  RIP: 0033:0x7f4a1cd1a94f
>  Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 
> 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 
> ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
>  RSP: 002b:7ffd2f1df520 EFLAGS: 0246 ORIG_RAX: 0010
>  RAX: ffda RBX: 7ffd2f1df5b0 RCX: 7f4a1cd1a94f
>  RDX: 7ffd2f1df5b0 RSI: c01064ab RDI: 000f
>  RBP: c01064ab R08: 56347932deb8 R09: 56347a7d99c0
>  R10:  R11: 0246 R12: 56347938a220
>  R13: 000f R14: 563479d9f3f0 R15: 
>   
>  Modules linked in: rfcomm xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat 
> nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user 
> xfrm_algo xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp 
> llc ccm cmac algif_hash overlay algif_skcipher af_alg bnep 

Re: [pull] drm/msm: drm-msm-next-2024-02-29 for v6.9

2024-03-05 Thread Linux regression tracking (Thorsten Leemhuis)
On 29.02.24 20:04, Rob Clark wrote:
> 
> This is the main pull for v6.9, description below.
> 
> [...]
>
> GPU:
> - fix sc7180 UBWC config

Why was that queued for 6.9? That is a fix for a 6.8 regression that for
untrained eyes like mine does not look overly dangerous (but of course I
might be wrong with that).

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


Re: [PATCH] drm/nouveau: keep DMA buffers required for suspend/resume

2024-03-03 Thread Linux regression tracking (Thorsten Leemhuis)
[adding a bunch of lists and people as well as Timur Tabi, who authored
the culprit]

Sid Pranjale, thx for the report. FWIW, I'm just replying to add this to
the regression tracking to ensure it does not fall through the cracks.
Nevertheless let me mention two things while at it:

On 29.02.24 18:58, Sid Pranjale wrote:
> Nouveau deallocates a few buffers post GPU init which are required for GPU 
> suspend/resume to function correctly.
> This is likely not as big an issue on systems where the NVGPU is the only 
> GPU, but on multi-GPU set ups it leads to a regression where the kernel 
> module errors and results in a system-wide rendering freeze.

These lines are too long, see
Documentation/process/submitting-patches.rst for details.

> This commit addresses that regression by moving the two buffers required for 
> suspend and resume to be deallocated at driver unload instead of post init.
> 
> Fixes: 042b5f8 ("drm/nouveau: fix several DMA buffer leaks")

And that should be:

Fixes: 042b5f83841fbf ("drm/nouveau: fix several DMA buffer leaks")
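
For illustration, the rule behind that correction (Documentation/process/submitting-patches.rst asks for at least the first 12 characters of the commit ID, followed by the subject in parentheses and quotes) can be expressed as a small check. This is a sketch only: `fixes_tag_ok()` and `MIN_HASH_LEN` are hypothetical names, and the function is deliberately stricter and simpler than what tools like checkpatch actually do.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Minimum abbreviated hash length asked for by submitting-patches.rst. */
#define MIN_HASH_LEN 12

/*
 * Hypothetical checker: accepts lines of the form
 *   Fixes: <at least 12 hex digits> ("subject")
 */
bool fixes_tag_ok(const char *line)
{
	static const char prefix[] = "Fixes: ";
	size_t n = 0;
	size_t len;

	assert(line != NULL);
	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;
	line += sizeof(prefix) - 1;

	/* Count the hex digits of the abbreviated commit ID. */
	while (isxdigit((unsigned char)line[n]))
		n++;
	if (n < MIN_HASH_LEN)
		return false;

	/* The hash must be followed by ` ("non-empty subject")`. */
	line += n;
	len = strlen(line);
	return len >= 6 && strncmp(line, " (\"", 3) == 0 &&
	       strcmp(line + len - 2, "\")") == 0;
}
```

Fed the two variants from this mail, the sketch rejects the tag with the 7-character hash and accepts the one with the 14-character hash.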

> Signed-off-by: Sid Pranjale 
> ---
>  drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
> index a64c81385..a73a5b589 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
> @@ -1054,8 +1054,6 @@ r535_gsp_postinit(struct nvkm_gsp *gsp)
>   /* Release the DMA buffers that were needed only for boot and init */
>   nvkm_gsp_mem_dtor(gsp, &gsp->boot.fw);
>   nvkm_gsp_mem_dtor(gsp, &gsp->libos);
> - nvkm_gsp_mem_dtor(gsp, &gsp->rmargs);
> - nvkm_gsp_mem_dtor(gsp, &gsp->wpr_meta);
>  
>   return ret;
>  }
> @@ -2163,6 +2161,8 @@ r535_gsp_dtor(struct nvkm_gsp *gsp)
>  
>   r535_gsp_dtor_fws(gsp);
>  
> + nvkm_gsp_mem_dtor(gsp, &gsp->rmargs);
> + nvkm_gsp_mem_dtor(gsp, &gsp->wpr_meta);
>   nvkm_gsp_mem_dtor(gsp, &gsp->shm.mem);
>   nvkm_gsp_mem_dtor(gsp, &gsp->loginit);
>   nvkm_gsp_mem_dtor(gsp, &gsp->logintr);
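
The lifetime change made by the quoted diff can be modelled in a few lines of userspace C. This is a rough sketch under stated assumptions, not nouveau code: `struct sketch_gsp` and all `sketch_*` functions are hypothetical, and plain `malloc()`/`free()` stand in for the DMA buffer helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for the driver state. */
struct sketch_gsp {
	void *boot_fw;	/* needed only for boot */
	void *libos;	/* needed only for init */
	void *rmargs;	/* needed again for suspend/resume */
	void *wpr_meta;	/* needed again for suspend/resume */
};

/* Frees one buffer and clears the pointer, so a second call is harmless. */
static void buf_free(void **buf)
{
	free(*buf);
	*buf = NULL;
}

/* Driver unload: release everything that is still allocated. */
void sketch_dtor(struct sketch_gsp *g)
{
	assert(g != NULL);
	buf_free(&g->rmargs);
	buf_free(&g->wpr_meta);
	buf_free(&g->boot_fw);
	buf_free(&g->libos);
}

/* Allocates all four buffers; cleans up and fails if any allocation fails. */
bool sketch_init(struct sketch_gsp *g)
{
	g->boot_fw = malloc(16);
	g->libos = malloc(16);
	g->rmargs = malloc(16);
	g->wpr_meta = malloc(16);
	if (g->boot_fw && g->libos && g->rmargs && g->wpr_meta)
		return true;
	sketch_dtor(g);
	return false;
}

/* After init: release only the buffers that are never used again. */
void sketch_postinit(struct sketch_gsp *g)
{
	buf_free(&g->boot_fw);
	buf_free(&g->libos);
}

/* Suspend/resume can only work while the long-lived buffers exist. */
bool sketch_can_resume(const struct sketch_gsp *g)
{
	return g->rmargs != NULL && g->wpr_meta != NULL;
}
```

In this model `sketch_can_resume()` still returns true after `sketch_postinit()` and only turns false after `sketch_dtor()`, which mirrors the intent described in the commit message: the two buffers needed for suspend and resume live until driver unload.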

To be sure the issue doesn't fall through the cracks unnoticed, I'm
adding it to regzbot, the Linux kernel regression tracking bot:

#regzbot ^introduced 042b5f83841fbf
#regzbot title drm/nouveau: rendering freezes with multi-GPU setup
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: drm/msm: VT console DisplayPort regression in 6.8-rc1

2024-02-27 Thread Linux regression tracking #update (Thorsten Leemhuis)
[sent with a reduced set of recipients, we all get enough mail already]

On 27.02.24 13:40, Johan Hovold wrote:
> 
> Since 6.8-rc1 the VT console is no longer mirrored on an external
> display on coldplug or hotplug on the Lenovo ThinkPad X13s.
>

Thx for the report!

> I've previously reported this here:
> 
>   https://gitlab.freedesktop.org/drm/msm/-/issues/50

Then let's tell regzbot about it as well, in case the ticket comes back
to life now:

#regzbot duplicate: https://gitlab.freedesktop.org/drm/msm/-/issues/50

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


Re: Bug#1061449: linux-image-6.7-amd64: a boot message from amdgpu

2024-02-14 Thread Linux regression tracking #update (Thorsten Leemhuis)
On 27.01.24 14:14, Salvatore Bonaccorso wrote:
> 
> In Debian (https://bugs.debian.org/1061449) we got the following
> quoted report:
> 
> On Wed, Jan 24, 2024 at 07:38:16PM +0100, Patrice Duroux wrote:
>> Package: src:linux
>> Version: 6.7.1-1~exp1
>> Severity: normal
>>
>> Giving a try to 6.7, here is a message extracted from dmesg:
>>
>> [4.177226] ------------[ cut here ]------------
>> [4.177227] WARNING: CPU: 6 PID: 248 at
>> drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_factory.c:387
>> construct_phy+0xb26/0xd60 [amdgpu]
> 
> Analysis showed that this appears to be a regression from b17ef04bf3a4
> ("drm/amd/display: Pass pwrseq inst for backlight and ABM"). Does that
> ring some bells?
> 
> See: https://bugs.debian.org/1061449#27
> 
> #regzbot introduced: b17ef04bf3a4
> #regzbot link: https://bugs.debian.org/1061449
> #regzbot title: Regression by b17ef04bf3a4 ("drm/amd/display: Pass pwrseq 
> inst for backlight and ABM")

#regzbot monitor:
https://lore.kernel.org/amd-gfx/20240214184006.1356137-8-rodrigo.sique...@amd.com/
#regzbot fix: drm/amd/display: Only allow dig mapping to pwrseq in new asic
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: drm/msm: DisplayPort regressions in 6.8-rc1

2024-02-14 Thread Linux regression tracking (Thorsten Leemhuis)
On 13.02.24 19:00, Abhinav Kumar wrote:
> 
> Thanks for the report.
> 
> I do agree that pm runtime eDP driver got merged that time but I think
> the issue is either a combination of that along with DRM aux bridge
> https://patchwork.freedesktop.org/series/122584/ OR just the latter as
> even that went in around the same time.

In that case allow me a stupid question from the cheap seats:

Is there anything affected users can do to help get us closer to the
real problem? Like testing a specific commit or two before or after the
merge of one of those features for example? That might help to rule out
a few things.

Ciao, Thorsten

> That's why perhaps this issue was not seen with the chromebooks we tested
> on as they do not use pmic_glink (aux bridge).
> 
> So we will need to debug this on sc8280xp specifically or an equivalent
> device which uses aux bridge.
> 
> On 2/13/2024 3:42 AM, Johan Hovold wrote:
>> Hi,
>>
>> Since 6.8-rc1 the internal eDP display on the Lenovo ThinkPad X13s does
>> not always show up on boot.
>>
>> The logs indicate problems with the runtime PM and eDP rework that went
>> into 6.8-rc1:
>>
>> [    6.006236] Console: switching to colour dummy device 80x25
>> [    6.007542] [drm:dpu_kms_hw_init:1048] dpu hardware
>> revision:0x8000
>> [    6.007872] [drm:drm_bridge_attach [drm]] *ERROR* failed to
>> attach bridge /soc@0/phy@88eb000 to encoder TMDS-31: -16
>> [    6.007934] [drm:dp_bridge_init [msm]] *ERROR* failed to attach
>> panel bridge: -16
>> [    6.007983] msm_dpu ae01000.display-controller:
>> [drm:msm_dp_modeset_init [msm]] *ERROR* failed to create dp bridge: -16
>> [    6.008030] [drm:_dpu_kms_initialize_displayport:588] [dpu
>> error]modeset_init failed for DP, rc = -16
>> [    6.008050] [drm:_dpu_kms_setup_displays:681] [dpu
>> error]initialize_DP failed, rc = -16
>> [    6.008068] [drm:dpu_kms_hw_init:1153] [dpu error]modeset init
>> failed: -16
>> [    6.008388] msm_dpu ae01000.display-controller:
>> [drm:msm_drm_kms_init [msm]] *ERROR* kms hw init failed: -16
>> 
>> and this can also manifest itself as a NULL-pointer dereference:
>>
>> [    7.339447] Unable to handle kernel NULL pointer dereference at
>> virtual address 
>> 
>> [    7.643705] pc : drm_bridge_attach+0x70/0x1a8 [drm]
>> [    7.686415] lr : drm_aux_bridge_attach+0x24/0x38 [aux_bridge]
>> 
>> [    7.769039] Call trace:
>> [    7.771564]  drm_bridge_attach+0x70/0x1a8 [drm]
>> [    7.776234]  drm_aux_bridge_attach+0x24/0x38 [aux_bridge]
>> [    7.781782]  drm_bridge_attach+0x80/0x1a8 [drm]
>> [    7.786454]  dp_bridge_init+0xa8/0x15c [msm]
>> [    7.790856]  msm_dp_modeset_init+0x28/0xc4 [msm]
>> [    7.795617]  _dpu_kms_drm_obj_init+0x19c/0x680 [msm]
>> [    7.800731]  dpu_kms_hw_init+0x348/0x4c4 [msm]
>> [    7.805306]  msm_drm_kms_init+0x84/0x324 [msm]
>> [    7.809891]  msm_drm_bind+0x1d8/0x3a8 [msm]
>> [    7.814196]  try_to_bring_up_aggregate_device+0x1f0/0x2f8
>> [    7.819747]  __component_add+0xa4/0x18c
>> [    7.823703]  component_add+0x14/0x20
>> [    7.827389]  dp_display_probe+0x47c/0x568 [msm]
>> [    7.832052]  platform_probe+0x68/0xd8
>>
>> Users have also reported random crashes at boot since 6.8-rc1, and I've
>> been able to trigger hard crashes twice when testing an external display
>> (USB-C/DP), which may also be related to the DP regressions.
>>
>> I've opened an issue here:
>>
>> https://gitlab.freedesktop.org/drm/msm/-/issues/51
>>
>> but I also want Thorsten's help to track this so that it gets fixed
>> before 6.8 is released.
>>
>> #regzbot introduced: v6.7..v6.8-rc1
>>
>> The following series is likely the culprit:
>>
>> 
>> https://lore.kernel.org/all/1701472789-25951-1-git-send-email-quic_khs...@quicinc.com/
>>
>> Johan
> 
> 


Re: Bug#1061449: linux-image-6.7-amd64: a boot message from amdgpu

2024-01-28 Thread Linux regression tracking (Thorsten Leemhuis)
On 27.01.24 14:14, Salvatore Bonaccorso wrote:
>
> In Debian (https://bugs.debian.org/1061449) we got the following
> quoted report:
> 
> On Wed, Jan 24, 2024 at 07:38:16PM +0100, Patrice Duroux wrote:
>>
>> Giving a try to 6.7, here is a message extracted from dmesg:
>> [4.177226] ------------[ cut here ]------------
>> [4.177227] WARNING: CPU: 6 PID: 248 at
>> drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_factory.c:387
>> construct_phy+0xb26/0xd60 [amdgpu]
> [...]

Not my area of expertise, but looks a lot like a duplicate of
https://gitlab.freedesktop.org/drm/amd/-/issues/3122#note_2252835

Mario (now CCed) already prepared a patch for that issue that seems to work.

HTH, Ciao, Thorsten


Re: [REGRESSION] rx7600 stopped working after "1cfb4d612127 drm/amdgpu: put MQDs in VRAM"

2023-12-06 Thread Linux regression tracking #update (Thorsten Leemhuis)
[TLDR: This mail is primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 26.10.23 19:33, Alexey Klimov wrote:
> #regzbot introduced: 1cfb4d612127
> #regzbot title: rx7600 stopped working after "1cfb4d612127 drm/amdgpu: put 
> MQDs in VRAM"
> 
> Hi all,
> 
> I've been playing with RX7600 and it was observed that amdgpu stopped working 
> between kernel 6.2 and 6.5.
> Then I narrowed it down to 6.4 <-> 6.5-rc1 and finally bisect pointed at 
> 1cfb4d6121276a829aa94d0e32a7f5e1830ebc21
> And I manually checked if it boots/works on the previous commit and the 
> mentioned one.

#regzbot fix: ba0fb4b48c19a
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: Bug#1054514: linux-image-6.1.0-13-amd64: Debian VM with qxl graphics freezes frequently

2023-12-06 Thread Linux regression tracking (Thorsten Leemhuis)
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Gerd, it seems this regression[1] fell through the cracks. Could you
please take a look? Or is there a good reason why this can't be
addressed? Or was it dealt with and I just missed it?

[1] apparently caused by 5a838e5d5825c8 ("drm/qxl: simplify
qxl_fence_wait") [v5.13-rc1] from Gerd; for details see
https://lore.kernel.org/regressions/ztgydqrlk6wx_...@eldamar.lan/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 24.10.23 23:39, Timo Lindfors wrote:
> Hi,
> 
> On Tue, 24 Oct 2023, Salvatore Bonaccorso wrote:
>> Thanks for the excellently constructed report! I think it's best to
>> forward this directly to upstream including the people for the
>> bisected commit to get some idea.
> 
> Thanks for the quick reply!
> 
>> Can you reproduce the issue with 6.5.8-1 in unstable as well?
> 
> Unfortunately yes:
> 
> ansible@target:~$ uname -r
> 6.5.0-3-amd64
> ansible@target:~$ time sudo ./reproduce.bash
> Wed 25 Oct 2023 12:27:00 AM EEST starting round 1
> Wed 25 Oct 2023 12:27:24 AM EEST starting round 2
> Wed 25 Oct 2023 12:27:48 AM EEST starting round 3
> bug was reproduced after 3 tries
> 
> real    0m48.838s
> user    0m1.115s
> sys 0m45.530s
> 
> I also tested upstream tag v6.6-rc6:
> 
> ...
> + detected_version=6.6.0-rc6
> + '[' 6.6.0-rc6 '!=' 6.6.0-rc6 ']'
> + exec ssh target sudo ./reproduce.bash
> Wed 25 Oct 2023 12:37:16 AM EEST starting round 1
> Wed 25 Oct 2023 12:37:42 AM EEST starting round 2
> Wed 25 Oct 2023 12:38:10 AM EEST starting round 3
> Wed 25 Oct 2023 12:38:36 AM EEST starting round 4
> Wed 25 Oct 2023 12:39:01 AM EEST starting round 5
> Wed 25 Oct 2023 12:39:27 AM EEST starting round 6
> bug was reproduced after 6 tries
> 
> 
> For completeness, here is also the grub_set_default_version.bash script
> that I had to write to automate this (maybe these could be in debian
> wiki?):
> 
> #!/bin/bash
> set -x
> 
> version="$1"
> 
> idx=$(expr $(grep "menuentry " /boot/grub/grub.cfg | sed 1d | grep -n "'Debian GNU/Linux, with Linux $version'" | cut -d: -f1) - 1)
> exec sudo grub-set-default "1>$idx"
> 
> 
> 
> -Timo
> 
> 
> 


Re: [PATCH v2 2/2] drm/msm/dp: attach the DP subconnector property

2023-11-21 Thread Linux regression tracking (Thorsten Leemhuis)
On 21.11.23 19:50, Abhinav Kumar wrote:
> On 11/21/2023 9:57 AM, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 15.11.23 19:06, Abhinav Kumar wrote:
>>> On 11/15/2023 12:06 AM, Johan Hovold wrote:
>>>> On Wed, Oct 25, 2023 at 12:23:10PM +0300, Dmitry Baryshkov wrote:
>>>>> While developing and testing the commit bfcc3d8f94f4 ("drm/msm/dp:
>>>>> support setting the DP subconnector type") I had the patch [1] in my
>>>>> tree. I haven't noticed that it was a dependency for the commit in
>>>>> question. Mea culpa.
>>>>
>>>> This also broke boot on the Lenovo ThinkPad X13s.
>>>>
>>>> Would be nice to get this fixed ASAP so that further people don't have
>>>> to debug this known regression.
>>>
>>> I will queue this patch for -fixes right away.
>>
>> Thx. I noticed that this fix is still not in -next. I then investigated
>> and I found it was applied on Thursday last week here:
>> https://gitlab.freedesktop.org/drm/msm/-/commits/msm-fixes?ref_type=heads
>>
>> Makes me wonder: when will that patch go to a branch that is included in
>> -next? And when will it move on towards mainline?
> 
> This has been included in a pull request for 6.7-rc3 to the DRM tree and
> shall make it to -next from there.

Ahh, great, thx, I was slowly getting worried.

Ciao, Thorsten

>>>>> Since the patch has not landed yet (and even was not reviewed)
>>>>> and since one of the bridges erroneously uses USB connector type
>>>>> instead
>>>>> of DP, attach the property directly from the MSM DP driver.
>>>>>
>>>>> This fixes the following oops on DP HPD event:
>>>>>
>>>>>    drm_object_property_set_value
>>>>> (drivers/gpu/drm/drm_mode_object.c:288)
>>>>>    dp_display_process_hpd_high
>>>>> (drivers/gpu/drm/msm/dp/dp_display.c:402)
>>>>>    dp_hpd_plug_handle.isra.0 (drivers/gpu/drm/msm/dp/dp_display.c:604)
>>>>>    hpd_event_thread (drivers/gpu/drm/msm/dp/dp_display.c:1110)
>>>>>    kthread (kernel/kthread.c:388)
>>>>>    ret_from_fork (arch/arm64/kernel/entry.S:858)
>>>>
>>>> This only says where the oops happened, it doesn't necessarily in
>>>> itself
>>>> indicate an oops at all or that in this case it's a NULL pointer
>>>> dereference.
>>>>
>>>> On the X13s I'm seeing the NULL deref in a different path during boot,
>>>> and when this happens after a deferred probe (due to the panel lookup
>>>> mess) it hangs the machine, which makes it a bit of a pain to debug:
>>>>
>>>>  Unable to handle kernel NULL pointer dereference at virtual
>>>> address 0060
>>>>  ...
>>>>  CPU: 4 PID: 57 Comm: kworker/u16:1 Not tainted 6.7.0-rc1 #4
>>>>  Hardware name: Qualcomm QRD, BIOS
>>>> 6.0.220110.BOOT.MXF.1.1-00470-MAKENA-1 01/10/2022
>>>>  ...
>>>>  Call trace:
>>>>   drm_object_property_set_value+0x0/0x88 [drm]
>>>>   dp_display_process_hpd_high+0xa0/0x14c [msm]
>>>>   dp_hpd_plug_handle.constprop.0.isra.0+0x90/0x110 [msm]
>>>>   dp_bridge_atomic_enable+0x184/0x21c [msm]
>>>>   edp_bridge_atomic_enable+0x60/0x94 [msm]
>>>>   drm_atomic_bridge_chain_enable+0x54/0xc8 [drm]
>>>>   drm_atomic_helper_commit_modeset_enables+0x194/0x26c
>>>> [drm_kms_helper]
>>>>   msm_atomic_commit_tail+0x204/0x804 [msm]
>>>>   commit_tail+0xa4/0x18c [drm_kms_helper]
>>>>   drm_atomic_helper_commit+0x19c/0x1b0 [drm_kms_helper]
>>>>   drm_atomic_commit+0xa4/0x104 [drm]
>>>>   drm_client_modeset_commit_atomic+0x22c/0x298 [drm]
>>>>   drm_client_modeset_commit_locked+0x60/0x1c0 [drm]
>>>>   drm_client_modeset_commit+0x30/0x58 [drm]
>>>>   __drm_fb_helper_restore_fbdev_mode_unlocked+0xbc/0xfc
>>>> [drm_kms_helper]
>>>>   drm_fb_helper_set_par+0x30/0x4c [drm_kms_helper]
>>>>   fbcon_init+0x224/0x49c
>>>>   visual_init+0xb0/0x108
>>>>   do_bind_con_driver.isra.0+0x19c/0x38c
>>>>   do_take_over_console+0x140/0x1ec
>>>>   do_fbcon_takeover+0x6c/0xe4
>>>>   fbcon_fb_registered+0x180/0x1f0
>>>>   register_framebuffer+0x19c/0x228
>>>>   __drm_fb_helper_initial_config_and_unlock+0x2e8/0x4e8
>>>> [drm_kms_helper]
>>>>   drm_fb_helper_initial_config+0x3c/0x4c [drm_kms_helper]
>>>>   msm_fbdev_client_hotplug+0x84/0xcc [msm]
>>>>   drm_client_register+0x5c/0xa0 [drm]
>>>>   msm_fbdev_setup+0x94/0x148 [msm]
>>>>   msm_drm_bind+0x3d0/0x42c [msm]
>>>>   try_to_bring_up_aggregate_device+0x1ec/0x2f4
>>>>   __component_add+0xa8/0x194
>>>>   component_add+0x14/0x20
>>>>   dp_display_probe+0x278/0x41c [msm]
>>>>
>>>>> [1] https://patchwork.freedesktop.org/patch/30/
>>>>>
>>>>> Fixes: bfcc3d8f94f4 ("drm/msm/dp: support setting the DP subconnector
>>>>> type")
>>>>> Reviewed-by: Abhinav Kumar 
>>>>> Signed-off-by: Dmitry Baryshkov 
>>>>
>>>> Reviewed-by: Johan Hovold 
>>>> Tested-by: Johan Hovold 
>>>>
>>>
>>> Thanks !
>>>
>>>> Johan
> 
> 


Re: [PATCH v2 2/2] drm/msm/dp: attach the DP subconnector property

2023-11-21 Thread Linux regression tracking (Thorsten Leemhuis)
On 15.11.23 19:06, Abhinav Kumar wrote:
> On 11/15/2023 12:06 AM, Johan Hovold wrote:
>> On Wed, Oct 25, 2023 at 12:23:10PM +0300, Dmitry Baryshkov wrote:
>>> While developing and testing the commit bfcc3d8f94f4 ("drm/msm/dp:
>>> support setting the DP subconnector type") I had the patch [1] in my
>>> tree. I haven't noticed that it was a dependency for the commit in
>>> question. Mea culpa.
>>
>> This also broke boot on the Lenovo ThinkPad X13s.
>>
>> Would be nice to get this fixed ASAP so that further people don't have
>> to debug this known regression.
> 
> I will queue this patch for -fixes right away.

Thx. I noticed that this fix is still not in -next. I then investigated
and I found it was applied on Thursday last week here:
https://gitlab.freedesktop.org/drm/msm/-/commits/msm-fixes?ref_type=heads

Makes me wonder: when will that patch go to a branch that is included in
-next? And when will it move on towards mainline?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

>>> Since the patch has not landed yet (and even was not reviewed)
>>> and since one of the bridges erroneously uses USB connector type instead
>>> of DP, attach the property directly from the MSM DP driver.
>>>
>>> This fixes the following oops on DP HPD event:
>>>
>>>   drm_object_property_set_value (drivers/gpu/drm/drm_mode_object.c:288)
>>>   dp_display_process_hpd_high (drivers/gpu/drm/msm/dp/dp_display.c:402)
>>>   dp_hpd_plug_handle.isra.0 (drivers/gpu/drm/msm/dp/dp_display.c:604)
>>>   hpd_event_thread (drivers/gpu/drm/msm/dp/dp_display.c:1110)
>>>   kthread (kernel/kthread.c:388)
>>>   ret_from_fork (arch/arm64/kernel/entry.S:858)
>>
>> This only says where the oops happened, it doesn't necessarily in itself
>> indicate an oops at all or that in this case it's a NULL pointer
>> dereference.
>>
>> On the X13s I'm seeing the NULL deref in a different path during boot,
>> and when this happens after a deferred probe (due to the panel lookup
>> mess) it hangs the machine, which makes it a bit of a pain to debug:
>>
>>     Unable to handle kernel NULL pointer dereference at virtual
>> address 0060
>>     ...
>>     CPU: 4 PID: 57 Comm: kworker/u16:1 Not tainted 6.7.0-rc1 #4
>>     Hardware name: Qualcomm QRD, BIOS
>> 6.0.220110.BOOT.MXF.1.1-00470-MAKENA-1 01/10/2022
>>     ...
>>     Call trace:
>>  drm_object_property_set_value+0x0/0x88 [drm]
>>  dp_display_process_hpd_high+0xa0/0x14c [msm]
>>  dp_hpd_plug_handle.constprop.0.isra.0+0x90/0x110 [msm]
>>  dp_bridge_atomic_enable+0x184/0x21c [msm]
>>  edp_bridge_atomic_enable+0x60/0x94 [msm]
>>  drm_atomic_bridge_chain_enable+0x54/0xc8 [drm]
>>  drm_atomic_helper_commit_modeset_enables+0x194/0x26c
>> [drm_kms_helper]
>>  msm_atomic_commit_tail+0x204/0x804 [msm]
>>  commit_tail+0xa4/0x18c [drm_kms_helper]
>>  drm_atomic_helper_commit+0x19c/0x1b0 [drm_kms_helper]
>>  drm_atomic_commit+0xa4/0x104 [drm]
>>  drm_client_modeset_commit_atomic+0x22c/0x298 [drm]
>>  drm_client_modeset_commit_locked+0x60/0x1c0 [drm]
>>  drm_client_modeset_commit+0x30/0x58 [drm]
>>  __drm_fb_helper_restore_fbdev_mode_unlocked+0xbc/0xfc
>> [drm_kms_helper]
>>  drm_fb_helper_set_par+0x30/0x4c [drm_kms_helper]
>>  fbcon_init+0x224/0x49c
>>  visual_init+0xb0/0x108
>>  do_bind_con_driver.isra.0+0x19c/0x38c
>>  do_take_over_console+0x140/0x1ec
>>  do_fbcon_takeover+0x6c/0xe4
>>  fbcon_fb_registered+0x180/0x1f0
>>  register_framebuffer+0x19c/0x228
>>  __drm_fb_helper_initial_config_and_unlock+0x2e8/0x4e8
>> [drm_kms_helper]
>>  drm_fb_helper_initial_config+0x3c/0x4c [drm_kms_helper]
>>  msm_fbdev_client_hotplug+0x84/0xcc [msm]
>>  drm_client_register+0x5c/0xa0 [drm]
>>  msm_fbdev_setup+0x94/0x148 [msm]
>>  msm_drm_bind+0x3d0/0x42c [msm]
>>  try_to_bring_up_aggregate_device+0x1ec/0x2f4
>>  __component_add+0xa8/0x194
>>  component_add+0x14/0x20
>>  dp_display_probe+0x278/0x41c [msm]
>>
>>> [1] https://patchwork.freedesktop.org/patch/30/
>>>
>>> Fixes: bfcc3d8f94f4 ("drm/msm/dp: support setting the DP subconnector
>>> type")
>>> Reviewed-by: Abhinav Kumar 
>>> Signed-off-by: Dmitry Baryshkov 
>>
>> Reviewed-by: Johan Hovold 
>> Tested-by: Johan Hovold 
>>
> 
> Thanks !
> 
>> Johan


Re: [REGRESSION]: nouveau: Asynchronous wait on fence

2023-11-21 Thread Linux regression tracking (Thorsten Leemhuis)
On 15.11.23 07:19, Owen T. Heisler wrote:
> On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 28.10.23 04:46, Owen T. Heisler wrote:
>>> #regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
>>> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180
>>>
>>> ## Problem
>>>
>>> 1. Connect external display to DVI port on dock and run X with both
>>>     displays in use.
>>> 2. Wait hours or days.
>>> 3. Suddenly the secondary Nvidia-connected display turns off and X stops
>>>     responding to keyboard/mouse input. In *some* cases it is
>>> possible to
>>>     switch to a virtual TTY with Ctrl+Alt+Fn and log in there.
> 
>> You thus might want to check if the problem occurs with 6.6 -- and
>> ideally also check if reverting the culprit there fixes things for you.
> 
> The problem also occurs with v6.6.

You meanwhile might want to give 6.7-rc a try as well on the off chance
that it improves things, even if that is unlikely.

> Here is a decoded kernel log from an
> untainted kernel:
> 
> https://gitlab.freedesktop.org/drm/nouveau/uploads/c120faf09da46f9c74006df9f1d14442/async-wait-on-fence-180.log
> 
> The culprit commit does not revert cleanly on v6.6. I have not yet
> attempted to resolve the conflicts.
> 
> I have also updated the bug description at
> <https://gitlab.freedesktop.org/drm/nouveau/-/issues/180>.

Maybe one of the nouveau developers can take a quick look at
d386a4b54607cf and suggest a simple way to revert it in latest mainline.
Maybe just removing the main chunk of code that is added is all that it
takes.

Ciao, Thorsten


Re: Radeon regression in 6.6 kernel

2023-11-19 Thread Linux regression tracking (Thorsten Leemhuis)
On 19.11.23 14:24, Bagas Sanjaya wrote:
> On Sun, Nov 19, 2023 at 04:47:01PM +1000, Dave Airlie wrote:
>>> On 12.11.23 01:46, Phillip Susi wrote:
>>>> I had been testing some things on a post 6.6-rc5 kernel for a week or
>>>> two and then when I pulled to a post 6.6 release kernel, I found that
>>>> system suspend was broken.  It seems that the radeon driver failed to
>>>> suspend, leaving the display dead, the wayland display server hung, and
>>>> the system still running.  I have been trying to bisect it for the last
>>>> few days and have only been able to narrow it down to the following 3
>>>> commits:
>>>>
>>>> There are only 'skip'ped commits left to test.
>>>> The first bad commit could be any of:
>>>> 56e449603f0ac580700621a356d35d5716a62ce5
>>>> c07bf1636f0005f9eb7956404490672286ea59d3
>>>> b70438004a14f4d0f9890b3297cd66248728546c
>>>> We cannot bisect more!
>>>
>>> Hmm, not a single reply from the amdgpu folks. Wondering how we can
>>> encourage them to look into this.
>>>
>>> Phillip, reporting issues by mail should still work, but you might have
>>> more luck here, as that's where the amdgpu developers afaics prefer to
>>> track bugs:
>>> https://gitlab.freedesktop.org/drm/amd/-/issues
>>>
>>> When you file an issue there, please mention it here.
>>>
>>> Furthermore it might help if you could verify if 6.7-rc1 (or rc2, which
>>> comes out later today) or 6.6.2-rc1 improve things.

BTW, ignore the "6.6.2-rc1" here, I misunderstood one detail earlier. Sorry.

>> It would also be good to test if reverting any of these is possible or not.

Good point, sorry, forgot to mention that.

> Hi Dave,
> 
> AFAIK commit c07bf1636f0005 ("MAINTAINERS: Update the GPU Scheduler email")
> doesn't seem to have anything to do with this regression, as it doesn't
> change any amdgpu code that may introduce the regression.

Bagas, sorry for being blunt here, I know you mean well. But I feel the
need to say the following in the open, as this otherwise falls back on
me and regression tracking.

Stating the above is not very helpful, as Dave for sure will know.
Telling Phillip that he likely can skip that commit might have been
something different. But I guess even for most users that are able to do
a bisection it's obvious and maybe not worth pointing out.

Ciao, Thorsten


Re: Radeon regression in 6.6 kernel

2023-11-18 Thread Linux regression tracking (Thorsten Leemhuis)
Lo!

On 12.11.23 01:46, Phillip Susi wrote:
> I had been testing some things on a post 6.6-rc5 kernel for a week or
> two and then when I pulled to a post 6.6 release kernel, I found that
> system suspend was broken.  It seems that the radeon driver failed to
> suspend, leaving the display dead, the wayland display server hung, and
> the system still running.  I have been trying to bisect it for the last
> few days and have only been able to narrow it down to the following 3
> commits:
> 
> There are only 'skip'ped commits left to test.
> The first bad commit could be any of:
> 56e449603f0ac580700621a356d35d5716a62ce5
> c07bf1636f0005f9eb7956404490672286ea59d3
> b70438004a14f4d0f9890b3297cd66248728546c
> We cannot bisect more!

Hmm, not a single reply from the amdgpu folks. Wondering how we can
encourage them to look into this.

Phillip, reporting issues by mail should still work, but you might have
more luck here, as that's where the amdgpu developers afaics prefer to
track bugs:
https://gitlab.freedesktop.org/drm/amd/-/issues

When you file an issue there, please mention it here.

Furthermore it might help if you could verify if 6.7-rc1 (or rc2, which
comes out later today) or 6.6.2-rc1 improve things.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke


> It appears that there was a late merge in the 6.6 window that originally
> forked from the -rc2, as many of the later commits that I bisected had
> that version number.
> 
> I couldn't get it more narrowed down because I had to skip the
> surrounding commits because they wouldn't even boot up to a gui desktop,
> let alone try to suspend.
> 
> When system suspend fails, I find the following in my syslog after I
> have to magic-sysrq reboot because the display is dead:
> 
> Nov 11 18:44:39 faldara kernel: PM: suspend entry (deep)
> Nov 11 18:44:39 faldara kernel: Filesystems sync: 0.035 seconds
> Nov 11 18:44:40 faldara kernel: Freezing user space processes
> Nov 11 18:44:40 faldara kernel: Freezing user space processes completed (elapsed 0.001 seconds)
> Nov 11 18:44:40 faldara kernel: OOM killer disabled.
> Nov 11 18:44:40 faldara kernel: Freezing remaining freezable tasks
> Nov 11 18:44:40 faldara kernel: Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> Nov 11 18:44:40 faldara kernel: printk: Suspending console(s) (use no_console_suspend to debug)
> Nov 11 18:44:40 faldara kernel: serial 00:01: disabled
> Nov 11 18:44:40 faldara kernel: e1000e: EEE TX LPI TIMER: 0011
> Nov 11 18:44:40 faldara kernel: sd 4:0:0:0: [sdb] Synchronizing SCSI cache
> Nov 11 18:44:40 faldara kernel: sd 1:0:0:0: [sda] Synchronizing SCSI cache
> Nov 11 18:44:40 faldara kernel: sd 5:0:0:0: [sdc] Synchronizing SCSI cache
> Nov 11 18:44:40 faldara kernel: sd 4:0:0:0: [sdb] Stopping disk
> Nov 11 18:44:40 faldara kernel: sd 1:0:0:0: [sda] Stopping disk
> Nov 11 18:44:40 faldara kernel: sd 5:0:0:0: [sdc] Stopping disk
> Nov 11 18:44:40 faldara kernel: amdgpu: Move buffer fallback to memcpy unavailable
> Nov 11 18:44:40 faldara kernel: [TTM] Buffer eviction failed
> Nov 11 18:44:40 faldara kernel: [drm] evicting device resources failed
> Nov 11 18:44:40 faldara kernel: amdgpu :03:00.0: PM: pci_pm_suspend(): amdgpu_pmops_suspend+0x0/0x80 [amdgpu] returns -19
> Nov 11 18:44:40 faldara kernel: amdgpu :03:00.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x170 returns -19
> Nov 11 18:44:40 faldara kernel: amdgpu :03:00.0: PM: failed to suspend async: error -19
> Nov 11 18:44:40 faldara kernel: PM: Some devices failed to suspend, or early wake event detected
> Nov 11 18:44:40 faldara kernel: xhci_hcd :06:00.0: xHC error in resume, USBSTS 0x401, Reinit
> Nov 11 18:44:40 faldara kernel: usb usb3: root hub lost power or was reset
> Nov 11 18:44:40 faldara kernel: usb usb4: root hub lost power or was reset
> Nov 11 18:44:40 faldara kernel: serial 00:01: activated
> Nov 11 18:44:40 faldara kernel: nvme nvme0: 4/0/0 default/read/poll queues
> Nov 11 18:44:40 faldara kernel: ata8: SATA link down (SStatus 0 SControl 300)
> Nov 11 18:44:40 faldara kernel: ata7: SATA link down (SStatus 0 SControl 300)
> Nov 11 18:44:40 faldara kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> Nov 11 18:44:40 faldara kernel: ata1: SATA link down (SStatus 4 SControl 300)
> Nov 11 18:44:40 faldara kernel: ata3: SATA link down (SStatus 4 SControl 300)
> Nov 11 18:44:40 faldara kernel: ata4.00: configured for UDMA/133
> Nov 11 18:44:40 faldara kernel: OOM killer enabled.
> Nov 11 18:44:40 faldara kernel: R
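
The "returns -19" and "error -19" values in the log above are negated errno codes. As a rough aid for reading such logs, here is a minimal Python sketch that pulls those codes out of PM messages and maps them to symbolic names. The lookup table is a small hand-picked excerpt of the generic Linux errno values, and the function name and regular expression are assumptions made for illustration, not part of any kernel tool.

```python
import re

# Hand-picked excerpt of generic Linux errno values (errno-base.h);
# extend as needed. This table is an assumption of the sketch.
LINUX_ERRNO = {5: "EIO", 12: "ENOMEM", 16: "EBUSY", 19: "ENODEV", 22: "EINVAL"}

# Matches e.g. "returns -19" or "error -19" in PM callback messages.
ERR_RE = re.compile(r"(?:returns|error) -(\d+)")


def decode_pm_errors(log_text):
    """Return the set of symbolic errno names found in the given log text."""
    names = set()
    for match in ERR_RE.finditer(log_text):
        code = int(match.group(1))
        # Unknown codes are reported numerically instead of being guessed.
        names.add(LINUX_ERRNO.get(code, f"errno {code}"))
    return names


sample = (
    "amdgpu :03:00.0: PM: pci_pm_suspend(): "
    "amdgpu_pmops_suspend+0x0/0x80 [amdgpu] returns -19\n"
    "amdgpu :03:00.0: PM: failed to suspend async: error -19\n"
)
print(decode_pm_errors(sample))
```

For the excerpt above this reports ENODEV ("No such device"), which only names the error the suspend callback returned; it says nothing about which of the three candidate commits is responsible.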

Re: [PATCH v2 2/2] drm/msm/dp: attach the DP subconnector property

2023-11-18 Thread Linux regression tracking #adding (Thorsten Leemhuis)
[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]


On 15.11.23 09:06, Johan Hovold wrote:
> On Wed, Oct 25, 2023 at 12:23:10PM +0300, Dmitry Baryshkov wrote:
>> While developing and testing the commit bfcc3d8f94f4 ("drm/msm/dp:
>> support setting the DP subconnector type") I had the patch [1] in my
>> tree. I haven't noticed that it was a dependency for the commit in
>> question. Mea culpa.
> This also broke boot on the Lenovo ThinkPad X13s.
> [...]

>> Fixes: bfcc3d8f94f4 ("drm/msm/dp: support setting the DP subconnector type")
>> Reviewed-by: Abhinav Kumar 
>> Signed-off-by: Dmitry Baryshkov 
> 
> Reviewed-by: Johan Hovold 
> Tested-by: Johan Hovold 


Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced bfcc3d8f94f4
#regzbot title drm/msm/dp: boot broken on the Lenovo ThinkPad X13s and
some other machines
#regzbot fix: drm/msm/dp: attach the DP subconnector property
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
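
The "#regzbot" lines in mails like the one above are plain-text commands embedded in the message body. The following Python sketch shows one way such lines could be collected from a mail; it is not regzbot's actual parser. The regular expression, the function name, and the decision to skip quoted lines and ignore wrapped continuation lines are all assumptions of this example.

```python
import re

# "#regzbot", a command word (optionally prefixed by "^" and optionally
# followed by ":"), then an optional argument on the same line.
CMD_RE = re.compile(r"^#regzbot\s+(\^?[\w-]+):?\s*(.*)$")


def parse_regzbot(body):
    """Return (command, argument) pairs from unquoted lines of a mail body."""
    commands = []
    for line in body.splitlines():
        if line.startswith(">"):
            continue  # this sketch ignores commands inside quoted text
        match = CMD_RE.match(line.strip())
        if match:
            commands.append((match.group(1), match.group(2).strip()))
    return commands


mail = """\
> #regzbot ^introduced: 0b62af28f249b9
#regzbot ^introduced bfcc3d8f94f4
#regzbot fix: drm/msm/dp: attach the DP subconnector property
#regzbot ignore-activity
"""
for command, argument in parse_regzbot(mail):
    print(command, "|", argument)
```

A title that wraps onto a second line, as in the mail above, would be cut at the line break by this sketch; handling that would need extra logic.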


Re: mainline build failure due to 7966f319c66d ("drm/amd/display: Introduce DML2")

2023-11-12 Thread Linux regression tracking #update (Thorsten Leemhuis)
[TLDR: This mail is primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 04.11.23 10:42, Sudip Mukherjee wrote:
> On Thu, 2 Nov 2023 at 22:53, Alex Deucher  wrote:
>> On Thu, Nov 2, 2023 at 1:07 PM Sudip Mukherjee
>>  wrote:
>>> On Thu, 2 Nov 2023 at 16:52, Alex Deucher  wrote:
>>>> On Thu, Nov 2, 2023 at 5:32 AM Sudip Mukherjee (Codethink)
>>>>  wrote:
>>
>> Should be fixed with Nathan's patch:
>> https://patchwork.freedesktop.org/patch/565675/
> 
> Yes, it does. Thanks.
> 
> Tested-by: Sudip Mukherjee 

#regzbot fix: 6740ec97bcdbe9
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.



Re: [Nouveau] Fwd: System (Xeon Nvidia) hangs at boot terminal after kernel 6.4.7

2023-11-01 Thread Linux regression tracking #update (Thorsten Leemhuis)
[TLDR: This mail is primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 10.08.23 06:19, Thorsten Leemhuis wrote:
> On 10.08.23 05:03, Bagas Sanjaya wrote:
>>
>> I notice a regression report on Bugzilla [1]. Quoting from it:
>>
>> [...]
>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=217776

#regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/255
#regzbot fix: 6eb4a83e612af65bab8492957cba
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.



Re: Blank screen on boot of Linux 6.5 and later on Lenovo ThinkPad L570

2023-10-25 Thread Linux regression tracking (Thorsten Leemhuis)
On 25.10.23 15:23, Huacai Chen wrote:
> On Wed, Oct 25, 2023 at 6:08 PM Thorsten Leemhuis
>  wrote:
>>
>> Javier, Dave, Sima,
>>
>> On 23.10.23 00:54, Evan Preston wrote:
>>> On 2023-10-20 Fri 05:48pm, Huacai Chen wrote:
>>>> On Fri, Oct 20, 2023 at 5:35 PM Linux regression tracking (Thorsten
>>>> Leemhuis)  wrote:
>>>>> On 09.10.23 10:54, Huacai Chen wrote:
>>>>>> On Mon, Oct 9, 2023 at 4:45 PM Bagas Sanjaya  wrote:
>>>>>>> On Mon, Oct 09, 2023 at 09:27:02AM +0800, Huacai Chen wrote:
>>>>>>>> On Tue, Sep 26, 2023 at 10:31 PM Huacai Chen  wrote:
>>>>>>>>> On Tue, Sep 26, 2023 at 7:15 PM Linux regression tracking (Thorsten
>>>>>>>>> Leemhuis)  wrote:
>>>>>>>>>> On 13.09.23 14:02, Jaak Ristioja wrote:
>>>>>>>>>>>
>>>>>>>>>>> Upgrading to Linux 6.5 on a Lenovo ThinkPad L570 (Integrated Intel HD
>>>>>>>>>>> Graphics 620 (rev 02), Intel(R) Core(TM) i7-7500U) results in a blank
>>>>>>>>>>> screen after boot until the display manager starts... if it does start
>>>>>>>>>>> at all. Using the nomodeset kernel parameter seems to be a workaround.
>>>>>>>>>>>
>>>>>>>>>>> I've bisected this to commit 60aebc9559492cea6a9625f514a8041717e3a2e4
>>>>>>>>>>> ("drivers/firmware: Move sysfb_init() from device_initcall to
>>>>>>>>>>> subsys_initcall_sync").
>>>>>>>>>>
>>>>>>>> As confirmed by Jaak, disabling DRM_SIMPLEDRM makes things work fine
>>>>>>>> again. So I guess the reason:
>>>>>
>>>>> Well, this to me still looks a lot (please correct me if I'm wrong) like
>>>>> a regression that should be fixed, as DRM_SIMPLEDRM was enabled beforehand
>>>>> if I understood things correctly. Or is there a proper fix for this
>>>>> already in the works and I just missed this? Or is there some good
>>>>> reason why this won't/can't be fixed?
>>>>
>>>> DRM_SIMPLEDRM was enabled but it didn't work at all because there was
>>>> no corresponding platform device. Now DRM_SIMPLEDRM works but it has a
>>>> blank screen. Of course it is valuable to investigate further about
>>>> DRM_SIMPLEDRM on Jaak's machine, but that needs Jaak's effort because
>>>> I don't have the same machine.
>>
>> Side note: Huacai, have you tried working with Jaak to get down to the
>> real problem? Evan, might you be able to help out here?
> No, Jaak has not responded since he 'fixed' his problem by disabling
> SIMPLEDRM.

Yeah, understood, already suspected something like that, thx for confirming.

>> But I write this mail for a different reason:
>>
>>> I am having the same issue on a Lenovo Thinkpad P70 (Intel
>>> Corporation HD Graphics 530 (rev 06), Intel(R) Core(TM) i7-6700HQ).
>>> Upgrading from Linux 6.4.12 to 6.5 and later results in only a blank
>>> screen after boot and a rapidly flashing device-access-status
>>> indicator.
>>
>> This additional report makes me wonder if we should revert the culprit
>> (60aebc9559492c ("drivers/firmware: Move sysfb_init() from
>> device_initcall to subsys_initcall_sync") [v6.5-rc1]). But I guess that
>> might lead to regressions for some users? But the patch description says
>> that this is not a common configuration, so can we maybe get away with that?
> From my point of view, this is not a regression, 60aebc9559492c
> doesn't cause a problem, but exposes a problem.

From my understanding of Linus' stance in cases like this I think that
aspect doesn't matter. To for example quote
https://lore.kernel.org/lkml/CAHk-=wiP4K8DRJWsCo=20hn_6054xbamgkf2kpguzpb5ama...@mail.gmail.com/

"""
But it ended up exposing another problem, and as such caused a kernel
upgrade to fail for a user. So it got reverted.
"""

For other examples of his view see the bottom half of
https://docs.kernel.org/process/handling-regressions.html

We could bring Linus in to clarify if needed, but I for now didn't CC
him, as I hope we can solve this without h

Re: Blank screen on boot of Linux 6.5 and later on Lenovo ThinkPad L570

2023-10-20 Thread Linux regression tracking (Thorsten Leemhuis)
On 09.10.23 10:54, Huacai Chen wrote:
> On Mon, Oct 9, 2023 at 4:45 PM Bagas Sanjaya  wrote:
>> On Mon, Oct 09, 2023 at 09:27:02AM +0800, Huacai Chen wrote:
>>> On Tue, Sep 26, 2023 at 10:31 PM Huacai Chen  wrote:
>>>> On Tue, Sep 26, 2023 at 7:15 PM Linux regression tracking (Thorsten
>>>> Leemhuis)  wrote:
>>>>> On 13.09.23 14:02, Jaak Ristioja wrote:
>>>>>>
>>>>>> Upgrading to Linux 6.5 on a Lenovo ThinkPad L570 (Integrated Intel HD
>>>>>> Graphics 620 (rev 02), Intel(R) Core(TM) i7-7500U) results in a blank
>>>>>> screen after boot until the display manager starts... if it does start
>>>>>> at all. Using the nomodeset kernel parameter seems to be a workaround.
>>>>>>
>>>>>> I've bisected this to commit 60aebc9559492cea6a9625f514a8041717e3a2e4
>>>>>> ("drivers/firmware: Move sysfb_init() from device_initcall to
>>>>>> subsys_initcall_sync").
>>>>>
>>>>> Hmmm, no reaction since it was posted a while ago, unless I'm missing
>>>>> something.
>>>>>
>>>>> Huacai Chen, did you maybe miss this report? The problem is apparently
>>>>> caused by a commit of yours (that Javier applied), you hence should look
>>>>> into this.
>>>> I'm sorry but it looks very strange, could you please share your
>>>> config file?
>>> As confirmed by Jaak, disabling DRM_SIMPLEDRM makes things work fine
>>> again. So I guess the reason:
>>
>> Did Jaak reply privately? It should have been disclosed in public
>> ML here instead.
> Yes, he replied privately, and disabling DRM_SIMPLEDRM was suggested by me.

Well, this to me still looks a lot (please correct me if I'm wrong) like
a regression that should be fixed, as DRM_SIMPLEDRM was enabled beforehand
if I understood things correctly. Or is there a proper fix for this
already in the works and I just missed this? Or is there some good
reason why this won't/can't be fixed?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

>>> When SIMPLEDRM takes over the framebuffer, the screen is blank (don't
>>> know why). And before 60aebc9559492cea6a9625f ("drivers/firmware: Move
>>> sysfb_init() from device_initcall to subsys_initcall_sync") there is
>>> no platform device created for SIMPLEDRM at early stage, so it seems
>>> also "no problem".
>>
>> I don't understand above. You mean that after that commit the platform
>> device is also none, right?
> No. The SIMPLEDRM driver needs a platform device to work, and that
> commit makes the platform device created earlier. So, before that
> commit, SIMPLEDRM doesn't work, but the screen isn't blank; after that
> commit, SIMPLEDRM works, but the screen is blank.
> 
> Huacai
>>
>> Confused...
>>
>> --
>> An old man doll... just what I always wanted! - Clara
> 
> 


Re: [REGRESSION] Panic in gen8_ggtt_insert_entries() with v6.5

2023-09-29 Thread Linux regression tracking #update (Thorsten Leemhuis)
On 19.09.23 16:08, Bagas Sanjaya wrote:
> On Sat, Sep 02, 2023 at 06:14:12PM +0200, Oleksandr Natalenko wrote:
>>
>> Since v6.5 kernel the following HW:
>>
>> * Lenovo T460s laptop with Skylake GT2 [HD Graphics 520] (rev 07)
>> * Lenovo T490s laptop with WhiskeyLake-U GT2 [UHD Graphics 620] (rev 02)
> 
> #regzbot ^introduced: 0b62af28f249b9
> #regzbot title: gen8_ggtt_insert_entries() panic on Lenovo T14s (Tiger Lake) due to folio_batch() on shmem_sg_free_table()
> #regzbot link: https://gitlab.freedesktop.org/drm/intel/-/issues/9256

#regzbot fix: i915: Limit the length of an sg list to the requested length
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.




Re: Blank screen on boot of Linux 6.5 and later on Lenovo ThinkPad L570

2023-09-26 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

Hi, Thorsten here, the Linux kernel's regression tracker.

On 13.09.23 14:02, Jaak Ristioja wrote:
> 
> Upgrading to Linux 6.5 on a Lenovo ThinkPad L570 (Integrated Intel HD
> Graphics 620 (rev 02), Intel(R) Core(TM) i7-7500U) results in a blank
> screen after boot until the display manager starts... if it does start
> at all. Using the nomodeset kernel parameter seems to be a workaround.
> 
> I've bisected this to commit 60aebc9559492cea6a9625f514a8041717e3a2e4
> ("drivers/firmware: Move sysfb_init() from device_initcall to
> subsys_initcall_sync").

Hmmm, no reaction since it was posted a while ago, unless I'm missing
something.

Huacai Chen, did you maybe miss this report? The problem is apparently
caused by a commit of yours (that Javier applied), you hence should look
into this.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

> git bisect start
> # status: waiting for both good and bad commits
> # good: [6995e2de6891c724bfeb2db33d7b87775f913ad1] Linux 6.4
> git bisect good 6995e2de6891c724bfeb2db33d7b87775f913ad1
> # status: waiting for bad commit, 1 good commit known
> # bad: [2dde18cd1d8fac735875f2e4987f11817cc0bc2c] Linux 6.5
> git bisect bad 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
> # bad: [b775d6c5859affe00527cbe74263de05cfe6b9f9] Merge tag 'mips_6.5'
> of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
> git bisect bad b775d6c5859affe00527cbe74263de05cfe6b9f9
> # good: [3a8a670eeeaa40d87bd38a587438952741980c18] Merge tag
> 'net-next-6.5' of
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
> git bisect good 3a8a670eeeaa40d87bd38a587438952741980c18
> # bad: [188d3f80fc6d8451ab5e570becd6a7b2d3033023] drm/amdgpu: vcn_4_0
> set instance 0 init sched score to 1
> git bisect bad 188d3f80fc6d8451ab5e570becd6a7b2d3033023
> # good: [12fb1ad70d65edc3405884792d044fa79df7244f] drm/amdkfd: update
> process interrupt handling for debug events
> git bisect good 12fb1ad70d65edc3405884792d044fa79df7244f
> # bad: [9cc31938d4586f72eb8e0235ad9d9eb22496fcee] i915/perf: Drop the
> aging_tail logic in perf OA
> git bisect bad 9cc31938d4586f72eb8e0235ad9d9eb22496fcee
> # bad: [51d86ee5e07ccef85af04ee9850b0baa107999b6] drm/msm: Switch to
> fdinfo helper
> git bisect bad 51d86ee5e07ccef85af04ee9850b0baa107999b6
> # good: [bfdede3a58ea970333d77a05144a7bcec13cf515] drm/rockchip: cdn-dp:
> call drm_connector_update_edid_property() unconditionally
> git bisect good bfdede3a58ea970333d77a05144a7bcec13cf515
> # good: [123ee07ba5b7123e0ce0e0f9d64938026c16a2ce] drm: sun4i_tcon: use
> devm_clk_get_enabled in `sun4i_tcon_init_clocks`
> git bisect good 123ee07ba5b7123e0ce0e0f9d64938026c16a2ce
> # bad: [20d54e48d9c705091a025afff5839da2ea606f6b] fbdev: Rename
> fb_mem*() helpers
> git bisect bad 20d54e48d9c705091a025afff5839da2ea606f6b
> # bad: [728cb3f061e2b3a002fd76d91c2449b1497b6640] gpu: drm: bridge: No
> need to set device_driver owner
> git bisect bad 728cb3f061e2b3a002fd76d91c2449b1497b6640
> # bad: [0f1cb4d777281ca3360dbc8959befc488e0c327e] drm/ssd130x: Fix
> include guard name
> git bisect bad 0f1cb4d777281ca3360dbc8959befc488e0c327e
> # good: [0bd5bd65cd2e4d1335ea6c17cd2c8664decbc630] dt-bindings: display:
> simple: Add BOE EV121WXM-N10-1850 panel
> git bisect good 0bd5bd65cd2e4d1335ea6c17cd2c8664decbc630
> # bad: [60aebc9559492cea6a9625f514a8041717e3a2e4] drivers/firmware: Move
> sysfb_init() from device_initcall to subsys_initcall_sync
> git bisect bad 60aebc9559492cea6a9625f514a8041717e3a2e4
> # good: [8bb7c7bca5b70f3cd22d95b4d36029295c4274f6] drm/panel:
> panel-simple: Add BOE EV121WXM-N10-1850 panel support
> git bisect good 8bb7c7bca5b70f3cd22d95b4d36029295c4274f6
> # first bad commit: [60aebc9559492cea6a9625f514a8041717e3a2e4]
> drivers/firmware: Move sysfb_init() from device_initcall to
> subsys_initcall_sync
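
A bisect log like the one quoted above has a regular shape: each tested commit is recorded as a "# good:", "# bad:" or "# skip:" comment, and the result as "# first bad commit:". The Python sketch below summarizes such a log, including one pasted from a quoted mail. The function name and regular expressions are assumptions of this example, and it expects the 40-character hash to sit on the same line as the marker, as it does here.

```python
import re

STEP_RE = re.compile(r"^# (good|bad|skip): \[([0-9a-f]{40})\]")
FIRST_BAD_RE = re.compile(r"^# first bad commit: \[([0-9a-f]{40})\]")
QUOTE_RE = re.compile(r"^(?:>\s?)+")  # leading mail quote markers


def summarize_bisect_log(text):
    """Count good/bad/skip steps and return the first bad commit, if any."""
    counts = {"good": 0, "bad": 0, "skip": 0}
    first_bad = None
    for raw in text.splitlines():
        line = QUOTE_RE.sub("", raw)
        match = FIRST_BAD_RE.match(line)
        if match:
            first_bad = match.group(1)
            continue
        match = STEP_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts, first_bad


excerpt = """\
> # good: [6995e2de6891c724bfeb2db33d7b87775f913ad1] Linux 6.4
> git bisect good 6995e2de6891c724bfeb2db33d7b87775f913ad1
> # bad: [2dde18cd1d8fac735875f2e4987f11817cc0bc2c] Linux 6.5
> git bisect bad 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
> # first bad commit: [60aebc9559492cea6a9625f514a8041717e3a2e4]
"""
print(summarize_bisect_log(excerpt))
```

A log that ends with only skipped commits left, as in the Radeon suspend report earlier in this archive, would yield no first bad commit here.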


Re: mainline build failure due to 501126083855 ("fbdev/g364fb: Use fbdev I/O helpers")

2023-09-03 Thread Linux regression tracking #update (Thorsten Leemhuis)
[TLDR: This mail is primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 31.08.23 20:48, Sudip Mukherjee (Codethink) wrote:
> Hi All,
> 
> The latest mainline kernel branch fails to build mips jazz_defconfig with
> the error:
> 
> drivers/video/fbdev/g364fb.c:115:9: error: 'FB_DEFAULT_IOMEM_HELPERS' undeclared here (not in a function); did you mean 'FB_DEFAULT_IOMEM_OPS'?
>   115 | FB_DEFAULT_IOMEM_HELPERS,
>   | ^~~~
>   | FB_DEFAULT_IOMEM_OPS
> 
> 
> git bisect pointed to 501126083855 ("fbdev/g364fb: Use fbdev I/O helpers").
> 
> Reverting the commit has fixed the build failure.
> 
> I will be happy to test any patch or provide any extra log if needed.
> 
> #regzbot introduced: 5011260838551cefbf23d60b48c3243b6d5530a2
> 

#regzbot fix: 8df0f84c3bb921f5aa1036223dd932bbc7df6d
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.



Re: nouveau bug in linux/6.1.38-2

2023-08-31 Thread Linux regression tracking #update (Thorsten Leemhuis)
[TLDR: This mail is primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 04.08.23 14:02, Thorsten Leemhuis wrote:
> On 02.08.23 23:28, Olaf Skibbe wrote:
>> Dear Maintainers,
>>
>> Hereby I would like to report an apparent bug in the nouveau driver in
>> linux/6.1.38-2.
> 
> Thx for your report. Maybe your problem is caused by an incomplete
> backport. I CCed the maintainers for the drivers (and the regressions
> and the stable list), maybe one of them has an idea, as they know the
> driver.

#regzbot fix: 98e470dc73a9b3539e5a7a3c72f6b7c01c98
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.




Re: [REGRESSION] HDMI connector detection broken in 6.3 on Intel(R) Celeron(R) N3060 integrated graphics

2023-08-13 Thread Linux regression tracking (Thorsten Leemhuis)
On 11.08.23 20:10, Mikhail Rudenko wrote:
> On 2023-08-11 at 08:45 +02, Thorsten Leemhuis  wrote:
>> On 10.08.23 21:33, Mikhail Rudenko wrote:
>>> The following is a copy of an issue I posted to drm/i915 gitlab [1] two
>>> months ago. I repost it to the mailing lists in hope that it will help
>>> the right people pay attention to it.
>>
>> Thx for your report. Wonder why Dmitry (who authored a4e771729a51) or
>> Thomas (who committed it) didn't look into this, but maybe the i915
>> devs didn't forward the report to them.

For the record: they did, as Jani mentioned already. Sorry, should have
phrased this differently.

>> Let's see if these mails help. Just wondering: does reverting
>> a4e771729a51 from 6.5-rc5 or drm-tip help as well?
> 
> I've redone my tests with 6.5-rc5, and here are the results:
> (1) 6.5-rc5 -> still affected
> (2) 6.5-rc5 + revert a4e771729a51 -> not affected
> (3) 6.5-rc5 + two patches [1][2] suggested on i915 gitlab by @ideak -> not affected (!)
> 
> Should we somehow tell regzbot about (3)?

That's good to know, thx. But the more important things are:

* When will those be merged? They are not in next yet afaics, so it
might take some time to mainline them, especially at this point of the
devel cycle. Imre, could you try to prod the right people so that these
are ideally upstreamed sooner rather than later, as they fix a regression?
* If possible, they should ideally be tagged for backporting to 6.4, as
this is a regression from the 6.3 cycle.

But yes, let's tell regzbot that fixes are available, too:

#regzbot fix: drm/i915: Fix HPD polling, reenabling the output poll work
as needed

(for the record: that's the second of two patches apparently needed)

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

>> BTW, there was an earlier report about a problem with a4e771729a51 that
>> afaics was never addressed, but it might be unrelated.
>> https://lore.kernel.org/all/20230328023129.3596968-1-zhouzong...@kylinos.cn/
> [1] https://patchwork.freedesktop.org/patch/548590/?series=121050=1
> [2] https://patchwork.freedesktop.org/patch/548591/?series=121050=1



Re: [PATCH 2/2] drm/bridge: lt9611: Do not generate HFP/HBP/HSA and EOT packet

2023-07-26 Thread Linux regression tracking (Thorsten Leemhuis)
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

What's the status wrt this regression (caused by 8ddce13ae69 from
Marek)? It looks like things are stalled and the regression still is
unresolved, but I ask because I might be missing something.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 14.07.23 08:11, Amit Pundir wrote:
> On Thu, 13 Jul 2023 at 23:58, Marek Vasut  wrote:
>>
>> On 7/13/23 20:09, Abhinav Kumar wrote:
>>>
>>>
>>> On 7/12/2023 10:41 AM, Marek Vasut wrote:
>>>> On 7/9/23 03:03, Abhinav Kumar wrote:
>>>>>
>>>>>
>>>>> On 7/7/2023 1:47 AM, Neil Armstrong wrote:
>>>>>> On 07/07/2023 09:18, Neil Armstrong wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 06/07/2023 11:20, Amit Pundir wrote:
>>>>>>>> On Wed, 5 Jul 2023 at 11:09, Dmitry Baryshkov
>>>>>>>>  wrote:
>>>>>>>>>
>>>>>>>>> [Adding freedreno@ to cc list]
>>>>>>>>>
>>>>>>>>> On Wed, 5 Jul 2023 at 08:31, Jagan Teki
>>>>>>>>>  wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Amit,
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 5, 2023 at 10:15 AM Amit Pundir
>>>>>>>>>>  wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Marek,
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 5 Jul 2023 at 01:48, Marek Vasut  wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Do not generate the HS front and back porch gaps, the HSA gap and
>>>>>>>>>>>> EOT packet, as these packets are not required. This makes the
>>>>>>>>>>>> bridge
>>>>>>>>>>>> work with Samsung DSIM on i.MX8MM and i.MX8MP.
>>>>>>>>>>>
>>>>>>>>>>> This patch broke display on Dragonboard 845c (SDM845) devboard
>>>>>>>>>>> running
>>>>>>>>>>> AOSP. This is what I see
>>>>>>>>>>> https://people.linaro.org/~amit.pundir/db845c-userdebug/v6.5-broken-display/PXL_20230704_150156326.jpg.
>>>>>>>>>>> Reverting this patch fixes this regression for me.
>>>>>>>>>>
>>>>>>>>>> Might be msm dsi host require proper handling on these updated
>>>>>>>>>> mode_flags? did they?
>>>>>>>>>
>>>>>>>>> The msm DSI host supports those flags. Also, I'd like to point out
>>>>>>>>> that the patch didn't change the rest of the driver code. So even if
>>>>>>>>> drm/msm ignored some of the flags, it should not have caused the
>>>>>>>>> issue. Most likely the issue is on the lt9611 side. I'd suspect that
>>>>>>>>> additional programming is required to make it work with these flags.
>>>>>>>>
>>>>>>>> I spent some time today on smoke testing these flags (individually
>>>>>>>> and
>>>>>>>> in limited combination) on DB845c, to narrow down this breakage to
>>>>>>>> one
>>>>>>>> or more flag(s) triggering it. Here are my observations in limited
>>>>>>>> testing done so far.
>>>>>>>>
>>>>>>>> There is no regression with MIPI_DSI_MODE_NO_EOT_PACKET when enabled
>>>>>>>> alone and system boots to UI as usual.
>>>>>>>>
>>>>>>>> MIPI_DSI_MODE_VIDEO_NO_HFP always triggers the broken display as in
>>>>>>>> the
>>>>>>>> screenshot[1] shared earlier as well.
>>>>>>>>
>>>>>>>> Adding either of MIPI_DSI_MODE_VIDEO_NO_HSA and
>>>>>>>> MIPI_DSI_MODE_VIDEO_NO_HBP always results in no display, unless paired
>>>>>>>> with MIPI_DSI_MODE_VIDE

Re: [PATCH v2] drm/ast: report connection status on Display Port.

2023-07-10 Thread Linux regression tracking (Thorsten Leemhuis)
On 10.07.23 10:12, Jocelyn Falempe wrote:
> On 06/07/2023 15:03, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 06.07.23 11:58, Jocelyn Falempe wrote:
>>> Aspeed always reports the display port as "connected", because it
>>> doesn't set a .detect callback.
>>> Fix this by providing the proper detect callback for astdp and dp501.
>>>
>>> This also fixes the following regression:
>>> Since commit fae7d186403e ("drm/probe-helper: Default to 640x480 if no
>>>   EDID on DP")
>>> The default resolution is now 640x480 when no monitor is connected.
>>> But Aspeed graphics is mostly used in servers, where no monitor
>>> is attached. This also affects the remote BMC resolution to 640x480,
>>> which is inconvenient, and breaks the anaconda installer.
>>>
>>> v2: Add .detect callback to the dp/dp501 connector (Jani Nikula)
>>>
>>> Signed-off-by: Jocelyn Falempe 
>>
>> So if this "also fixes a regression" how about a Fixes: tag and a CC:
>> > also in all affected stable and longterm kernels?
> 
> In this case, the regression only affect one userspace program
> (anaconda),

That is (mostly) irrelevant when it comes to regressions.

> and the fix looks too risky to backport to all stable kernels.

Not sure, but I tend to think that decision would better be left to the
stable team. Each developer will have a different opinion about what's
too risky or not and they might be in the better position to judge what
they want for their trees. A "Fixes:" tag thus still seems appropriate
here; it will also tell downstream distros that might want to pick this up.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


RE: [PATCH] drm/hyperv: Fix a compilation issue because of not including screen_info.h

2023-07-09 Thread Michael Kelley (LINUX)
From: Sui Jingfeng  Sent: Sunday, July 9, 2023 3:05 AM
> 
> drivers/video/fbdev/hyperv_fb.c: In function 'hvfb_getmem':
> >> drivers/video/fbdev/hyperv_fb.c:1033:24: error: 'screen_info' undeclared (first use in this function)
>  1033 | base = screen_info.lfb_base;
>       | ^~~
> drivers/video/fbdev/hyperv_fb.c:1033:24: note: each undeclared identifier is reported only once for each function it appears in
> --
> drivers/gpu/drm/hyperv/hyperv_drm_drv.c: In function 'hyperv_setup_vram':
> >> drivers/gpu/drm/hyperv/hyperv_drm_drv.c:75:54: error: 'screen_info' undeclared (first use in this function)
>    75 | drm_aperture_remove_conflicting_framebuffers(screen_info.lfb_base,
>       | ^~~
> drivers/gpu/drm/hyperv/hyperv_drm_drv.c:75:54: note: each undeclared identifier is reported only once for each function it appears in
> 
> Reported-by: kernel test robot 
> Closes: https://lore.kernel.org/oe-kbuild-all/202307090823.nxnt8kk5-...@intel.com/
> Fixes: 81d2393485f0 ("fbdev/hyperv-fb: Do not set struct fb_info.apertures")
> Fixes: 8b0d13545b09 ("efi: Do not include <linux/screen_info.h> from EFI header")
> Signed-off-by: Sui Jingfeng 
> ---
>  drivers/gpu/drm/hyperv/hyperv_drm_drv.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> index a7d2c92d6c6a..8026118c6e03 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> @@ -7,6 +7,7 @@
>  #include 
>  #include 
>  #include 
> +#include <linux/screen_info.h>
> 
>  #include 
>  #include 
> --
> 2.25.1

Reviewed-by: Michael Kelley 


Re: [PATCH 2/2] drm/bridge: lt9611: Do not generate HFP/HBP/HSA and EOT packet

2023-07-08 Thread Linux regression tracking (Thorsten Leemhuis)
[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 05.07.23 06:45, Amit Pundir wrote:
> 
> On Wed, 5 Jul 2023 at 01:48, Marek Vasut  wrote:
>>
>> Do not generate the HS front and back porch gaps, the HSA gap and
>> EOT packet, as these packets are not required. This makes the bridge
>> work with Samsung DSIM on i.MX8MM and i.MX8MP.
> 
> This patch broke display on Dragonboard 845c (SDM845) devboard running
> AOSP. This is what I see
> https://people.linaro.org/~amit.pundir/db845c-userdebug/v6.5-broken-display/PXL_20230704_150156326.jpg.
> Reverting this patch fixes this regression for me.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 8ddce13ae69
#regzbot title drm/bridge: lt9611: Dragonboard 845c (SDM845) devboard
broken when running AOSP
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: [PATCH v2] drm/ast: report connection status on Display Port.

2023-07-06 Thread Linux regression tracking (Thorsten Leemhuis)
On 06.07.23 11:58, Jocelyn Falempe wrote:
> Aspeed always report the display port as "connected", because it
> doesn't set a .detect callback.
> Fix this by providing the proper detect callback for astdp and dp501.
> 
> This also fixes the following regression:
> Since commit fae7d186403e ("drm/probe-helper: Default to 640x480 if no
>  EDID on DP")
> The default resolution is now 640x480 when no monitor is connected.
> But Aspeed graphics is mostly used in servers, where no monitor
> is attached. This also affects the remote BMC resolution to 640x480,
> which is inconvenient, and breaks the anaconda installer.
> 
> v2: Add .detect callback to the dp/dp501 connector (Jani Nikula)
> 
> Signed-off-by: Jocelyn Falempe 

So if this "also fixes a regression" how about a Fixes: tag and a CC:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


Re: [PATCH 1/2] fbdev/offb: Update expected device name

2023-06-15 Thread Linux regression tracking (Thorsten Leemhuis)
On 16.04.23 14:34, Salvatore Bonaccorso wrote:
> 
> On Wed, Apr 12, 2023 at 11:55:08AM +0200, Cyril Brulebois wrote:
>> Since commit 241d2fb56a18 ("of: Make OF framebuffer device names unique"),
>> as spotted by Frédéric Bonnard, the historical "of-display" device is
>> gone: the updated logic creates "of-display.0" instead, then as many
>> "of-display.N" as required.
>>
>> This means that offb no longer finds the expected device, which prevents
>> the Debian Installer from setting up its interface, at least on ppc64el.
>>
>> It might be better to iterate on all possible nodes, but updating the
>> hardcoded device from "of-display" to "of-display.0" is confirmed to fix
>> the Debian Installer at the very least.
> [...]
> #regzbot ^introduced 241d2fb56a18
> #regzbot title: Open Firmware framebuffer cannot find of-display
> #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=217328
> #regzbot link: 
> https://lore.kernel.org/all/20230412095509.2196162-1-cy...@debamax.com/T/#m34493480243a2cad2ae359abfd9db5e755f41add
> #regzbot link: https://bugs.debian.org/1033058

No reply to my status inquiry[1] a few weeks ago, so I have to assume
nobody cares anymore. If somebody still cares, holler!

#regzbot inconclusive: no answer to a status inquiry
#regzbot ignore-activity

[1]
https://lore.kernel.org/lkml/d1aee7d3-05f6-0920-b8e1-4ed5cf3f9...@leemhuis.info/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


Re: [PATCH] Revert "drm/msm/dp: set self refresh aware based on PSR support"

2023-06-06 Thread Linux regression tracking #adding (Thorsten Leemhuis)



On 05.06.23 12:18, Johan Hovold wrote:
> On Mon, Jun 05, 2023 at 01:05:36PM +0300, Dmitry Baryshkov wrote:
>> On Mon, 5 Jun 2023 at 13:02, Johan Hovold  wrote:
> 
>>> Virtual terminals are still broken with 6.4-rc5 on the Lenovo ThinkPad
>>> X13s two weeks after I reported this, and there has been no indication
>>> of any progress in the other related thread:
>>>
>>> https://lore.kernel.org/lkml/zhyphnwodbxb-...@hovoldconsulting.com
>>>
>>> Seems like it is time to merge this revert to get this sorted.

BTW, thx for bringing this to my attention!

>>> Rob, Abhinav, Dmitry, can either of you merge this one and get it into
>>> 6.4-rc6?
>>
>> Rob sent the pull request few hours ago, see
>> https://lore.kernel.org/dri-devel/caf6aeguhujkfjra6ys36uyh0kur4hd16u1emqjo8toz3ifv...@mail.gmail.com/
> 
> Ok, so you guys went with the module parameter hack. Whatever. As long
> as the regression is finally fixed.

Yup. Let me tell regzbot about the fix:

#regzbot fix: drm/msm/dp: add module parameter for PSR
#regzbot ignore-activity

> Next time, some visibility into your process would be appreciated to
> avoid unnecessary work.

Yeah, that's something we IMHO sooner or later need to improve for all
of kernel development -- among others to give people that find existing
bug reports a chance to find patches that were posted or applied to
address the issue (and of course reporters also, like in this case).

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.



Re: [PATCH v3 11/13] drm/fb-helper: Fix single-probe color-format selection

2023-05-26 Thread Linux regression tracking #update (Thorsten Leemhuis)
[TLDR: This mail is primarily relevant for Linux regression tracking. A
change or fix related to the regression discussed in this thread was
posted or applied, but it did not use a Link: tag to point to the
report, as Linus and the documentation call for. Things happen, no
worries -- but now the regression tracking bot needs to be told manually
about the fix. See link in footer if these mails annoy you.]

On 14.05.23 14:10, Linux regression tracking #adding (Thorsten Leemhuis)
wrote:
> On 12.05.23 15:20, Linus Walleij wrote:
>> Sorry for late regression detection but this patch regresses
>> the Integrator AB IMPD-1 graphics, I bisected down to this
>> patch.
> 
> #regzbot ^introduced 37c90d589dc
> #regzbot title drm/fb-helper: downscaling apparently stopped to work
> with pl110_impd1
> #regzbot ignore-activity

#regzbot monitor:
https://lore.kernel.org/all/20230515092943.1401558-1-linus.wall...@linaro.org/
#regzbot fix: drm/pl111: Fix FB depth on IMPD-1 framebuffer
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.



Re: [PATCH 2/2] drm/ofdrm: Update expected device name

2023-05-22 Thread Linux regression tracking (Thorsten Leemhuis)
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Was a proper solution for the regression the initial mail in this thread
is about ever found? Doesn't look like it from here, but maybe I'm
missing something.

Reminder, the problem afaik is caused by 241d2fb56a ("of: Make OF
framebuffer device names unique") [merged for v6.2-rc8, authored by
Michal Suchanek; committed by Rob Herring].

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 24.04.23 11:35, Helge Deller wrote:
> On 4/24/23 11:07, Thomas Zimmermann wrote:
>> Am 24.04.23 um 09:33 schrieb Geert Uytterhoeven:
>>> On Wed, Apr 12, 2023 at 12:05 PM Cyril Brulebois 
>>> wrote:
>>>> Since commit 241d2fb56a18 ("of: Make OF framebuffer device names
>>>> unique"),
>>>> as spotted by Frédéric Bonnard, the historical "of-display" device is
>>>> gone: the updated logic creates "of-display.0" instead, then as many
>>>> "of-display.N" as required.
>>>>
>>>> This means that offb no longer finds the expected device, which
>>>> prevents
>>>> the Debian Installer from setting up its interface, at least on
>>>> ppc64el.
>>>>
>>>> Given the code similarity it is likely to affect ofdrm in the same way.
>>>>
>>>> It might be better to iterate on all possible nodes, but updating the
>>>> hardcoded device from "of-display" to "of-display.0" is likely to help
>>>> as a first step.
>>>>
>>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217328
>>>> Link: https://bugs.debian.org/1033058
>>>> Fixes: 241d2fb56a18 ("of: Make OF framebuffer device names unique")
>>>> Cc: sta...@vger.kernel.org # v6.2+
>>>> Signed-off-by: Cyril Brulebois 
>>>
>>> Thanks for your patch, which is now commit 3a9d8ea2539ebebd
>>> ("drm/ofdrm: Update expected device name") in fbdev/for-next.
>>>
>>>> --- a/drivers/gpu/drm/tiny/ofdrm.c
>>>> +++ b/drivers/gpu/drm/tiny/ofdrm.c
>>>> @@ -1390,7 +1390,7 @@ MODULE_DEVICE_TABLE(of, ofdrm_of_match_display);
>>>>
>>>>   static struct platform_driver ofdrm_platform_driver = {
>>>>  .driver = {
>>>> -   .name = "of-display",
>>>> +   .name = "of-display.0",
>>>>  .of_match_table = ofdrm_of_match_display,
>>>>  },
>>>>  .probe = ofdrm_probe,
>>>
>>> Same comment as for "[PATCH 1/2] fbdev/offb: Update expected device
>>> name".
>>>
>>> https://lore.kernel.org/r/camuhmdvgeeasmb4tauuqqgj-4+bbetwewyja+m9nyjv0bj_...@mail.gmail.com
>>
>> Sorry that I missed this patch. I agree that it's probably not
>> correct. At least in ofdrm, we want to be able to use multiple
>> framebuffers at the same time; a feature that has been broken by this
>> change.
> 
> Geert & Thomas, thanks for the review!
> 
> I've dropped both patches from fbdev tree for now.
> Would be great to find another good solution though, as it breaks the
> debian
> installer.
> 
> Helge


Re: [PATCH] drm/probe_helper: fix the warning reported when calling drm_kms_helper_poll_disable during suspend

2023-05-17 Thread Linux regression tracking (Thorsten Leemhuis)
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Dmitry, was any progress made to address this regression? Doesn't look
like it, but I strongly suspect I'm missing something, as I'm not really
sure if I properly understood this thread. It sounded a bit like
a4e771729a51 should be reverted for now until all
drm_kms_helper_poll_disable() calls have been verified. Is that right?
Or did somebody already verify and fix all of them with bugs?
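
For readers following along, the failure mode described in the quoted
thread below can be modelled in a few lines of user-space C. This is only
a sketch under stated assumptions: every mock_* name is hypothetical and
invented for illustration, the "cancel" helper merely reports whether the
kernel would have warned about an unset work->func, and the guard shown is
one possible way to avoid the warning, not necessarily what the patch
under discussion does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a work item; func stays NULL until init. */
struct mock_work {
	void (*func)(struct mock_work *work);
};

/* Hypothetical stand-in for the polling state kept per DRM device. */
struct mock_drm_device {
	bool poll_enabled;
	struct mock_work output_poll_work;
};

static void mock_poll_execute(struct mock_work *work)
{
	(void)work;
}

/* Models drm_kms_helper_poll_init(): sets up the work item. */
void mock_poll_init(struct mock_drm_device *dev)
{
	dev->output_poll_work.func = mock_poll_execute;
	dev->poll_enabled = true;
}

/* Returns false where the kernel would warn about an unset work->func. */
bool mock_cancel_work_sync(struct mock_work *work)
{
	return work->func != NULL;
}

/* Unguarded variant: always cancels, even if init never ran. */
bool mock_poll_disable_unguarded(struct mock_drm_device *dev)
{
	dev->poll_enabled = false;
	return mock_cancel_work_sync(&dev->output_poll_work);
}

/* Guarded variant (assumption): skip the cancel if polling was never on. */
bool mock_poll_disable_guarded(struct mock_drm_device *dev)
{
	if (!dev->poll_enabled)
		return true;
	dev->poll_enabled = false;
	return mock_cancel_work_sync(&dev->output_poll_work);
}
```

A driver that never calls the init helper hits the warning path only in
the unguarded variant; a driver that did initialise polling behaves the
same in both.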

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 28.04.23 03:17, zongmin zhou wrote:
> On Wed, 2023-04-26 at 16:10 +0300, Dmitry Baryshkov wrote:
>> On Wed, 26 Apr 2023 at 12:09, zongmin zhou 
>> wrote:
>>> On Sun, 2023-04-23 at 22:51 +0200, Janne Grunau wrote:
>>>> On 2023-04-20 23:07:01 +0300, Dmitry Baryshkov wrote:
>>>>> On Thu, 20 Apr 2023 at 23:01, Janne Grunau 
>>>>> wrote:
>>>>>>
>>>>>> On 2023-03-28 10:31:29 +0800, Zongmin Zhou wrote:
>>>>>>> When drivers call drm_kms_helper_poll_disable from
>>>>>>> their device suspend implementation without enabled output
>>>>>>> polling before,
>>>>>>> following warning will be reported, due to work->func not being
>>>>>>> initialized:
>>>>>>
>>>>>> we see the same warning with the work-in-progress kms driver for
>>>>>> apple silicon SoCs. The connectors do not need to be polled, so the
>>>>>> driver never calls drm_kms_helper_poll_init().
>>>>>>
>>>>>>> [   55.141361] WARNING: CPU: 3 PID: 372 at
>>>>>>> kernel/workqueue.c:3066 __flush_work+0x22f/0x240
>>>>>>> [   55.141382] Modules linked in: nls_iso8859_1
>>>>>>> snd_hda_codec_generic ledtrig_audio snd_hda_intel
>>>>>>> snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec
>>>>>>> snd_hda_core
>>>>>>> snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event
>>>>>>> snd_rawmidi
>>>>>>> snd_seq intel_rapl_msr intel_rapl_common bochs
>>>>>>> drm_vram_helper
>>>>>>> drm_ttm_helper snd_seq_device nfit ttm crct10dif_pclmul
>>>>>>> snd_timer ghash_clmulni_intel binfmt_misc sha512_ssse3
>>>>>>> aesni_intel drm_kms_helper joydev input_leds syscopyarea
>>>>>>> crypto_simd snd cryptd sysfillrect sysimgblt mac_hid
>>>>>>> serio_raw
>>>>>>> soundcore qemu_fw_cfg sch_fq_codel msr parport_pc ppdev lp
>>>>>>> parport drm ramoops reed_solomon pstore_blk pstore_zone
>>>>>>> efi_pstore virtio_rng ip_tables x_tables autofs4
>>>>>>> hid_generic
>>>>>>> usbhid hid ahci virtio_net i2c_i801 crc32_pclmul psmouse
>>>>>>> virtio_scsi libahci i2c_smbus lpc_ich xhci_pci net_failover
>>>>>>> virtio_blk xhci_pci_renesas failover
>>>>>>> [   55.141430] CPU: 3 PID: 372 Comm: kworker/u16:9 Not
>>>>>>> tainted
>>>>>>> 6.2.0-rc6+ #16
>>>>>>> [   55.141433] Hardware name: QEMU Standard PC (Q35 + ICH9,
>>>>>>> 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org
>>>>>>> 04/01/2014
>>>>>>> [   55.141435] Workqueue: events_unbound async_run_entry_fn
>>>>>>> [   55.141441] RIP: 0010:__flush_work+0x22f/0x240
>>>>>>> [   55.141444] Code: 8b 43 28 48 8b 53 30 89 c1 e9 f9 fe ff
>>>>>>> ff
>>>>>>> 4c 89 f7 e8 b5 95 d9 00 e8 00 53 08 00 45 31 ff e9 11 ff ff
>>>>>>> ff
>>>>>>> 0f 0b e9 0a ff ff ff <0f> 0b 45 31 ff e9 00 ff ff ff e8 e2
>>>>>>> 54
>>>>>>> d8 00 66 90 90 90 90 90 90
>>>>>>> [   55.141446] RSP: 0018:ff59221940833c18 EFLAGS: 00010246
>>>>>>> [   55.141449] RAX:  RBX: 
>>>>>>> RCX:
>>>>>>> 9b72bcbe
>>>>>>> [   55.141450] RDX: 0001 RSI: 0001
>>>>>>> RDI:
>>>>>

RE: [PATCH v2 RESEND 4/7] swiotlb: Dynamically allocated bounce buffers

2023-05-15 Thread Michael Kelley (LINUX)
From: Petr Tesarik  Sent: Tuesday, May 9, 2023 2:18 AM
> 
> The software IO TLB was designed with the assumption that it is not
> used much, especially on 64-bit systems, so a small fixed memory
> area (currently 64 MiB) is sufficient to handle the few cases which
> still require a bounce buffer. However, these cases are not so rare
> in some circumstances.
> 
> First, if SEV is active, all DMA must be done through shared
> unencrypted pages, and SWIOTLB is used to make this happen without
> changing device drivers. The software IO TLB size is increased to 6%
> of total memory in sev_setup_arch(), but that is more of an
> approximation. The actual requirements may vary depending on which
> drivers are used and the amount of I/O.

FWIW, I don't think the approach you have implemented here will be
practical to use for CoCo VMs (SEV, TDX, whatever else).  The problem
is that dma_direct_alloc_pages() and dma_direct_free_pages() must
call dma_set_decrypted() and dma_set_encrypted(), respectively.  In CoCo
VMs, these calls are expensive because they require a hypercall to the host,
and the operation on the host isn't trivial either.  I haven't measured the
overhead, but doing a hypercall on every DMA map operation and on
every unmap operation has long been something we thought we must
avoid.  The fixed swiotlb bounce buffer space solves this problem by
doing set_decrypted() in batch at boot time, and never
doing set_encrypted().

In Microsoft's first implementation of bounce buffering for SEV-SNP VMs,
we created custom bounce buffer code separate from swiotlb.  This code
did something similar to what you've done, but maintained a per-device pool of allocated
buffers that could be reused, rather than freeing the memory (and marking
the memory encrypted again) on every DMA unmap operation.  (The pool
was actually per-VMBus channel, but VMBus channels are per-device, so
the effect was the same.)  The reusable pool avoided most of the calls to
set_decrypted()/set_encrypted() and made it practical from a performance
standpoint.  But of course, the pool could grow arbitrarily large, so there
was additional complexity to decay and trim the pool size.  LKML feedback
early on was to use swiotlb instead, which made sense, but at the cost of
needing to figure out the appropriate fixed size of the swiotlb, and likely
over-provisioning to avoid running out of bounce buffer space.
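
The reusable-pool idea described above can be sketched in user-space C as
follows. This is an illustration only, under stated assumptions: the
mock_set_decrypted()/mock_set_encrypted() helpers are hypothetical
stand-ins that just count calls, malloc() replaces the page allocator, and
the pool is a fixed-size array rather than whatever structure the original
VMBus code used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BOUNCE_BUF_SIZE 4096	/* assumed buffer size for the sketch */
#define POOL_MAX 8		/* assumed cap on cached buffers */

/* Counters standing in for the expensive hypercall-backed transitions. */
unsigned long nr_decrypt_calls;
unsigned long nr_encrypt_calls;

void mock_set_decrypted(void *buf) { (void)buf; nr_decrypt_calls++; }
void mock_set_encrypted(void *buf) { (void)buf; nr_encrypt_calls++; }

/* Per-device cache of buffers that are already marked decrypted. */
struct bounce_pool {
	void *free_bufs[POOL_MAX];
	size_t nr_free;
};

/* Map path: reuse a cached buffer if possible, else pay for a new one. */
void *bounce_get(struct bounce_pool *pool)
{
	void *buf;

	if (pool->nr_free > 0)
		return pool->free_bufs[--pool->nr_free];

	buf = malloc(BOUNCE_BUF_SIZE);
	if (buf)
		mock_set_decrypted(buf);
	return buf;
}

/* Unmap path: keep the buffer decrypted and cached while there is room. */
void bounce_put(struct bounce_pool *pool, void *buf)
{
	if (pool->nr_free < POOL_MAX) {
		pool->free_bufs[pool->nr_free++] = buf;
		return;
	}
	mock_set_encrypted(buf);
	free(buf);
}

/* Trim step: shrink the cache to at most 'keep' buffers. */
void bounce_pool_trim(struct bounce_pool *pool, size_t keep)
{
	while (pool->nr_free > keep) {
		void *buf = pool->free_bufs[--pool->nr_free];

		mock_set_encrypted(buf);
		free(buf);
	}
}
```

In this model a get/put/get sequence on one pool pays for a single
decrypt transition and no encrypt transition until the pool is trimmed,
which is the saving the paragraph above attributes to buffer reuse.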

Now we're considering again a more dynamic approach, which is good, but
we're encountering the same problems.

See 
https://lore.kernel.org/linux-hyperv/20210228150315.2552437-1-ltyker...@gmail.com/
for this historical example.

Michael

> 
> Second, some embedded devices have very little RAM, so 64 MiB is not
> negligible. Sadly, these are exactly the devices that also often
> need a software IO TLB. Although minimum swiotlb size can be found
> empirically by extensive testing, it would be easier to allocate a
> small swiotlb at boot and let it grow on demand.
> 
> Growing the SWIOTLB data structures at run time is impossible. The
> whole SWIOTLB region is contiguous in physical memory to allow
> combining adjacent slots and also to ensure that alignment
> constraints can be met. The SWIOTLB is too big for the buddy
> allocator (cf. MAX_ORDER). More importantly, even if a new SWIOTLB
> could be allocated (e.g. from CMA), it cannot be extended in-place
> (because surrounding pages may be already allocated for other
> purposes), and there is no mechanism for relocating already mapped
> bounce buffers: The DMA API gets only the address of a buffer, and
> the implementation (direct or IOMMU) checks whether it belongs to
> the software IO TLB.
> 
> It is possible to allocate multiple smaller struct io_tlb_mem
> instances. However, they would have to be stored in a non-constant
> container (list or tree), which needs synchronization between
> readers and writers, creating contention in a hot path for all
> devices, not only those which need software IO TLB.
> 
> Another option is to allocate a very large SWIOTLB at boot, but
> allow migrating pages to other users (like CMA does). This approach
> might work, but there are many open issues:
> 
> 1. After a page is migrated away from SWIOTLB, it must not be used
>    as a (direct) DMA buffer. Otherwise SWIOTLB code would have to
>    check which pages have been migrated to determine whether a given
>    buffer address belongs to a bounce buffer or not, effectively
>    introducing all the issues of multiple SWIOTLB instances.
> 
> 2. Unlike SWIOTLB, CMA cannot be used from atomic contexts, and that
>    for many different reasons. This might be changed in theory, but
>    it would take a lot of investigation and time. OTOH improvement
>    to the SWIOTLB is needed now.
> 
> 3. If SWIOTLB is implemented separately from CMA and not as its
>p

Re: [PATCH v3 11/13] drm/fb-helper: Fix single-probe color-format selection

2023-05-14 Thread Linux regression tracking #adding (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 12.05.23 15:20, Linus Walleij wrote:
> Sorry for late regression detection but this patch regresses
> the Integrator AB IMPD-1 graphics, I bisected down to this
> patch.
> 
> On Mon, Jan 2, 2023 at 12:30 PM Thomas Zimmermann  wrote:
> [...]
> Before this patch:
> 
> [drm] Initialized pl111 1.0.0 20170317 for c100.display on minor 0
> drm-clcd-pl111 c100.display: [drm] requested bpp 16, scaled depth down to 15
> drm-clcd-pl111 c100.display: enable IM-PD1 CLCD connectors
> Console: switching to colour frame buffer device 80x30
> drm-clcd-pl111 c100.display: [drm] fb0: pl111drmfb frame buffer device
> 
> After this patch:
> 
> [drm] Initialized pl111 1.0.0 20170317 for c100.display on minor 0
> drm-clcd-pl111 c100.display: [drm] bpp/depth value of 16/16 not supported
> drm-clcd-pl111 c100.display: [drm] No compatible format found
> drm-clcd-pl111 c100.display: [drm] *ERROR* fbdev: Failed to setup
> generic emulation (ret=-12)
> 
> It seems the bpp downscaling stopped to work? [...]

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 37c90d589dc
#regzbot title drm/fb-helper: downscaling apparently stopped to work
with pl110_impd1
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.



Re: Fwd: Kernel 5.11 crashes when it boots, it produces black screen.

2023-05-10 Thread Linux regression tracking (Thorsten Leemhuis)
Hi!

On 10.05.23 10:26, Bagas Sanjaya wrote:
> 
> I noticed a regression report on Bugzilla ([1]). As many developers don't
> have a look on it, I decided to forward it by email. See the report
> for the full thread.
> 
> Quoting from the report:
> 
>>  Azamat S. Kalimoulline 2021-04-06 15:45:08 UTC
>>
>> Same as in https://bugzilla.kernel.org/show_bug.cgi?id=212133, but not 
>> StoneyRidge related. I have same issue in 5.11.9 kernel, but on Renoir 
>> architecture. I have AMD Ryzen 5 PRO 4650U with Radeon Graphics. Same stuck 
>> on loading initial ramdisk. modprobe.blacklist=amdgpu 3` didn't help to 
>> boot. Same stuck. Also iommu=off and acpi=off too. 5.10.26 boots fine. I 
>> boot via efi and I have no option boot without it.
> 
> Azamat, can you try reproducing this issue on latest mainline?
>
> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=212579

Bagas, thx for all your help with regression tracking, much appreciated
(side note, as I'm curious for a while already: what is your motivation?
Just want to help? But whatever, any help is great!).

That being said: I'm not sure if I like what you did in this particular
case, as developers might start getting annoyed by regression tracking
if we throw too many bug reports of lesser quality before their feet --
and then they might start to ignore us, which we really need to prevent.

That's why I would not have forwarded that report at this point of time,
mainly for these reasons:

 * The initial report is quite old already, as it fell through the
cracks (not good, but happens; sorry Azamat!). Hence in this case it
would definitely be better to *first* ask the reporter to check if the
problem still happens with latest mainline (or at least latest stable)
before involving the kernel developers, as it might have been fixed
already.

 * This might not be an amdgpu bug at all; in fact the other bug the
reporter mentioned was an iommu thing. Hence this might be one of those
regressions where a bisection is the only way to get down to the
problem. Sure, sending a few developers a quick inquiry along the lines
of "do you maybe have an idea what's up there" is fine, but that's not
what you did in your mail. Your list of recipients is also quite long;
that's risky: if you do that too often, they might start ignoring mail
from you.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue

2023-05-02 Thread Linux regression tracking (Thorsten Leemhuis)
On 02.05.23 15:48, Felix Richter wrote:
> On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 02.05.23 15:13, Alex Deucher wrote:
>>> On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
>>> Leemhuis)  wrote:
>>>
>>>> On 30.04.23 13:44, Felix Richter wrote:
>>>>> Hi,
>>>>>
>>>>> I am running into an issue with the integrated GPU of the Ryzen 9
>>>>> 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
>>>>> The bug materializes in the form of my monitor blinking, meaning it
>>>>> turns full white shortly. This happens very often so that the
>>>>> system becomes unpleasant to use.
>>>>>
>>>>> I am running the Archlinux Kernel:
>>>>> The Issue happens on the bleeding edge kernel: 6.2.13
>>>>> Switching back to the LTS kernel resolves the issue: 6.1.26
>>>>>
>>>>> I have two monitors attached to the system. One 42 inch 4k Display
>>>>> and a 24 inch 1080p Display and am running sway as my desktop.
>>>>>
>>>>> Let me know if there is more information I could provide to help
>>>>> narrow down the issue.
>>>> Thanks for the report. To be sure the issue doesn't fall through the
>>>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>>>> tracking bot:
>>>>
>>>> #regzbot ^introduced v6.1..v6.2
>>>> #regzbot title drm: amdgpu: system becomes unpleasant to use after
>>>> monitor starts blinking and turns full white
>>>> #regzbot ignore-activity
>>>>
>>>> This isn't a regression? This issue or a fix for it are already
>>>> discussed somewhere else? It was fixed already? You want to clarify
>>>> when
>>>> the regression started to happen? Or point out I got the title or
>>>> something else totally wrong? Then just reply and tell me -- ideally
>>>> while also telling regzbot about it, as explained by the page listed in
>>>> the footer of this mail.
>>>>
>>>> Developers: When fixing the issue, remember to add 'Link:' tags
>>>> pointing
>>>> to the report (the parent of this mail). See page linked in footer for
>>>> details.
>>> This sounds exactly like the issue that was fixed in this patch which
>>> is already on its way to Linus:
>>> https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9
>> FWIW, you in the flood of emails likely missed that this is the same
>> thread where you yesterday replied "If the module parameter didn't help
>> then perhaps you are seeing some other issue.  Can you bisect?". That's
>> why I decided to add this to the tracking. Or am I missing something
>> obvious here?
>>
>> /me looks around again and can't see anything, but that doesn't have to
>> mean anything...
>>
>> Felix, btw, this guide might help you with the bisection, even if it's
>> just for kernel compilation:
>>
>> https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html
>>
>> And to indirectly reply to your mail from yesterday[1]. You might want
>> to ignore the arch linux kernel git repo and just do a bisection between
>> 6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
>> I'd also try 6.3 or even mainline before that, in case the issue was
>> fixed already.
>>
>> [1]
>> https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279e...@felixrichter.tech/
>>
> Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
> the newest commit.

FWIW, I wonder what you actually mean with "newest commit" here: a
bisection between 6.1 and mainline HEAD might be a waste of time, *if*
this is something that only happens in 6.2.y (say due to a broken or
incomplete backport)

> That was the part I was mostly unsure about … where
> to start from.
> 
> I was planning to use PKGBUILD scripts from arch to achieve the same
> configuration as I would when installing
> the package and just rewrite the script to use a local copy of the
> source code instead of the repository.
> That way I can just use the bisect command, rebuild the package and test
> again.

In my experience, trying to deal with Linux distros' package managers
creates more trouble than it's worth.

> But I probably won't be able to finish it this week, since I am on
> vacation starting tomorrow and will not have access to the computer in
> question. I will be back next week, by that time the patch Alex is
> talking about might
> already be in mainline. So if that fixes it, I will notice and let you
> know. If not I will do the bisection to figure out what the actual issue
> is.

Enjoy your vacation!

Ciao, Thorsten


Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue

2023-05-02 Thread Linux regression tracking (Thorsten Leemhuis)
On 02.05.23 15:13, Alex Deucher wrote:
> On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
> Leemhuis)  wrote:
>
>> On 30.04.23 13:44, Felix Richter wrote:
>>> Hi,
>>>
>>> I am running into an issue with the integrated GPU of the Ryzen 9 7950X. It 
>>> seems to be a regression from kernel version 6.1 to 6.2.
>>> The bug materializes in the form of my monitor blinking, meaning it turns full
>>> white shortly. This happens very often so that the system becomes 
>>> unpleasant to use.
>>>
>>> I am running the Archlinux Kernel:
>>> The Issue happens on the bleeding edge kernel: 6.2.13
>>> Switching back to the LTS kernel resolves the issue: 6.1.26
>>>
>>> I have two monitors attached to the system. One 42 inch 4k Display and a 24 
>>> inch 1080p Display and am running sway as my desktop.
>>>
>>> Let me know if there is more information I could provide to help narrow 
>>> down the issue.
>>
>> Thanks for the report. To be sure the issue doesn't fall through the
>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>> tracking bot:
>>
>> #regzbot ^introduced v6.1..v6.2
>> #regzbot title drm: amdgpu: system becomes unpleasant to use after
>> monitor starts blinking and turns full white
>> #regzbot ignore-activity
>>
>> This isn't a regression? This issue or a fix for it are already
>> discussed somewhere else? It was fixed already? You want to clarify when
>> the regression started to happen? Or point out I got the title or
>> something else totally wrong? Then just reply and tell me -- ideally
>> while also telling regzbot about it, as explained by the page listed in
>> the footer of this mail.
>>
>> Developers: When fixing the issue, remember to add 'Link:' tags pointing
>> to the report (the parent of this mail). See page linked in footer for
>> details.
> 
> This sounds exactly like the issue that was fixed in this patch which
> is already on its way to Linus:
> https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9

FWIW, in the flood of emails you likely missed that this is the same
thread where you replied yesterday: "If the module parameter didn't help
then perhaps you are seeing some other issue.  Can you bisect?". That's
why I decided to add this to the tracking. Or am I missing something
obvious here?

/me looks around again and can't see anything, but that doesn't have to
mean anything...

Felix, btw, this guide might help you with the bisection, even if it's
just for kernel compilation:

https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html

And to indirectly reply to your mail from yesterday[1]. You might want
to ignore the arch linux kernel git repo and just do a bisection between
6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
I'd also try 6.3 or even mainline before that, in case the issue was
fixed already.

[1]
https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279e...@felixrichter.tech/

Ciao, Thorsten


Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue

2023-05-02 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 30.04.23 13:44, Felix Richter wrote:
> Hi,
> 
> I am running into an issue with the integrated GPU of the Ryzen 9 7950X. It 
> seems to be a regression from kernel version 6.1 to 6.2. 
> The bug materializes in the form of my monitor blinking, meaning it turns full
> white shortly. This happens very often so that the system becomes unpleasant 
> to use.
> 
> I am running the Archlinux Kernel:
> The Issue happens on the bleeding edge kernel: 6.2.13
> Switching back to the LTS kernel resolves the issue: 6.1.26
> 
> I have two monitors attached to the system. One 42 inch 4k Display and a 24 
> inch 1080p Display and am running sway as my desktop.
> 
> Let me know if there is more information I could provide to help narrow down 
> the issue.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced v6.1..v6.2
#regzbot title drm: amdgpu: system becomes unpleasant to use after
monitor starts blinking and turns full white
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: [PATCH v3] firmware/sysfb: Fix VESA format selection

2023-04-21 Thread Linux regression tracking (Thorsten Leemhuis)
On 20.04.23 17:57, Pierre Asselin wrote:
> Some legacy BIOSes report no reserved bits in their 32-bit rgb mode,
> breaking the calculation of bits_per_pixel in commit f35cd3fa7729
> ("firmware/sysfb: Fix EFI/VESA format selection").  However they report
> lfb_depth correctly for those modes.  Keep the computation but
> set bits_per_pixel to lfb_depth if the latter is larger.
> 
> v2 fixes the warnings from a max3() macro with arguments of different
> types;  split the bits_per_pixel assignment to avoid uglyfing the code
> with too many casts.
> 
> v3 fixes space and formatting blips pointed out by Javier, and change
> the bit_per_pixel assignment back to a single statement using two casts.
> 
> Link: https://lore.kernel.org/r/4psm6b6lqkz1...@panix3.panix.com
> Link: https://lore.kernel.org/r/20230412150225.3757223-1-javi...@redhat.com
> Link: 
> https://lore.kernel.org/dri-devel/20230418183325.2327-1...@panix.com/T/#u
> Link: 
> https://lore.kernel.org/dri-devel/20230419044834.10816-1...@panix.com/T/#u
> Fixes: f35cd3fa7729 ("firmware/sysfb: Fix EFI/VESA format selection")
> Signed-off-by: Pierre Asselin 
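
For readers following the archive: the fallback described in the commit message above can be sketched in userspace C roughly as below. This is only an illustration under stated assumptions, not the code from the patch; struct fake_screen_info, max_u32() and sketch_bits_per_pixel() are made-up names standing in for the kernel's struct screen_info and its max helpers.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the kernel's struct screen_info. */
struct fake_screen_info {
	uint8_t red_size, red_pos;
	uint8_t green_size, green_pos;
	uint8_t blue_size, blue_pos;
	uint8_t rsvd_size, rsvd_pos;
	uint16_t lfb_depth;
};

/* Single-type maximum, so no mixed-type comparisons are needed. */
static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/*
 * Compute the highest bit used by any channel, then fall back to
 * lfb_depth if the firmware reports a larger value there (for example
 * because it left the reserved bits at zero).
 */
static uint32_t sketch_bits_per_pixel(const struct fake_screen_info *si)
{
	uint32_t bpp = 0;

	bpp = max_u32(bpp, (uint32_t)si->red_size + si->red_pos);
	bpp = max_u32(bpp, (uint32_t)si->green_size + si->green_pos);
	bpp = max_u32(bpp, (uint32_t)si->blue_size + si->blue_pos);
	bpp = max_u32(bpp, (uint32_t)si->rsvd_size + si->rsvd_pos);

	return max_u32(bpp, si->lfb_depth);
}
```

With a BIOS that reports 8:8:8 colour channels, no reserved bits and lfb_depth = 32, the channel arithmetic alone gives 24, while the fallback yields 32.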

Linus might release the final this weekend and this is among the last
few 6.3 regressions I track. Hence please allow me to ask:

Pierre, Tomas, Javier, et al.: how many "legacy BIOSes" do we suspect
are affected by this? So many that it might be worth delaying the
release by one week? And in case everybody involved might agree that
this patch is ready by today or tomorrow: might it be worth asking Linus
to merge this patch directly[1]?

[FWIW, I highly suspect the answer to the last two questions is "no,
that's definitely not worth it", just wanted to confirm]

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

[1] yes, that's a thing we do:
https://lore.kernel.org/all/CAHk-=wis_qqy4odnynnki5b7qhosmxtoj1jxo5wmb6sruwq...@mail.gmail.com/


Re: [PATCH v3 01/13] firmware/sysfb: Fix EFI/VESA format selection

2023-04-16 Thread Linux regression tracking #update (Thorsten Leemhuis)
[TLDR: This mail is primarily relevant for Linux regression tracking. A
change or fix related to the regression discussed in this thread was
posted or applied, but it did not use a Link: tag to point to the
report, as Linus and the documentation call for. Things happen, no
worries -- but now the regression tracking bot needs to be told manually
about the fix. See link in footer if these mails annoy you.]

On 08.04.23 13:26, Linux regression tracking #adding (Thorsten Leemhuis)
wrote:
> 
> On 06.04.23 17:45, Pierre Asselin wrote:
>> Thomas Zimmermann  wrote:
>> [...] 
>> Starting at linux-6.3-rc1 my simplefb picks the wrong mode and garbles
>> the display. This is on a 16-year old i686 laptop.  I can post lshw or
>> dmidecode output if it helps.
>> [...] 
>> I bisected it to f35cd3fa77293c2cd03e94b6a6151e1a7d9309cf
>> firmware/sysfb: Fix EFI/VESA format selection
> 
> Thanks for the report. To be sure the issue doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
> tracking bot:
> 
> #regzbot ^introduced f35cd3fa77293c2cd03e
> #regzbot title firmware/sysfb: wrong mode and display garbled on 16-year
> old i686 laptop
> #regzbot ignore-activity

#regzbot monitor:
https://lore.kernel.org/lkml/20230412150225.3757223-1-javi...@redhat.com/
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: [PATCH v3 01/13] firmware/sysfb: Fix EFI/VESA format selection

2023-04-08 Thread Linux regression tracking #adding (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 06.04.23 17:45, Pierre Asselin wrote:
> Thomas Zimmermann  wrote:
> [...] 
> Starting at linux-6.3-rc1 my simplefb picks the wrong mode and garbles
> the display. This is on a 16-year old i686 laptop.  I can post lshw or
> dmidecode output if it helps.
> [...] 
> I bisected it to f35cd3fa77293c2cd03e94b6a6151e1a7d9309cf
> firmware/sysfb: Fix EFI/VESA format selection

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced f35cd3fa77293c2cd03e
#regzbot title firmware/sysfb: wrong mode and display garbled on 16-year
old i686 laptop
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: linux-6.2-rc4+ hangs on poweroff/reboot: Bisected

2023-03-12 Thread Linux regression tracking (Thorsten Leemhuis)
On 10.03.23 11:20, Karol Herbst wrote:
> On Fri, Mar 10, 2023 at 10:26 AM Chris Clayton  
> wrote:
>>
>> Is it likely that this fix will be submitted to mainline during the ongoing
>> 6.3 development cycle?
>>
> 
> yes, it's already pushed to drm-misc-fixes, which then will go into
> the current devel cycle. I just don't know when it's the next time it
> will be pushed upwards, but it should get there eventually. 

FWIW, the fix landed now as 1b9b4f922f96; sadly without a Link: tag to
the report, hence I have to mark this manually as resolved:

#regzbot fix: 1b9b4f922f96108da3bb5d87b2d603f5dfbc5650

> And
> because it also contains a Fixes tag it will be backported to older
> branches as well.

FWIW, nope, that's not enough: you have to tag those explicitly to ensure
backporting, as explained in
Documentation/process/stable-kernel-rules.rst. Greg points that out every
few weeks, recently here for example:

https://lore.kernel.org/all/y6bwpo9s9qbns...@kroah.com/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

>> Chris
>>
>> On 20/02/2023 22:16, Ben Skeggs wrote:
>>> On Mon, 20 Feb 2023 at 21:27, Karol Herbst  wrote:
>>>>
>>>> On Mon, Feb 20, 2023 at 11:51 AM Chris Clayton  
>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 20/02/2023 05:35, Ben Skeggs wrote:
>>>>>> On Sun, 19 Feb 2023 at 04:55, Chris Clayton  
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 18/02/2023 15:19, Chris Clayton wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 18/02/2023 12:25, Karol Herbst wrote:
>>>>>>>>> On Sat, Feb 18, 2023 at 1:22 PM Chris Clayton 
>>>>>>>>>  wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 15/02/2023 11:09, Karol Herbst wrote:
>>>>>>>>>>> On Wed, Feb 15, 2023 at 11:36 AM Linux regression tracking #update
>>>>>>>>>>> (Thorsten Leemhuis)  wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 13.02.23 10:14, Chris Clayton wrote:
>>>>>>>>>>>>> On 13/02/2023 02:57, Dave Airlie wrote:
>>>>>>>>>>>>>> On Sun, 12 Feb 2023 at 00:43, Chris Clayton 
>>>>>>>>>>>>>>  wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 10/02/2023 19:33, Linux regression tracking (Thorsten 
>>>>>>>>>>>>>>> Leemhuis) wrote:
>>>>>>>>>>>>>>>> On 10.02.23 20:01, Karol Herbst wrote:
>>>>>>>>>>>>>>>>> On Fri, Feb 10, 2023 at 7:35 PM Linux regression tracking 
>>>>>>>>>>>>>>>>> (Thorsten
>>>>>>>>>>>>>>>>> Leemhuis)  wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 08.02.23 09:48, Chris Clayton wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm assuming  that we are not going to see a fix for this 
>>>>>>>>>>>>>>>>>>> regression before 6.2 is released.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yeah, looks like it. That's unfortunate, but happens. But 
>>>>>>>>>>>>>>>>>> there is still
>>>>>>>>>>>>>>>>>> time to fix it and there is one thing I wonder:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Did any of the nouveau developers look at the netconsole 
>>>>>>>>>>>>>>>>>> captures Chris

RE: [PATCH v2 031/101] fbdev/hyperv_fb: Duplicate video-mode option string

2023-03-12 Thread Michael Kelley (LINUX)
From: Thomas Zimmermann  Sent: Thursday, March 9, 2023 
8:01 AM
> 
> Assume that the driver does not own the option string or its substrings
> and hence duplicate the option string for the video mode. As the driver
> implements a very simple mode parser in a fairly unstructured way, just
> duplicate the option string and parse the duplicated memory buffer. Free
> the buffer afterwards.
> 
> Done in preparation of constifying the option string and switching the
> driver to struct option_iter.
> 
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/video/fbdev/hyperv_fb.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
> index 4a6a3303b6b4..edb0555239c6 100644
> --- a/drivers/video/fbdev/hyperv_fb.c
> +++ b/drivers/video/fbdev/hyperv_fb.c
> @@ -903,17 +903,23 @@ static const struct fb_ops hvfb_ops = {
>  static void hvfb_get_option(struct fb_info *info)
>  {
>   struct hvfb_par *par = info->par;
> - char *opt = NULL, *p;
> + char *options = NULL;
> + char *optbuf, *opt, *p;
>   uint x = 0, y = 0;
> 
> - if (fb_get_options(KBUILD_MODNAME, &opt) || !opt || !*opt)
> + if (fb_get_options(KBUILD_MODNAME, &options) || !options || !*options)
>   return;
> 
> + optbuf = kstrdup(options, GFP_KERNEL);
> + if (!optbuf)
> + return;
> + opt = optbuf;
> +
>   p = strsep(&opt, "x");
>   if (!*p || kstrtouint(p, 0, &x) ||
>   !opt || !*opt || kstrtouint(opt, 0, &y)) {
>   pr_err("Screen option is invalid: skipped\n");
> - return;
> + goto out;
>   }
> 
>   if (x < HVFB_WIDTH_MIN || y < HVFB_HEIGHT_MIN ||
> @@ -922,12 +928,14 @@ static void hvfb_get_option(struct fb_info *info)
>   (par->synthvid_version == SYNTHVID_VERSION_WIN8 &&
>x * y * screen_depth / 8 > SYNTHVID_FB_SIZE_WIN8)) {
>   pr_err("Screen resolution option is out of range: skipped\n");
> - return;
> + goto out;
>   }
> 
>   screen_width = x;
>   screen_height = y;
> - return;
> +
> +out:
> + kfree(optbuf);
>  }
> 
>  /*
> --
> 2.39.2

Reviewed-by: Michael Kelley 
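
For anyone reading this in the archive, the duplicate-then-parse pattern from the patch above can be tried out in userspace with a sketch like the following. It is an approximation under assumptions, not the driver code: parse_uint() and parse_mode() are hypothetical helpers, malloc()/memcpy() stand in for kstrdup(), strchr() stands in for strsep(), and strtoul() stands in for kstrtouint().

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for kstrtouint(): the whole string must be a number. */
static int parse_uint(const char *s, unsigned int *out)
{
	char *end;
	unsigned long v;

	if (*s < '0' || *s > '9')	/* rejects "", signs and whitespace */
		return -1;

	errno = 0;
	v = strtoul(s, &end, 0);
	if (errno || *end != '\0' || v > UINT_MAX)
		return -1;

	*out = (unsigned int)v;
	return 0;
}

/*
 * Parse "<width>x<height>" without touching the caller's string.
 * Returns 0 on success and -1 on failure; *w and *h are only written
 * on success.
 */
static int parse_mode(const char *options, unsigned int *w, unsigned int *h)
{
	unsigned int x, y;
	size_t len;
	char *buf, *sep;
	int ret = -1;

	if (!options || !*options)
		return -1;

	len = strlen(options) + 1;
	buf = malloc(len);		/* private copy, like kstrdup() */
	if (!buf)
		return -1;
	memcpy(buf, options, len);

	sep = strchr(buf, 'x');		/* split inside the copy only */
	if (!sep)
		goto out;
	*sep = '\0';

	if (parse_uint(buf, &x) || parse_uint(sep + 1, &y))
		goto out;

	*w = x;
	*h = y;
	ret = 0;
out:
	free(buf);			/* every path after the copy frees it */
	return ret;
}
```

The single "out" label mirrors the structure of the patch: once the copy exists, every exit path goes through the free, which is why the early "return" statements had to become "goto out".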



Re: linux-6.2-rc4+ hangs on poweroff/reboot: Bisected

2023-02-15 Thread Linux regression tracking #update (Thorsten Leemhuis)
On 13.02.23 10:14, Chris Clayton wrote:
> On 13/02/2023 02:57, Dave Airlie wrote:
>> On Sun, 12 Feb 2023 at 00:43, Chris Clayton  wrote:
>>>
>>>
>>>
>>> On 10/02/2023 19:33, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> On 10.02.23 20:01, Karol Herbst wrote:
>>>>> On Fri, Feb 10, 2023 at 7:35 PM Linux regression tracking (Thorsten
>>>>> Leemhuis)  wrote:
>>>>>>
>>>>>> On 08.02.23 09:48, Chris Clayton wrote:
>>>>>>>
>>>>>>> I'm assuming  that we are not going to see a fix for this regression 
>>>>>>> before 6.2 is released.
>>>>>>
>>>>>> Yeah, looks like it. That's unfortunate, but happens. But there is still
>>>>>> time to fix it and there is one thing I wonder:
>>>>>>
>>>>>> Did any of the nouveau developers look at the netconsole captures Chris
>>>>>> posted more than a week ago to check if they somehow help to track down
>>>>>> the root of this problem?
>>>>>
>>>>> I did now and I can't spot anything. I think at this point it would
>>>>> make sense to dump the active tasks/threads via sysrq keys to see if
>>>>> any is in a weird state preventing the machine from shutting down.
>>>>
>>>> Many thx for looking into it!
>>>
>>> Yes, thanks Karol.
>>>
>>> Attached is the output from dmesg when this block of code:
>>>
>>> /bin/mount /dev/sda7 /mnt/sda7
>>> /bin/mountpoint /proc || /bin/mount /proc
>>> /bin/dmesg -w > /mnt/sda7/sysrq.dmesg.log &
>>> /bin/echo t > /proc/sysrq-trigger
>>> /bin/sleep 1
>>> /bin/sync
>>> /bin/sleep 1
>>> kill $(pidof dmesg)
>>> /bin/umount /mnt/sda7
>>>
>>> is executed immediately before /sbin/reboot is called as the final step of 
>>> rebooting my system.
>>>
>>> I hope this is what you were looking for, but if not, please let me know 
>>> what you need
> 
> Thanks Dave. [...]
FWIW, in case anyone lands here in the archives: the msg was
truncated. The full post can be found in a new thread:

https://lore.kernel.org/lkml/e0b80506-b3cf-315b-4327-1b988d860...@googlemail.com/

Sadly it seems the info "With runpm=0, both reboot and poweroff work on
my laptop." didn't bring us much further to a solution. :-/ I don't
really like it, but for regression tracking I'm now putting this on the
back-burner, as a fix is not in sight.

#regzbot monitor:
https://lore.kernel.org/lkml/e0b80506-b3cf-315b-4327-1b988d860...@googlemail.com/
#regzbot backburner: hard to debug and apparently rare
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

#regzbot ignore-activity


Re: linux-6.2-rc4+ hangs on poweroff/reboot: Bisected

2023-02-10 Thread Linux regression tracking (Thorsten Leemhuis)
On 10.02.23 20:01, Karol Herbst wrote:
> On Fri, Feb 10, 2023 at 7:35 PM Linux regression tracking (Thorsten
> Leemhuis)  wrote:
>>
>> On 08.02.23 09:48, Chris Clayton wrote:
>>>
>>> I'm assuming  that we are not going to see a fix for this regression before 
>>> 6.2 is released.
>>
>> Yeah, looks like it. That's unfortunate, but happens. But there is still
>> time to fix it and there is one thing I wonder:
>>
>> Did any of the nouveau developers look at the netconsole captures Chris
>> posted more than a week ago to check if they somehow help to track down
>> the root of this problem?
> 
> I did now and I can't spot anything. I think at this point it would
> make sense to dump the active tasks/threads via sysrq keys to see if
> any is in a weird state preventing the machine from shutting down.

Many thx for looking into it!

Ciao, Thorsten

>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>> --
>> Everything you wanna know about Linux kernel regression tracking:
>> https://linux-regtracking.leemhuis.info/about/#tldr
>> If I did something stupid, please tell me, as explained on that page.
>>
>>> Consequently, I've
>>> implemented a (very simple) workaround. All that happens is that in the 
>>> (sysv) init script that starts and stops SDDM,
>>> the nouveau module is removed once SDDM is stopped. With that in place, my 
>>> system no longer freezes on reboot or poweroff.
>>>
>>> Let me know if I can provide any additional diagnostics although, with the 
>>> problem seemingly occurring so late in the
>>> shutdown process, I may need help on how to go about capturing.
>>>
>>> Chris
>>>
>>> On 02/02/2023 20:45, Chris Clayton wrote:
>>>>
>>>>
>>>> On 01/02/2023 13:51, Chris Clayton wrote:
>>>>>
>>>>>
>>>>> On 30/01/2023 23:27, Ben Skeggs wrote:
>>>>>> On Tue, 31 Jan 2023 at 09:09, Chris Clayton  
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi again.
>>>>>>>
>>>>>>> On 30/01/2023 20:19, Chris Clayton wrote:
>>>>>>>> Thanks, Ben.
>>>>>>>
>>>>>>> 
>>>>>>>
>>>>>>>>> Hey,
>>>>>>>>>
>>>>>>>>> This is a complete shot-in-the-dark, as I don't see this behaviour on
>>>>>>>>> *any* of my boards.  Could you try the attached patch please?
>>>>>>>>
>>>>>>>> Unfortunately, the patch made no difference.
>>>>>>>>
>>>>>>>> I've been looking at how the graphics on my laptop is set up, and have 
>>>>>>>> a bit of a worry about whether the firmware might
>>>>>>>> be playing a part in this problem. In order to offload video decoding 
>>>>>>>> to the NVidia TU117 GPU, it seems the scrubber
>>>>>>>> firmware must be available, but as far as I know, that has not been
>>>>>>>> released by NVidia. To get it to work, I followed
>>>>>>>> what ubuntu have done and the scrubber in 
>>>>>>>> /lib/firmware/nvidia/tu117/nvdec/ is a symlink to
>>>>>>>> ../../tu116/nvdev/scrubber.bin. That, of course, means that some of 
>>>>>>>> the firmware being loaded is for a different card.
>>>>>>>> I note that processing related to firmware is being changed in
>>>>>>>> the patch. Might my set up be at the root of my
>>>>>>>> problem?
>>>>>>>>
>>>>>>>> I'll have a fiddle and see what I can work out.
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Ben.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>> Well, my fiddling has got my system rebooting and shutting down 
>>>>>>> successfully again. I found that if I delete the symlink
>>>>>>> to the scrubber firmware, reboot and shutdown work again. There are 
>>>>>>> however, a number of other files in the tu117
>>>>>>> firmware directory tree that are symlinks to actual files in its

Re: linux-6.2-rc4+ hangs on poweroff/reboot: Bisected

2023-02-10 Thread Linux regression tracking (Thorsten Leemhuis)
On 08.02.23 09:48, Chris Clayton wrote:
> 
> I'm assuming  that we are not going to see a fix for this regression before 
> 6.2 is released.

Yeah, looks like it. That's unfortunate, but happens. But there is still
time to fix it and there is one thing I wonder:

Did any of the nouveau developers look at the netconsole captures Chris
posted more than a week ago to check if they somehow help to track down
the root of this problem?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

> Consequently, I've
> implemented a (very simple) workaround. All that happens is that in the 
> (sysv) init script that starts and stops SDDM,
> the nouveau module is removed once SDDM is stopped. With that in place, my 
> system no longer freezes on reboot or poweroff.
> 
> Let me know if I can provide any additional diagnostics although, with the 
> problem seemingly occurring so late in the
> shutdown process, I may need help on how to go about capturing.
> 
> Chris
> 
> On 02/02/2023 20:45, Chris Clayton wrote:
>>
>>
>> On 01/02/2023 13:51, Chris Clayton wrote:
>>>
>>>
>>> On 30/01/2023 23:27, Ben Skeggs wrote:
>>>> On Tue, 31 Jan 2023 at 09:09, Chris Clayton  
>>>> wrote:
>>>>>
>>>>> Hi again.
>>>>>
>>>>> On 30/01/2023 20:19, Chris Clayton wrote:
>>>>>> Thanks, Ben.
>>>>>
>>>>> 
>>>>>
>>>>>>> Hey,
>>>>>>>
>>>>>>> This is a complete shot-in-the-dark, as I don't see this behaviour on
>>>>>>> *any* of my boards.  Could you try the attached patch please?
>>>>>>
>>>>>> Unfortunately, the patch made no difference.
>>>>>>
>>>>>> I've been looking at how the graphics on my laptop is set up, and have a 
>>>>>> bit of a worry about whether the firmware might
>>>>>> be playing a part in this problem. In order to offload video decoding to 
>>>>>> the NVidia TU117 GPU, it seems the scrubber
>>>>>> firmware must be available, but as far as I know, that has not been
>>>>>> released by NVidia. To get it to work, I followed
>>>>>> what ubuntu have done and the scrubber in 
>>>>>> /lib/firmware/nvidia/tu117/nvdec/ is a symlink to
>>>>>> ../../tu116/nvdev/scrubber.bin. That, of course, means that some of the 
>>>>>> firmware being loaded is for a different card.
>>>>>> I note that processing related to firmware is being changed in
>>>>>> the patch. Might my set up be at the root of my
>>>>>> problem?
>>>>>>
>>>>>> I'll have a fiddle and see what I can work out.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ben.
>>>>>>>
>>>>>>>>
>>>>>
>>>>> Well, my fiddling has got my system rebooting and shutting down 
>>>>> successfully again. I found that if I delete the symlink
>>>>> to the scrubber firmware, reboot and shutdown work again. There are 
>>>>> however, a number of other files in the tu117
>>>>> firmware directory tree that are symlinks to actual files in its
>>>>> tu116 counterpart. So I deleted all of those too.
>>>>> Unfortunately, the absence of one or more of those symlinks causes Xorg 
>>>>> to fail to start. I've reinstated all the links
>>>>> except scrubber and I now have a system that works as it did until I 
>>>>> tried to run a kernel that includes the bad commit
>>>>> I identified in my bisection. That includes offloading video decoding to 
>>>>> the NVidia card, so what ever I read that said
>>>>> the scrubber firmware was needed seems to have been wrong. I get a new 
>>>>> message that (nouveau :01:00.0: fb: VPR
>>>>> locked, but no scrubber binary!), but, hey, we can't have everything.
>>>>>
>>>>> If you still want to get to the bottom of this, let me know what you need 
>>>>> me to provide and I'll do my best. I suspect
>>>>> you might want to because there will be an awful lot of Ubuntu-ba

Re: [REGRESSION] GM20B probe fails after commit 2541626cfb79

2023-02-03 Thread Linux kernel regression tracking (#update)
[TLDR: This mail is primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 05.01.23 13:28, Thorsten Leemhuis wrote:
> On 28.12.22 15:49, Diogo Ivo wrote:
>> Hello,
>>
>> Commit 2541626cfb79 breaks GM20B probe with
>> the following kernel log:
> Just wondering: is anyone looking into this? The report was posted more
> than a week ago and didn't even get a single reply yet afaics. This of
> course can happen at this time of the year, but I nevertheless thought a
> quick status inquiry might be a good idea at this point.

#regzbot fix: drm/nouveau/acr/gm20b: regression fixes
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.



Re: linux-6.2-rc4+ hangs on poweroff/reboot: Bisected

2023-01-27 Thread Linux kernel regression tracking (Thorsten Leemhuis)
On 27.01.23 20:46, Chris Clayton wrote:
> [Resend because the mail client on my phone decided to turn HTML on behind my 
> back, so my reply got bounced.]
> 
> Thanks Thorsten.
> 
> I did try to revert but it didn't revert cleanly and I don't have the
> knowledge to fix it up.
> 
> The patch was part of a merge that included a number of related patches. 
> Tomorrow, I'll try to revert the lot and report
> back.

You are free to do so, but there is no need for that from my side. I
only wanted to know if a simple revert would do the trick; if it
doesn't, in my experience it is often best to leave things to the
developers of the code in question, as they know it best and thus have a
better idea which hidden side effect a more complex revert might have.

Ciao, Thorsten

> On 27/01/2023 11:20, Linux kernel regression tracking (Thorsten Leemhuis) 
> wrote:
>> Hi, this is your Linux kernel regression tracker. Top-posting for once,
>> to make this easily accessible to everyone.
>>
>> @nouveau-maintainers, did anyone take a look at this? The report is
>> already 8 days old and I don't see a single reply. Sure, we'll likely
>> get a -rc8, but still it would be good to not fix this on the finish line.
>>
>> Chris, btw, did you try if you can revert the commit on top of latest
>> mainline? And if so, does it fix the problem?
>>
>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>> --
>> Everything you wanna know about Linux kernel regression tracking:
>> https://linux-regtracking.leemhuis.info/about/#tldr
>> If I did something stupid, please tell me, as explained on that page.
>>
>> #regzbot poke
>>
>> On 19.01.23 15:33, Linux kernel regression tracking (Thorsten Leemhuis)
>> wrote:
>>> [adding various lists and the two other nouveau maintainers to the list
>>> of recipients]
>>
>>> On 18.01.23 21:59, Chris Clayton wrote:
>>>> Hi.
>>>>
>>>> I built and installed the latest development kernel earlier this week.
>>>> I've found that when I try to shut the laptop down (or
>>>> reboot it), it hangs right at the end of closing the current session. The 
>>>> last line I see on  the screen when rebooting is:
>>>>
>>>>    sd 4:0:0:0: [sda] Synchronising SCSI cache
>>>>
>>>> when closing down I see one additional line:
>>>>
>>>>    sd 4:0:0:0 [sda] Stopping disk
>>>>
>>>> In both cases the machine then hangs and I have to hold down the power
>>>> button for a few seconds to switch it off.
>>>>
>>>> Linux 6.1 is OK but 6.2-rc1 hangs, so I bisected between these two and
>>>> landed on:
>>>>
>>>># first bad commit: [0e44c21708761977dcbea9b846b51a6fb684907a] 
>>>> drm/nouveau/flcn: new code to load+boot simple HS FWs
>>>> (VPR scrubber)
>>>>
>>>> I built and installed a kernel with 
>>>> f15cde64b66161bfa74fb58f4e5697d8265b802e (the parent of the bad commit) 
>>>> checked out
>>>> and that shuts down and reboots fine. I then did the same with the bad
>>>> commit checked out and that does indeed hang, so
>>>> I'm confident the bisect outcome is OK.
>>>>
>>>> Kernels 6.1.6 and 5.15.88 are also OK.
>>>>
>>>> My system has dual GPUs - one Intel and one NVIDIA. Related extracts from
>>>> 'lspci -v' are:
>>>>
>>>> 00:02.0 VGA compatible controller: Intel Corporation CometLake-H GT2 [UHD 
>>>> Graphics] (rev 05) (prog-if 00 [VGA controller])
>>>> Subsystem: CLEVO/KAPOK Computer CometLake-H GT2 [UHD Graphics]
>>>>
>>>> Flags: bus master, fast devsel, latency 0, IRQ 142
>>>>
>>>> Memory at c200 (64-bit, non-prefetchable) [size=16M]
>>>>
>>>> Memory at a000 (64-bit, prefetchable) [size=256M]
>>>>
>>>> I/O ports at 5000 [size=64]
>>>>
>>>> Expansion ROM at 000c [virtual] [disabled] [size=128K]
>>>>
>>>> Capabilities: [40] Vendor Specific Information: Len=0c 
>>>>
>>>> Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
>>>>
>>>> Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
>>>>
>>>> Capabilities: [d0] Power Management version 2
>>>>
>>>> Kernel driver in use: i915
>>>>
>>>> Kernel modul

Re: linux-6.2-rc4+ hangs on poweroff/reboot: Bisected

2023-01-27 Thread Linux kernel regression tracking (Thorsten Leemhuis)
Hi, this is your Linux kernel regression tracker. Top-posting for once,
to make this easily accessible to everyone.

@nouveau-maintainers, did anyone take a look at this? The report is
already 8 days old and I don't see a single reply. Sure, we'll likely
get a -rc8, but still it would be good to not fix this on the finish line.

Chris, btw, did you try if you can revert the commit on top of latest
mainline? And if so, does it fix the problem?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 19.01.23 15:33, Linux kernel regression tracking (Thorsten Leemhuis)
wrote:
> [adding various lists and the two other nouveau maintainers to the list
> of recipients]

> On 18.01.23 21:59, Chris Clayton wrote:
>> Hi.
>>
>> I built and installed the latest development kernel earlier this week. I've
>> found that when I try to shut the laptop down (or
>> reboot it), it hangs right at the end of closing the current session. The
>> last line I see on the screen when rebooting is:
>>
>>  sd 4:0:0:0: [sda] Synchronising SCSI cache
>>
>> when closing down I see one additional line:
>>
>>  sd 4:0:0:0 [sda]Stopping disk
>>
>> In both cases the machine then hangs and I have to hold down the power
>> button for a few seconds to switch it off.
>>
>> Linux 6.1 is OK but 6.2-rc1 hangs, so I bisected between these two and landed
>> on:
>>
>>  # first bad commit: [0e44c21708761977dcbea9b846b51a6fb684907a] 
>> drm/nouveau/flcn: new code to load+boot simple HS FWs
>> (VPR scrubber)
>>
>> I built and installed a kernel with f15cde64b66161bfa74fb58f4e5697d8265b802e 
>> (the parent of the bad commit) checked out
>> and that shuts down and reboots fine. I then did the same with the bad
>> commit checked out and that does indeed hang, so
>> I'm confident the bisect outcome is OK.
>>
>> Kernels 6.1.6 and 5.15.88 are also OK.
>>
>> My system has dual GPUs - one Intel and one NVIDIA. Related extracts from
>> 'lspci -v' are:
>>
>> 00:02.0 VGA compatible controller: Intel Corporation CometLake-H GT2 [UHD 
>> Graphics] (rev 05) (prog-if 00 [VGA controller])
>> Subsystem: CLEVO/KAPOK Computer CometLake-H GT2 [UHD Graphics]
>>
>> Flags: bus master, fast devsel, latency 0, IRQ 142
>>
>> Memory at c200 (64-bit, non-prefetchable) [size=16M]
>>
>> Memory at a000 (64-bit, prefetchable) [size=256M]
>>
>> I/O ports at 5000 [size=64]
>>
>> Expansion ROM at 000c [virtual] [disabled] [size=128K]
>>
>> Capabilities: [40] Vendor Specific Information: Len=0c 
>>
>> Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
>>
>> Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
>>
>> Capabilities: [d0] Power Management version 2
>>
>> Kernel driver in use: i915
>>
>> Kernel modules: i915
>>
>>
>> 01:00.0 VGA compatible controller: NVIDIA Corporation TU117M [GeForce GTX 
>> 1650 Ti Mobile] (rev a1) (prog-if 00 [VGA
>> controller])
>> Subsystem: CLEVO/KAPOK Computer TU117M [GeForce GTX 1650 Ti Mobile]
>> Flags: bus master, fast devsel, latency 0, IRQ 141
>> Memory at c400 (32-bit, non-prefetchable) [size=16M]
>> Memory at b000 (64-bit, prefetchable) [size=256M]
>> Memory at c000 (64-bit, prefetchable) [size=32M]
>> I/O ports at 4000 [size=128]
>> Expansion ROM at c300 [disabled] [size=512K]
>> Capabilities: [60] Power Management version 3
>> Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
>> Capabilities: [78] Express Legacy Endpoint, MSI 00
>> Kernel driver in use: nouveau
>> Kernel modules: nouveau
>>
>> DRI_PRIME=1 is exported in one of my init scripts (yes, I am still using 
>> sysvinit).
>>
>> I've attached the bisect.log, but please let me know if I can provide any 
>> other diagnostics. Please cc me as I'm not
>> subscribed.
> 
> Thanks for the report. To be sure the issue doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
> tracking bot:
> 
> #regzbot ^introduced e44c2170876197
> #regzbot title drm: nouveau: hangs on poweroff/reboot
> #regzbot ignore-activity
> 
> This isn't a regression? This is

Re: [PATCH] Revert "drm/display/dp_mst: Move all payload info into the atomic state"

2023-01-27 Thread Linux kernel regression tracking (Thorsten Leemhuis)
On 27.01.23 08:39, Greg KH wrote:
> On Fri, Jan 20, 2023 at 11:51:04AM -0600, Limonciello, Mario wrote:
>> On 1/20/2023 11:46, Guenter Roeck wrote:
>>> On Thu, Jan 12, 2023 at 04:50:44PM +0800, Wayne Lin wrote:
>>>> This reverts commit 4d07b0bc403403438d9cf88450506240c5faf92f.
>>>>
>>>> [Why]
>>>> Changes cause regression on amdgpu mst.
>>>> E.g.
>>>> In fill_dc_mst_payload_table_from_drm(), amdgpu expects to add/remove 
>>>> payload
>>>> one by one and call fill_dc_mst_payload_table_from_drm() to update the HW
>>>> maintained payload table. But previous change tries to go through all the
>>>> payloads in mst_state and update amdpug hw maintained table in once 
>>>> everytime
>>>> driver only tries to add/remove a specific payload stream only. The newly
>>>> design idea conflicts with the implementation in amdgpu nowadays.
>>>>
>>>> [How]
>>>> Revert this patch first. After addressing all regression problems caused by
>>>> this previous patch, will add it back and adjust it.
>>>
>>> Has there been any progress on this revert, or on fixing the underlying
>>> problem ?
>>>
>>> Thanks,
>>> Guenter
>>
>> Hi Guenter,
>>
>> Wayne is OOO for CNY, but let me update you.
>>
>> Harry has sent out this series which is a collection of proper fixes.
>> https://patchwork.freedesktop.org/series/113125/
>>
>> Once that's reviewed and accepted, 4 of them are applicable for 6.1.
> 
> Any hint on when those will be reviewed and accepted?  patchwork doesn't
> show any activity on them, or at least I can't figure it out...

I didn't look closer (hence please correct me if I'm wrong), but the
core changes afaics are in the DRM pull airlied sent a few hours ago to
Linus (note the "amdgpu […] DP MST fixes" line):

https://lore.kernel.org/all/capm%3d9tzuu4xnx6t5v7sksk%2ba5heapoc1iemyznsyqzgztj%3d...@mail.gmail.com/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


Re: linux-6.2-rc4+ hangs on poweroff/reboot: Bisected

2023-01-19 Thread Linux kernel regression tracking (#update)
[TLDR: This mail is primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 19.01.23 15:33, Linux kernel regression tracking (Thorsten Leemhuis)
wrote:
> On 18.01.23 21:59, Chris Clayton wrote:
>>
>>  # first bad commit: [0e44c21708761977dcbea9b846b51a6fb684907a] 
>> drm/nouveau/flcn: new code to load+boot simple HS FWs
>> (VPR scrubber)
>
> #regzbot ^introduced e44c2170876197

/me wonders if he failed to spot or cut'n'paste the leading 0
/me wonders if he needs glasses
#sigh

Sorry for the noise!

#regzbot 0e44c21708761977dc

> #regzbot title drm: nouveau: hangs on poweroff/reboot
> #regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

#regzbot ignore-activity
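
The slip corrected in this mail (the abbreviated id `e44c2170876197` lacks the leading `0` of the bisected commit) is the kind of mistake a plain prefix comparison catches. A minimal sketch in C, assuming a hypothetical helper named `is_commit_prefix`; it is not part of git or regzbot and only illustrates the check:

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git or regzbot): return true if
 * `abbrev` is a non-empty hexadecimal prefix of `full`, compared
 * case-insensitively.
 */
static bool is_commit_prefix(const char *abbrev, const char *full)
{
	size_t n = strlen(abbrev);

	if (n == 0 || n > strlen(full))
		return false;

	for (size_t i = 0; i < n; i++) {
		unsigned char a = (unsigned char)abbrev[i];
		unsigned char f = (unsigned char)full[i];

		/* Reject non-hex characters and any mismatching digit. */
		if (!isxdigit(a) || tolower(a) != tolower(f))
			return false;
	}
	return true;
}
```

Called with the first abbreviation and the full id from the bisect log, the helper returns false; with the corrected abbreviation it returns true.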


Re: linux-6.2-rc4+ hangs on poweroff/reboot: Bisected

2023-01-19 Thread Linux kernel regression tracking (Thorsten Leemhuis)
[adding various lists and the two other nouveau maintainers to the list
of recipients]

For the rest of this mail:

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 18.01.23 21:59, Chris Clayton wrote:
> Hi.
> 
> I built and installed the latest development kernel earlier this week. I've
> found that when I try to shut the laptop down (or
> reboot it), it hangs right at the end of closing the current session. The
> last line I see on the screen when rebooting is:
> 
>   sd 4:0:0:0: [sda] Synchronising SCSI cache
> 
> when closing down I see one additional line:
> 
>   sd 4:0:0:0 [sda]Stopping disk
> 
> In both cases the machine then hangs and I have to hold down the power button
> for a few seconds to switch it off.
>
> Linux 6.1 is OK but 6.2-rc1 hangs, so I bisected between these two and landed
> on:
> 
>   # first bad commit: [0e44c21708761977dcbea9b846b51a6fb684907a] 
> drm/nouveau/flcn: new code to load+boot simple HS FWs
> (VPR scrubber)
> 
> I built and installed a kernel with f15cde64b66161bfa74fb58f4e5697d8265b802e 
> (the parent of the bad commit) checked out
> and that shuts down and reboots fine. I then did the same with the bad commit
> checked out and that does indeed hang, so
> I'm confident the bisect outcome is OK.
> 
> Kernels 6.1.6 and 5.15.88 are also OK.
> 
> My system has dual GPUs - one Intel and one NVIDIA. Related extracts from
> 'lspci -v' are:
> 
> 00:02.0 VGA compatible controller: Intel Corporation CometLake-H GT2 [UHD 
> Graphics] (rev 05) (prog-if 00 [VGA controller])
> Subsystem: CLEVO/KAPOK Computer CometLake-H GT2 [UHD Graphics]
> 
> Flags: bus master, fast devsel, latency 0, IRQ 142
> 
> Memory at c200 (64-bit, non-prefetchable) [size=16M]
> 
> Memory at a000 (64-bit, prefetchable) [size=256M]
> 
> I/O ports at 5000 [size=64]
> 
> Expansion ROM at 000c [virtual] [disabled] [size=128K]
> 
> Capabilities: [40] Vendor Specific Information: Len=0c 
> 
> Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
> 
> Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
> 
> Capabilities: [d0] Power Management version 2
> 
> Kernel driver in use: i915
> 
> Kernel modules: i915
> 
> 
> 01:00.0 VGA compatible controller: NVIDIA Corporation TU117M [GeForce GTX 
> 1650 Ti Mobile] (rev a1) (prog-if 00 [VGA
> controller])
> Subsystem: CLEVO/KAPOK Computer TU117M [GeForce GTX 1650 Ti Mobile]
> Flags: bus master, fast devsel, latency 0, IRQ 141
> Memory at c400 (32-bit, non-prefetchable) [size=16M]
> Memory at b000 (64-bit, prefetchable) [size=256M]
> Memory at c000 (64-bit, prefetchable) [size=32M]
> I/O ports at 4000 [size=128]
> Expansion ROM at c300 [disabled] [size=512K]
> Capabilities: [60] Power Management version 3
> Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
> Capabilities: [78] Express Legacy Endpoint, MSI 00
> Kernel driver in use: nouveau
> Kernel modules: nouveau
> 
> DRI_PRIME=1 is exported in one of my init scripts (yes, I am still using 
> sysvinit).
> 
> I've attached the bisect.log, but please let me know if I can provide any 
> other diagnostics. Please cc me as I'm not
> subscribed.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced e44c2170876197
#regzbot title drm: nouveau: hangs on poweroff/reboot
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


Re: [REGRESSION] GM20B probe fails after commit 2541626cfb79

2023-01-13 Thread Linux kernel regression tracking (Thorsten Leemhuis)
[CCing Daniel]

On 05.01.23 13:28, Thorsten Leemhuis wrote:
> [adding Karol and Lyude to the list of recipients]
> 
> On 28.12.22 15:49, Diogo Ivo wrote:
>> Hello,
>>
>> Commit 2541626cfb79 breaks GM20B probe with
>> the following kernel log:
> Just wondering: is anyone looking into this? The report was posted more
> than a week ago and didn't even get a single reply yet afaics. This of
> course can happen at this time of the year, but I nevertheless thought a
> quick status inquiry might be a good idea at this point.

Hmmm, the report is now more than two weeks old and didn't get a single
reply. My prodding about a week ago also didn't help. Then I guess I
have to bring this to Linus' attention, unless something happens in the
next 2 days.

Diogo, for that it would be really helpful to know: is the issue still
happening with latest mainline? Is it possible to revert 2541626cfb79
easily? And if so: do things work afterwards again?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

>> [2.153892] [ cut here ]
>> [2.153897] WARNING: CPU: 1 PID: 36 at 
>> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c:273 
>> gf100_vmm_valid+0x2c4/0x390
>> [2.153916] Modules linked in:
>> [2.153922] CPU: 1 PID: 36 Comm: kworker/u8:1 Not tainted 6.1.0+ #1
>> [2.153929] Hardware name: Google Pixel C (DT)
>> [2.153933] Workqueue: events_unbound deferred_probe_work_func
>> [2.153943] pstate: 8005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS 
>> BTYPE=--)
>> [2.153950] pc : gf100_vmm_valid+0x2c4/0x390
>> [2.153959] lr : gf100_vmm_valid+0xb4/0x390
>> [2.153966] sp : ffc009e134b0
>> [2.153969] x29: ffc009e134b0 x28:  x27: 
>> ffc008fd44c8
>> [2.153979] x26: ffea x25: ffc0087b98d0 x24: 
>> ff8080f89038
>> [2.153987] x23: ff8081fadc08 x22:  x21: 
>> 
>> [2.153995] x20: ff8080f8a000 x19: ffc009e13678 x18: 
>> 
>> [2.154003] x17: f37a8b93418958e6 x16: ffc009f0d000 x15: 
>> 
>> [2.154011] x14: 0002 x13: 0003a020 x12: 
>> ffc00800
>> [2.154019] x11: 000102913000 x10:  x9 : 
>> 
>> [2.154026] x8 : ffc009e136d8 x7 : ffc008fd44c8 x6 : 
>> ff80803d0f00
>> [2.154034] x5 :  x4 : ff8080f88c00 x3 : 
>> 0010
>> [2.154041] x2 : 000c x1 : ffea x0 : 
>> ffea
>> [2.154050] Call trace:
>> [2.154053]  gf100_vmm_valid+0x2c4/0x390
>> [2.154061]  nvkm_vmm_map_valid+0xd4/0x204
>> [2.154069]  nvkm_vmm_map_locked+0xa4/0x344
>> [2.154076]  nvkm_vmm_map+0x50/0x84
>> [2.154083]  nvkm_firmware_mem_map+0x84/0xc4
>> [2.154094]  nvkm_falcon_fw_oneinit+0xc8/0x320
>> [2.154101]  nvkm_acr_oneinit+0x428/0x5b0
>> [2.154109]  nvkm_subdev_oneinit_+0x50/0x104
>> [2.154114]  nvkm_subdev_init_+0x3c/0x12c
>> [2.154119]  nvkm_subdev_init+0x60/0xa0
>> [2.154125]  nvkm_device_init+0x14c/0x2a0
>> [2.154133]  nvkm_udevice_init+0x60/0x9c
>> [2.154140]  nvkm_object_init+0x48/0x1b0
>> [2.154144]  nvkm_ioctl_new+0x168/0x254
>> [2.154149]  nvkm_ioctl+0xd0/0x220
>> [2.154153]  nvkm_client_ioctl+0x10/0x1c
>> [2.154162]  nvif_object_ctor+0xf4/0x22c
>> [2.154168]  nvif_device_ctor+0x28/0x70
>> [2.154174]  nouveau_cli_init+0x150/0x590
>> [2.154180]  nouveau_drm_device_init+0x60/0x2a0
>> [2.154187]  nouveau_platform_device_create+0x90/0xd0
>> [2.154193]  nouveau_platform_probe+0x3c/0x9c
>> [2.154200]  platform_probe+0x68/0xc0
>> [2.154207]  really_probe+0xbc/0x2dc
>> [2.154211]  __driver_probe_device+0x78/0xe0
>> [2.154216]  driver_probe_device+0xd8/0x160
>> [2.154221]  __device_attach_driver+0xb8/0x134
>> [2.154226]  bus_for_each_drv+0x78/0xd0
>> [2.154230]  __device_attach+0x9c/0x1a0
>> [2.154234]  device_initial_probe+0x14/0x20
>> [2.154239]  bus_probe_device+0x98/0xa0
>> [2.154243]  deferred_probe_work_func+0x88/0xc0
>> [2.154247]  process_one_work+0x204/0x40c
>> [2.154256]  worker_thread+0x230/0x450
>> [2.154261]  kthread+0xc8/0xcc
>> [2.154266]  ret_from_fork+0x10/0x20
>> [2.154273] 

RE: [PATCH 08/18] fbdev/hyperv-fb: Do not set struct fb_info.apertures

2022-12-29 Thread Michael Kelley (LINUX)
From: Thomas Zimmermann  Sent: Monday, December 19, 2022 
8:05 AM
> 
> Generic fbdev drivers use the apertures field in struct fb_info to
> control ownership of the framebuffer memory and graphics device. Do
> not set the values in hyperv-fb.
> 
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/video/fbdev/hyperv_fb.c | 17 ++---
>  1 file changed, 6 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
> index d8edb5635f77..1c7d6ff5a6c0 100644
> --- a/drivers/video/fbdev/hyperv_fb.c
> +++ b/drivers/video/fbdev/hyperv_fb.c
> @@ -988,13 +988,10 @@ static int hvfb_getmem(struct hv_device *hdev, struct 
> fb_info *info)
>   struct pci_dev *pdev  = NULL;
>   void __iomem *fb_virt;
>   int gen2vm = efi_enabled(EFI_BOOT);
> + resource_size_t base, size;
>   phys_addr_t paddr;
>   int ret;
> 
> - info->apertures = alloc_apertures(1);
> - if (!info->apertures)
> - return -ENOMEM;
> -
>   if (!gen2vm) {
>   pdev = pci_get_device(PCI_VENDOR_ID_MICROSOFT,
>   PCI_DEVICE_ID_HYPERV_VIDEO, NULL);
> @@ -1003,8 +1000,8 @@ static int hvfb_getmem(struct hv_device *hdev, struct 
> fb_info *info)
>   return -ENODEV;
>   }
> 
> - info->apertures->ranges[0].base = pci_resource_start(pdev, 0);
> - info->apertures->ranges[0].size = pci_resource_len(pdev, 0);
> + base = pci_resource_start(pdev, 0);
> + size = pci_resource_len(pdev, 0);
> 
>   /*
>* For Gen 1 VM, we can directly use the contiguous memory
> @@ -1027,8 +1024,8 @@ static int hvfb_getmem(struct hv_device *hdev, struct 
> fb_info *info)
>   }
>   pr_info("Unable to allocate enough contiguous physical memory 
> on Gen 1 VM. Using MMIO instead.\n");
>   } else {
> - info->apertures->ranges[0].base = screen_info.lfb_base;
> - info->apertures->ranges[0].size = screen_info.lfb_size;
> + base = screen_info.lfb_base;
> + size = screen_info.lfb_size;
>   }
> 
>   /*
> @@ -1070,9 +1067,7 @@ static int hvfb_getmem(struct hv_device *hdev, struct 
> fb_info *info)
>   info->screen_size = dio_fb_size;
> 
>  getmem_done:
> - aperture_remove_conflicting_devices(info->apertures->ranges[0].base,
> - info->apertures->ranges[0].size,
> - false, KBUILD_MODNAME);
> + aperture_remove_conflicting_devices(base, size, false, KBUILD_MODNAME);
> 
>   if (gen2vm) {
>   /* framebuffer is reallocated, clear screen_info to avoid 
> misuse from kexec */
> --
> 2.39.0

Reviewed-by: Michael Kelley 



[PATCH] drm/amd/display: fix array-bounds errors in dc_stream_remove_writeback()

2022-12-26 Thread wenyang . linux
From: Wen Yang 

The following errors occurred when using gcc 7.5.0-3ubuntu1~18.04:
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.c: In function 
‘dc_stream_remove_writeback’:
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.c:543:55: warning: 
array subscript is above array bounds [-Warray-bounds]
 stream->writeback_info[j] = stream->writeback_info[i];
 ~~^~~
Add a check to make sure that num_wb_info won't overflow the writeback_info
buffer.

Fixes: 6fbefb84a98e ("drm/amd/display: Add DC core changes for DCN2")

Signed-off-by: Wen Yang 
Cc: Aurabindo Pillai 
Cc: Hamza Mahfooz 
Cc: Guenter Roeck 
Cc: Alex Deucher 
Cc: Harry Wentland 
Cc: Leo Li 
Cc: amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
---
 drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
index 20e534f73513..9825c30f2ca0 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
@@ -481,6 +481,7 @@ bool dc_stream_add_writeback(struct dc *dc,
}
 
if (!isDrc) {
+   ASSERT(stream->num_wb_info + 1 <= MAX_DWB_PIPES);
stream->writeback_info[stream->num_wb_info++] = *wb_info;
}
 
@@ -526,6 +527,11 @@ bool dc_stream_remove_writeback(struct dc *dc,
return false;
}
 
+   if (stream->num_wb_info > MAX_DWB_PIPES) {
+   dm_error("DC: num_wb_info is invalid!\n");
+   return false;
+   }
+
 // stream->writeback_info[dwb_pipe_inst].wb_enabled = false;
for (i = 0; i < stream->num_wb_info; i++) {
/*dynamic update*/
@@ -540,7 +546,8 @@ bool dc_stream_remove_writeback(struct dc *dc,
if (stream->writeback_info[i].wb_enabled) {
if (j < i)
/* trim the array */
-   stream->writeback_info[j] = 
stream->writeback_info[i];
+   memcpy(&stream->writeback_info[j], 
&stream->writeback_info[i],
+   sizeof(struct 
dc_writeback_info));
j++;
}
}
-- 
2.25.1
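
The idea behind the patch above (reject an out-of-range element count before compacting the array in place) can be sketched outside the driver. This is a minimal illustration in C, not the DC code: `struct entry`, `struct table`, `table_compact` and the capacity `MAX_ENTRIES` are hypothetical names, with `MAX_ENTRIES` standing in for `MAX_DWB_PIPES` and its value chosen only for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 4	/* illustrative capacity, standing in for MAX_DWB_PIPES */

struct entry {
	int id;
	bool enabled;
};

struct table {
	struct entry items[MAX_ENTRIES];
	size_t count;	/* number of valid items; must not exceed MAX_ENTRIES */
};

/*
 * Drop disabled entries by shifting enabled ones toward the front.
 * Returns false without touching the table if `count` exceeds capacity,
 * so the loop below can never index past the end of `items`.
 */
static bool table_compact(struct table *t)
{
	size_t i, j = 0;

	if (t->count > MAX_ENTRIES)
		return false;

	for (i = 0; i < t->count; i++) {
		if (!t->items[i].enabled)
			continue;
		if (j < i)
			t->items[j] = t->items[i];	/* trim the array */
		j++;
	}
	t->count = j;
	return true;
}
```

Because the count is validated before the loop, both `i` and `j` stay below `MAX_ENTRIES` for every access.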



RE: [PATCH] drm/hyperv: Don't overwrite dirt_needed value set by host

2022-09-22 Thread Michael Kelley (LINUX)
From: Saurabh Sengar  Sent: Monday, September 12, 
2022 8:33 AM
> 
> Existing code is causing a race condition where dirt_needed value is
> already set by the host and gets overwritten with default value. Remove
> this default setting of dirt_needed, to avoid overwriting the value
> received in the channel callback set by vmbus_open. Removing this
> setting also means the default value for dirt_needed is changed to false
> as it's allocated by kzalloc which is similar to legacy hyperv_fb driver.
> 
> Signed-off-by: Saurabh Sengar 
> ---
>  drivers/gpu/drm/hyperv/hyperv_drm_drv.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> index 4a8941fa0815..57d49a08b37f 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> @@ -198,8 +198,6 @@ static int hyperv_vmbus_probe(struct hv_device *hdev,
>   if (ret)
>   drm_warn(dev, "Failed to update vram location.\n");
> 
> - hv->dirt_needed = true;
> -
>   ret = hyperv_mode_config_init(hv);
>   if (ret)
>   goto err_vmbus_close;
> --
> 2.31.1

Reviewed-by: Michael Kelley 



RE: [PATCH] drm/hyperv: Add ratelimit on error message

2022-09-10 Thread Michael Kelley (LINUX)
From: Saurabh Sengar  Sent: Friday, September 9, 
2022 8:10 AM
> 
> Due to a full ring buffer, the driver may be unable to send updates to
> the Hyper-V host.  But outputting the error message can make the problem
> worse because console output is also typically written to the frame
> buffer.
> Rate limit the error message, and also output the error code for additional
> diagnosability.
> 
> Signed-off-by: Saurabh Sengar 
> ---
>  drivers/gpu/drm/hyperv/hyperv_drm_proto.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> index 76a182a..013a782 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> @@ -208,7 +208,7 @@ static inline int hyperv_sendpacket(struct hv_device 
> *hdev,
> struct synthvid_msg
>  VM_PKT_DATA_INBAND, 0);
> 
>   if (ret)
> - drm_err(>dev, "Unable to send packet via vmbus\n");
> + drm_err_ratelimited(>dev, "Unable to send packet via vmbus; 
> error %d\n", ret);
> 
>   return ret;
>  }
> --
> 1.8.3.1

Reviewed-by: Michael Kelley 
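
For readers unfamiliar with the idea behind the `_ratelimited` variant used in the patch above, here is a rough userspace sketch of an interval/burst limiter in C. It is an illustration only and not the kernel's implementation; `struct ratelimit`, `ratelimit_allow` and the tick-based clock are assumptions made for the example, and `now` is assumed to be monotonic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative interval/burst limiter; not the kernel's implementation. */
struct ratelimit {
	uint64_t interval;	/* window length, in caller-defined ticks */
	unsigned int burst;	/* messages allowed per window */
	uint64_t window_start;	/* tick at which the current window began */
	unsigned int printed;	/* messages emitted in the current window */
	unsigned int missed;	/* messages suppressed in the current window */
};

/* Return true if a message may be emitted at time `now` (monotonic). */
static bool ratelimit_allow(struct ratelimit *rl, uint64_t now)
{
	if (now - rl->window_start >= rl->interval) {
		/* A new window begins: reset the counters. */
		rl->window_start = now;
		rl->printed = 0;
		rl->missed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->missed++;
	return false;
}
```

With a burst of 2 and an interval of 5 ticks, the first two calls in a window are allowed and further calls are suppressed until the window rolls over, which bounds how much console output a persistently full ring buffer can generate.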



RE: [PATCH] drm/hyperv: Don't rely on screen_info.lfb_base for Gen1 VMs

2022-09-10 Thread Michael Kelley (LINUX)
From: Saurabh Sengar  Sent: Friday, September 9, 
2022 7:44 AM
> 
> hyperv_setup_vram tries to remove conflicting framebuffer based on
> 'screen_info'. As observed in the past, due to some bug or wrong setting
> in grub, the 'screen_info' fields may not be set for Gen1, and in such
> cases drm_aperture_remove_conflicting_framebuffers will not do anything
> useful.
> For Gen1 VMs, it should always be possible to get framebuffer
> conflict removed using PCI device instead.
> 
> Fixes: a0ab5abced55 ("drm/hyperv : Removing the restruction of VRAM 
> allocation with PCI bar size")
> Signed-off-by: Saurabh Sengar 
> ---
>  drivers/gpu/drm/hyperv/hyperv_drm_drv.c | 24 
>  1 file changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> index 6d11e7938c83..b0cc974efa45 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> @@ -73,12 +73,28 @@ static int hyperv_setup_vram(struct hyperv_drm_device *hv,
>struct hv_device *hdev)
>  {
>   struct drm_device *dev = &hv->dev;
> + struct pci_dev *pdev;
>   int ret;
> 
> - drm_aperture_remove_conflicting_framebuffers(screen_info.lfb_base,
> -  screen_info.lfb_size,
> -  false,
> -  _driver);
> + if (efi_enabled(EFI_BOOT)) {
> + 
> drm_aperture_remove_conflicting_framebuffers(screen_info.lfb_base,
> +  
> screen_info.lfb_size,
> +  false,
> +  _driver);
> + } else {
> + pdev = pci_get_device(PCI_VENDOR_ID_MICROSOFT, 
> PCI_DEVICE_ID_HYPERV_VIDEO, NULL);
> + if (!pdev) {
> + drm_err(dev, "Unable to find PCI Hyper-V video\n");
> + return -ENODEV;
> + }
> +
> + ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, 
> _driver);
> + pci_dev_put(pdev);
> + if (ret) {
> + drm_err(dev, "Not able to remove boot fb\n");
> + return ret;
> + }
> + }
> 
>   hv->fb_size = (unsigned long)hv->mmio_megabytes * 1024 * 1024;
> 
> --
> 2.34.1

Reviewed-by: Michael Kelley 


RE: [PATCH v2 3/3] Drivers: hv: Never allocate anything besides framebuffer from framebuffer memory region

2022-08-25 Thread Michael Kelley (LINUX)
From: Vitaly Kuznetsov  Sent: Thursday, August 25, 2022 
2:00 AM
> 
> Passed-through PCI devices sometimes misbehave on Gen1 VMs when the Hyper-V
> DRM driver is also loaded. Looking at IOMEM assignment, we can see e.g.
> 
> $ cat /proc/iomem
> ...
> f800-fffb : PCI Bus :00
>   f800-fbff : :00:08.0
> f800-f8001fff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
> ...
> fe000-f : PCI Bus :00
>   fe000-fe07f : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
> fe000-fe07f : 2ba2:00:02.0
>   fe000-fe07f : mlx4_core
> 
> the interesting part is the 'f800' region as it is actually the
> VM's framebuffer:
> 
> $ lspci -v
> ...
> :00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual 
> VGA
> (prog-if 00 [VGA controller])
>   Flags: bus master, fast devsel, latency 0, IRQ 11
>   Memory at f800 (32-bit, non-prefetchable) [size=64M]
> ...
> 
>  hv_vmbus: registering driver hyperv_drm
>  hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Synthvid Version 
> major 3, minor 5
>  hyperv_drm :00:08.0: vgaarb: deactivate vga console
>  hyperv_drm :00:08.0: BAR 0: can't reserve [mem 0xf800-0xfbff]
>  hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Cannot request 
> framebuffer, boot fb still active?
> 
> Note: "Cannot request framebuffer" is not a fatal error in
> hyperv_setup_gen1() as the code assumes there's some other framebuffer
> device there but we actually have some other PCI device (mlx4 in this
> case) config space there!

My apologies for not getting around to commenting on the previous
version of this patch.  The function hyperv_setup_gen1() and the
"Cannot request framebuffer" message have gone away as of
commit a0ab5abced55.

> 
> The problem appears to be that vmbus_allocate_mmio() can allocate from
> the reserved framebuffer region (fb_overlap_ok), however, if the
> request to allocate MMIO comes from some other device before
> framebuffer region is taken, it can happily use framebuffer region for
> it. 

Interesting. I had never looked at the details of vmbus_allocate_mmio().
The semantics one might assume of a parameter named "fb_overlap_ok"
aren't implemented because !fb_overlap_ok essentially has no effect.   The
existing semantics are really "prefer_fb_overlap".  This patch implements
the expected and needed semantics, which is to not allocate from the frame
buffer space when !fb_overlap_ok.

If that's an accurate high level summary, maybe this commit message
could describe it that way?  The other details you provide about what can
go wrong should still be included as well.
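
To make the intended semantics concrete, here is a small standalone sketch in C of "do not allocate from the frame buffer space when !fb_overlap_ok". It is not the code from vmbus_drv.c: `struct range`, `ranges_overlap` and `candidate_allowed` are hypothetical names, and the sketch uses the general inclusive-interval overlap test rather than the endpoint checks in the quoted hunk, so it also rejects a candidate that fully contains the frame buffer region.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range [start, end], similar in spirit to struct resource. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Two inclusive ranges overlap unless one ends before the other begins.
 * This also covers a range that fully contains the other one.
 */
static bool ranges_overlap(struct range a, struct range b)
{
	return a.start <= b.end && b.start <= a.end;
}

/*
 * Hypothetical policy helper: may [start, start + size - 1] be handed out?
 * `size` must be non-zero. `fb` may be NULL if no frame buffer region is known.
 */
static bool candidate_allowed(uint64_t start, uint64_t size,
			      const struct range *fb, bool fb_overlap_ok)
{
	struct range cand = { start, start + size - 1 };

	if (fb_overlap_ok || fb == NULL)
		return true;
	return !ranges_overlap(cand, *fb);
}
```

With the frame buffer at f8000000-fbffffff as in the example output, a non-FB request for 0x2000 bytes at f8000000 is refused, while the same request with fb_overlap_ok set is accepted.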

> Note, Gen2 VMs are usually unaffected by the issue because
> framebuffer region is already taken by EFI fb (in case kernel supports
> it) but Gen1 VMs may have this region unclaimed by the time Hyper-V PCI
> pass-through driver tries allocating MMIO space if Hyper-V DRM/FB drivers
> load after it. Devices can be brought up in any sequence so let's
> resolve the issue by always ignoring 'fb_mmio' region for non-FB
> requests, even if the region is unclaimed.
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  drivers/hv/vmbus_drv.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 536f68e563c6..3c833ea60db6 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2331,7 +2331,7 @@ int vmbus_allocate_mmio(struct resource **new, struct
> hv_device *device_obj,
>   bool fb_overlap_ok)
>  {
>   struct resource *iter, *shadow;
> - resource_size_t range_min, range_max, start;
> + resource_size_t range_min, range_max, start, end;
>   const char *dev_n = dev_name(&device_obj->device);
>   int retval;
> 
> @@ -2366,6 +2366,14 @@ int vmbus_allocate_mmio(struct resource **new, struct
> hv_device *device_obj,
>   range_max = iter->end;
>   start = (range_min + align - 1) & ~(align - 1);
>   for (; start + size - 1 <= range_max; start += align) {
> + end = start + size - 1;
> +
> + /* Skip the whole fb_mmio region if not fb_overlap_ok */
> + if (!fb_overlap_ok && fb_mmio &&
> + (((start >= fb_mmio->start) && (start <= 
> fb_mmio->end)) ||
> +  ((end >= fb_mmio->start) && (end <= 
> fb_mmio->end
> + continue;
> +
>   shadow = __request_region(iter, start, size, NULL,
> IORESOURCE_BUSY);
>   if (!shadow)
> --
> 2.37.1

Other than my musings on the commit message,

Reviewed-by: Michael Kelley 



RE: [PATCH v2 2/3] Drivers: hv: Always reserve framebuffer region for Gen1 VMs

2022-08-25 Thread Michael Kelley (LINUX)
From: Vitaly Kuznetsov  Sent: Thursday, August 25, 2022 
2:00 AM
> 
> vmbus_reserve_fb() tries reserving framebuffer region iff
> 'screen_info.lfb_base' is set. Gen2 VMs seem to have it set by EFI fb

Just so I'm clear, by "EFI fb" you mean the EFI layer code that sets
up the frame buffer before the Linux kernel ever boots, right?
You are not referring to the Linux kernel EFI framebuffer
driver, which may or may not be configured in the kernel.

> (or, in some edge cases like kexec, the address where the buffer was
> moved, see 
> https://lore.kernel.org/all/20201014092429.1415040-1-kas...@redhat.com/)
> but on Gen1 VM it depends on bootloader behavior. With grub, it depends
> on 'gfxpayload=' setting but in some cases it is observed to be zero.
> Relying on 'screen_info.lfb_base' to reserve framebuffer region is
> risky. Instead, it is possible to get the address from the dedicated
> PCI device which is always present.
> 
> Check for legacy PCI video device presence and reserve the whole
> region for framebuffer on Gen1 VMs.
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  drivers/hv/vmbus_drv.c | 46 +-
>  1 file changed, 32 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 23c680d1a0f5..536f68e563c6 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include "hyperv_vmbus.h"
> 
> @@ -2262,26 +2263,43 @@ static int vmbus_acpi_remove(struct acpi_device 
> *device)
> 
>  static void vmbus_reserve_fb(void)
>  {
> - int size;
> + resource_size_t start = 0, size;
> + struct pci_dev *pdev;
> +
> + if (efi_enabled(EFI_BOOT)) {
> + /* Gen2 VM: get FB base from EFI framebuffer */
> + start = screen_info.lfb_base;
> + size = max_t(__u32, screen_info.lfb_size, 0x800000);
> + } else {
> + /* Gen1 VM: get FB base from PCI */
> + pdev = pci_get_device(PCI_VENDOR_ID_MICROSOFT,
> +   PCI_DEVICE_ID_HYPERV_VIDEO, NULL);
> + if (!pdev)
> + return;
> +
> + if (pdev->resource[0].flags & IORESOURCE_MEM) {
> + start = pci_resource_start(pdev, 0);
> + size = pci_resource_len(pdev, 0);
> + }
> +
> + /*
> +  * Release the PCI device so hyperv_drm or hyperv_fb driver can
> +  * grab it later.
> +  */
> + pci_dev_put(pdev);
> + }
> +
> + if (!start)
> + return;
> +
>   /*
>* Make a claim for the frame buffer in the resource tree under the
>* first node, which will be the one below 4GB.  The length seems to
>* be underreported, particularly in a Generation 1 VM.  So start out
>* reserving a larger area and make it smaller until it succeeds.
>*/
> -
> - if (screen_info.lfb_base) {
> - if (efi_enabled(EFI_BOOT))
> - size = max_t(__u32, screen_info.lfb_size, 0x800000);
> - else
> - size = max_t(__u32, screen_info.lfb_size, 0x4000000);
> -
> - for (; !fb_mmio && (size >= 0x100000); size >>= 1) {
> - fb_mmio = __request_region(hyperv_mmio,
> -screen_info.lfb_base, size,
> -fb_mmio_name, 0);
> - }
> - }
> + for (; !fb_mmio && (size >= 0x100000); size >>= 1)
> + fb_mmio = __request_region(hyperv_mmio, start, size, fb_mmio_name, 0);
>  }
> 
>  /**
> --
> 2.37.1

Reviewed-by: Michael Kelley 
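
The "start large and make it smaller until it succeeds" loop at the end of vmbus_reserve_fb() can be illustrated in user space. This is only a sketch under stated assumptions: `try_reserve()` is a hypothetical stand-in for `__request_region()` that succeeds when the request fits below a limit, and the sizes used are illustrative, not the kernel's constants.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for __request_region(): the reservation succeeds
 * only if [start, start + size) ends at or below 'limit'.
 */
static bool try_reserve(uint64_t start, uint64_t size, uint64_t limit)
{
    return start + size <= limit;
}

/*
 * Try to reserve 'size' bytes at 'start'; on failure halve the size and
 * retry, giving up once the size drops below 'min_size' (must be nonzero).
 * Returns the size actually reserved, or 0 if every attempt failed.
 */
static uint64_t reserve_shrinking(uint64_t start, uint64_t size,
                                  uint64_t min_size, uint64_t limit)
{
    for (; size >= min_size; size >>= 1) {
        if (try_reserve(start, size, limit))
            return size;
    }
    return 0;
}
```

Starting at 0x1000 with an initial size of 0x8000 and a limit of 0x4000, the loop fails at 0x8000 and 0x4000 and succeeds at 0x2000.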


RE: [PATCH v2 1/3] PCI: Move PCI_VENDOR_ID_MICROSOFT/PCI_DEVICE_ID_HYPERV_VIDEO definitions to pci_ids.h

2022-08-25 Thread Michael Kelley (LINUX)
From: Vitaly Kuznetsov  Sent: Thursday, August 25, 2022 2:00 AM
> 
> There are already three places in kernel which define PCI_VENDOR_ID_MICROSOFT
> and two for PCI_DEVICE_ID_HYPERV_VIDEO and there's a need to use these
> from core Vmbus code. Move the defines where they belong.
> 
> No functional change.
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  drivers/gpu/drm/hyperv/hyperv_drm_drv.c | 3 ---
>  drivers/net/ethernet/microsoft/mana/gdma_main.c | 4 
>  drivers/video/fbdev/hyperv_fb.c     | 4 
>  include/linux/pci_ids.h | 3 +++
>  4 files changed, 3 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> index 6d11e7938c83..40888e36f91a 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> @@ -23,9 +23,6 @@
>  #define DRIVER_MAJOR 1
>  #define DRIVER_MINOR 0
> 
> -#define PCI_VENDOR_ID_MICROSOFT 0x1414
> -#define PCI_DEVICE_ID_HYPERV_VIDEO 0x5353
> -
>  DEFINE_DRM_GEM_FOPS(hv_fops);
> 
>  static struct drm_driver hyperv_driver = {
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 5f9240182351..00d8198072ae 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -1465,10 +1465,6 @@ static void mana_gd_shutdown(struct pci_dev *pdev)
>   pci_disable_device(pdev);
>  }
> 
> -#ifndef PCI_VENDOR_ID_MICROSOFT
> -#define PCI_VENDOR_ID_MICROSOFT 0x1414
> -#endif
> -
>  static const struct pci_device_id mana_id_table[] = {
>   { PCI_DEVICE(PCI_VENDOR_ID_MICROSOFT, MANA_PF_DEVICE_ID) },
>   { PCI_DEVICE(PCI_VENDOR_ID_MICROSOFT, MANA_VF_DEVICE_ID) },
> diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
> index 886c564787f1..b58b445bb529 100644
> --- a/drivers/video/fbdev/hyperv_fb.c
> +++ b/drivers/video/fbdev/hyperv_fb.c
> @@ -74,10 +74,6 @@
>  #define SYNTHVID_DEPTH_WIN8 32
>  #define SYNTHVID_FB_SIZE_WIN8 (8 * 1024 * 1024)
> 
> -#define PCI_VENDOR_ID_MICROSOFT 0x1414
> -#define PCI_DEVICE_ID_HYPERV_VIDEO 0x5353
> -
> -
>  enum pipe_msg_type {
>   PIPE_MSG_INVALID,
>   PIPE_MSG_DATA,
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 6feade66efdb..15b49e655ce3 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2079,6 +2079,9 @@
>  #define PCI_DEVICE_ID_ICE_1712   0x1712
>  #define PCI_DEVICE_ID_VT1724 0x1724
> 
> +#define PCI_VENDOR_ID_MICROSOFT  0x1414
> +#define PCI_DEVICE_ID_HYPERV_VIDEO   0x5353
> +
>  #define PCI_VENDOR_ID_OXSEMI 0x1415
>  #define PCI_DEVICE_ID_OXSEMI_12PCI840 0x8403
>  #define PCI_DEVICE_ID_OXSEMI_PCIe840 0xC000
> --
> 2.37.1

Reviewed-by: Michael Kelley 



RE: [PATCH v1 3/4] Drivers: hv: Always reserve framebuffer region for Gen1 VMs

2022-08-23 Thread Michael Kelley (LINUX)
From: Vitaly Kuznetsov  Sent: Thursday, August 18, 2022 7:25 AM
> 
> vmbus_reserve_fb() tries reserving framebuffer region iff
> screen_info.lfb_base is set. Gen2 VMs seem to have it set by EFI fb
> but on Gen1 VM it is observed to be zero. 

FWIW, in a Gen1 VM, whether screen_info.lfb_base is set depends on what
grub sets up, which in turn seems to depend on the gfxpayload= setting
in grub.cfg and certain versions of grub.  There are cases where it is
observed to be zero, but from our experiments it's not all cases.

In a Gen2 VM, there's an edge case where the frame buffer has been
moved, and a kexec() kernel may see the moved location instead of
what was set by EFI.  See
https://lore.kernel.org/all/20201014092429.1415040-1-kas...@redhat.com/

I think these points may be worth recording in the commit message here
so that there's an accurate record for the future.  The Hyper-V and grub
idiosyncrasies make this a very tricky area.

> In fact, we do not need to
> rely on some other video driver setting it correctly as Gen1 VMs have
> a dedicated PCI device to look at. Both Hyper-V DRM and Hyper-V FB
> drivers get framebuffer base from this PCI device already so Vmbus
> driver can do the same trick.
> 
> Check for legacy PCI video device presence and reserve the whole
> region for framebuffer.
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  drivers/hv/vmbus_drv.c | 47 +-
>  1 file changed, 33 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 547ae334e5cd..6edaeefa2c3c 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include "hyperv_vmbus.h"
> 
> @@ -2258,26 +2259,44 @@ static int vmbus_acpi_remove(struct acpi_device 
> *device)
> 
>  static void vmbus_reserve_fb(void)
>  {
> - int size;
> + resource_size_t start = 0, size;
> + struct pci_dev *pdev;
> +
> + if (efi_enabled(EFI_BOOT)) {
> + /* Gen2 VM: get FB base from EFI framebuffer */
> + start = screen_info.lfb_base;
> + size = max_t(__u32, screen_info.lfb_size, 0x800000);
> + } else {
> + /* Gen1 VM: get FB base from PCI */
> + pdev = pci_get_device(PCI_VENDOR_ID_MICROSOFT,
> +   PCI_DEVICE_ID_HYPERV_VIDEO, NULL);
> + if (!pdev)
> + return;
> +
> + if (!(pdev->resource[0].flags & IORESOURCE_MEM))
> + return;

Doesn't this error exit need a pci_dev_put(pdev)?  Or maybe reverse
the test like this, and the later check for !start will do the error exit.

if (pdev->resource[0].flags & IORESOURCE_MEM) {
start = pci_resource_start(pdev, 0);
size = pci_resource_len(pdev, 0);
}

> +
> + start = pci_resource_start(pdev, 0);
> + size = pci_resource_len(pdev, 0);
> +
> + /*
> +  * Release the PCI device so hyperv_drm or hyperv_fb driver can
> +  * grab it later.
> +  */
> + pci_dev_put(pdev);
> + }
> +
> + if (!start)
> + return;
> +
>   /*
>* Make a claim for the frame buffer in the resource tree under the
>* first node, which will be the one below 4GB.  The length seems to
>* be underreported, particularly in a Generation 1 VM.  So start out
>* reserving a larger area and make it smaller until it succeeds.
>*/
> -
> - if (screen_info.lfb_base) {
> - if (efi_enabled(EFI_BOOT))
> - size = max_t(__u32, screen_info.lfb_size, 0x800000);
> - else
> - size = max_t(__u32, screen_info.lfb_size, 0x4000000);
> -
> - for (; !fb_mmio && (size >= 0x100000); size >>= 1) {
> - fb_mmio = __request_region(hyperv_mmio,
> -screen_info.lfb_base, size,
> -fb_mmio_name, 0);
> - }
> - }
> + for (; !fb_mmio && (size >= 0x100000); size >>= 1)
> + fb_mmio = __request_region(hyperv_mmio, start, size, fb_mmio_name, 0);
> }
> 
>  /**
> --
> 2.37.1
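
The reference-count concern raised in the review above (an early return that skips pci_dev_put()) can be shown with a small user-space model. This is a sketch only: `struct fake_dev`, `dev_get()` and `dev_put()` are hypothetical stand-ins for `struct pci_dev`, `pci_get_device()` and `pci_dev_put()`, and the counter is a plain int rather than a real kernel refcount.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical refcounted device; stands in for struct pci_dev. */
struct fake_dev {
    int refcount;
    bool bar0_is_mem;
    uint64_t bar0_start;
};

/* Take a reference (models the lookup returning a held device). */
static struct fake_dev *dev_get(struct fake_dev *d)
{
    if (d)
        d->refcount++;
    return d;
}

/* Drop a reference. */
static void dev_put(struct fake_dev *d)
{
    if (d)
        d->refcount--;
}

/* v1 shape: the early return on a non-memory BAR never drops the reference. */
static uint64_t fb_start_leaky(struct fake_dev *d)
{
    struct fake_dev *pdev = dev_get(d);
    uint64_t start;

    if (!pdev)
        return 0;
    if (!pdev->bar0_is_mem)
        return 0; /* leak: dev_put() is skipped on this path */

    start = pdev->bar0_start;
    dev_put(pdev);
    return start;
}

/* Suggested shape: invert the test so every path reaches dev_put(). */
static uint64_t fb_start_balanced(struct fake_dev *d)
{
    struct fake_dev *pdev = dev_get(d);
    uint64_t start = 0;

    if (!pdev)
        return 0;
    if (pdev->bar0_is_mem)
        start = pdev->bar0_start;

    dev_put(pdev);
    return start; /* 0 means "no framebuffer found"; the caller bails out */
}
```

Calling `fb_start_leaky()` on a device whose BAR 0 is not memory leaves the count at 1, while `fb_start_balanced()` leaves it at 0 in every case.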


RE: [PATCH v1 2/4] drm/hyperv: Don't forget to put PCI device when removing conflicting FB fails

2022-08-22 Thread Michael Kelley (LINUX)
From: Vitaly Kuznetsov  Sent: Thursday, August 18, 2022 7:25 AM
> 
> When drm_aperture_remove_conflicting_pci_framebuffers() fails, 'pdev'
> needs to be released with pci_dev_put().
> 
> Fixes: 76c56a5affeb ("drm/hyperv: Add DRM driver for hyperv synthetic video 
> device")
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  drivers/gpu/drm/hyperv/hyperv_drm_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> index 46f6c454b820..ca4e517b95ca 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> @@ -82,7 +82,7 @@ static int hyperv_setup_gen1(struct hyperv_drm_device *hv)
>   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev,
> &hyperv_driver);
>   if (ret) {
>   drm_err(dev, "Not able to remove boot fb\n");
> - return ret;
> + goto error;
>   }
> 
>   if (pci_request_region(pdev, 0, DRIVER_NAME) != 0)
> --
> 2.37.1

This patch appears to be obsoleted by commit a0ab5abced55
that was merged into 6.0-rc1.  Of course, it does beg the question of
why the original function hyperv_setup_gen2(), which is now renamed
to hyperv_setup_vram(), doesn't check the return value from
drm_aperture_remove_conflicting_framebuffers().


Michael



RE: [PATCH v1 1/4] Drivers: hv: Move legacy Hyper-V PCI video device's ids to linux/hyperv.h

2022-08-22 Thread Michael Kelley (LINUX)
From: Vitaly Kuznetsov  Sent: Thursday, August 18, 2022 7:25 AM
> 
> There are already two places in kernel with PCI_VENDOR_ID_MICROSOFT/
> PCI_DEVICE_ID_HYPERV_VIDEO and there's a need to use these from core
> Vmbus code. Move the defines to a common header.
> 
> No functional change.
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  drivers/gpu/drm/hyperv/hyperv_drm_drv.c | 3 ---
>  drivers/video/fbdev/hyperv_fb.c | 4 
>  include/linux/hyperv.h  | 4 
>  3 files changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> index 4a8941fa0815..46f6c454b820 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> @@ -23,9 +23,6 @@
>  #define DRIVER_MAJOR 1
>  #define DRIVER_MINOR 0
> 
> -#define PCI_VENDOR_ID_MICROSOFT 0x1414
> -#define PCI_DEVICE_ID_HYPERV_VIDEO 0x5353
> -
>  DEFINE_DRM_GEM_FOPS(hv_fops);
> 
>  static struct drm_driver hyperv_driver = {
> diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
> index 886c564787f1..b58b445bb529 100644
> --- a/drivers/video/fbdev/hyperv_fb.c
> +++ b/drivers/video/fbdev/hyperv_fb.c
> @@ -74,10 +74,6 @@
>  #define SYNTHVID_DEPTH_WIN8 32
>  #define SYNTHVID_FB_SIZE_WIN8 (8 * 1024 * 1024)
> 
> -#define PCI_VENDOR_ID_MICROSOFT 0x1414
> -#define PCI_DEVICE_ID_HYPERV_VIDEO 0x5353
> -
> -
>  enum pipe_msg_type {
>   PIPE_MSG_INVALID,
>   PIPE_MSG_DATA,
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 3b42264333ef..4bb39a8f1af7 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -1516,6 +1516,10 @@ void vmbus_free_mmio(resource_size_t start,
> resource_size_t size);
>   .guid = GUID_INIT(0xc376c1c3, 0xd276, 0x48d2, 0x90, 0xa9, \
> 0xc0, 0x47, 0x48, 0x07, 0x2c, 0x60)
> 
> +/* Legacy Hyper-V PCI video device */
> +#define PCI_VENDOR_ID_MICROSOFT 0x1414
> +#define PCI_DEVICE_ID_HYPERV_VIDEO 0x5353

I've never looked at this before, but shouldn't these move to
include/linux/pci_ids.h with all the others?  And we've got
another #define of PCI_VENDOR_ID_MICROSOFT in
drivers/net/ethernet/microsoft/mana/gdma_main.c that
could be deleted.

Michael

> +
>  /*
>   * Common header for Hyper-V ICs
>   */
> --
> 2.37.1


RE: [PATCH] drm/hyperv: Fix an error handling path in hyperv_vmbus_probe()

2022-08-05 Thread Michael Kelley (LINUX)
From: Christophe JAILLET  Sent: Sunday, July 31, 2022 1:02 PM
> 
> hyperv_setup_vram() calls vmbus_allocate_mmio().
> This must be undone in the error handling path of the probe, as already
> done in the remove function.
> 
> This patch depends on commit a0ab5abced55 ("drm/hyperv : Removing the
> restruction of VRAM allocation with PCI bar size").
> Without it, something like what is done in commit e048834c209a
> ("drm/hyperv: Fix device removal on Gen1 VMs") should be done.

Should the above paragraph be below the '---' as a comment, rather than
part of the commit message?  It's more about staging instructions than a
long-term record of the actual functional/code change.

> 
> Fixes: 76c56a5affeb ("drm/hyperv: Add DRM driver for hyperv synthetic video 
> device")

I wonder if the Fixes: dependency should be on a0ab5abced55.  As you noted,
this patch won't apply cleanly on stable kernel versions that lack that commit,
so we'll need a separate patch for stable if we want to make the fix there.

> Signed-off-by: Christophe JAILLET 

All that said, the fix looks good, so

Reviewed-by: Michael Kelley 

> ---
>  drivers/gpu/drm/hyperv/hyperv_drm_drv.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> index 6d11e7938c83..fc8b4e045f5d 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
> @@ -133,7 +133,6 @@ static int hyperv_vmbus_probe(struct hv_device *hdev,
>   }
> 
>   ret = hyperv_setup_vram(hv, hdev);
> -
>   if (ret)
>   goto err_vmbus_close;
> 
> @@ -150,18 +149,20 @@ static int hyperv_vmbus_probe(struct hv_device *hdev,
> 
>   ret = hyperv_mode_config_init(hv);
>   if (ret)
> - goto err_vmbus_close;
> + goto err_free_mmio;
> 
>   ret = drm_dev_register(dev, 0);
>   if (ret) {
>   drm_err(dev, "Failed to register drm driver.\n");
> - goto err_vmbus_close;
> + goto err_free_mmio;
>   }
> 
>   drm_fbdev_generic_setup(dev, 0);
> 
>   return 0;
> 
> +err_free_mmio:
> + vmbus_free_mmio(hv->mem->start, hv->fb_size);
>  err_vmbus_close:
>   vmbus_close(hdev->channel);
>  err_hv_set_drv_data:
> --
> 2.34.1
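
The label ordering in the patch above follows the usual goto-unwind pattern: each label releases one resource and falls through to the labels for resources acquired earlier. A minimal user-space sketch, assuming a hypothetical `struct probe_state` and a `fail_step` parameter that exists only to exercise the error paths:

```c
#include <stdbool.h>

/* Hypothetical record of which resources the probe currently holds. */
struct probe_state {
    bool channel_open;
    bool mmio_allocated;
};

/*
 * Acquire resources in order. On failure, jump to the label that releases
 * everything acquired so far, in reverse order of acquisition.
 * 'fail_step' selects which step fails (0 = none).
 */
static int probe(struct probe_state *st, int fail_step)
{
    if (fail_step == 1)
        goto err_out;       /* opening the channel failed */
    st->channel_open = true;

    if (fail_step == 2)
        goto err_close;     /* MMIO allocation failed, nothing to free */
    st->mmio_allocated = true;

    if (fail_step == 3)
        goto err_free_mmio; /* a later step failed: MMIO must be freed */

    return 0;

err_free_mmio:
    st->mmio_allocated = false;
err_close:
    st->channel_open = false;
err_out:
    return -1;
}
```

Before the fix, the late failures jumped straight to the equivalent of `err_close`, which in this model would leave `mmio_allocated` set after a failed probe.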



RE: [PATCH] fbdev: Fix order of arguments to aperture_remove_conflicting_devices()

2022-07-21 Thread Michael Kelley (LINUX)
From: Thomas Zimmermann  Sent: Thursday, July 21, 2022 1:17 AM
> 
> Reverse the order of the final two arguments when calling
> aperture_remove_conflicting_devices(). An error report is available
> at [1].
> 
> Reported-by: kernel test robot 
> Signed-off-by: Thomas Zimmermann 
> Fixes: 8d69d008f44c ("fbdev: Convert drivers to aperture helpers")
> Cc: Thomas Zimmermann 
> Cc: Javier Martinez Canillas 
> Cc: Sudip Mukherjee 
> Cc: Teddy Wang 
> Cc: Benjamin Herrenschmidt 
> Cc: "K. Y. Srinivasan" 
> Cc: Haiyang Zhang 
> Cc: Stephen Hemminger 
> Cc: Wei Liu 
> Cc: Dexuan Cui 
> Cc: linux-fb...@vger.kernel.org
> Cc: linux-hyp...@vger.kernel.org
> Link: https://lore.kernel.org/lkml/202207202040.js1wctzn-...@intel.com/
> ---
>  drivers/video/fbdev/aty/radeon_base.c | 2 +-
>  drivers/video/fbdev/hyperv_fb.c   | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/video/fbdev/aty/radeon_base.c
> b/drivers/video/fbdev/aty/radeon_base.c
> index e5e362b8c9da..0a8199985d52 100644
> --- a/drivers/video/fbdev/aty/radeon_base.c
> +++ b/drivers/video/fbdev/aty/radeon_base.c
> @@ -2243,7 +2243,7 @@ static int radeon_kick_out_firmware_fb(struct pci_dev
> *pdev)
>   resource_size_t base = pci_resource_start(pdev, 0);
>   resource_size_t size = pci_resource_len(pdev, 0);
> 
> - return aperture_remove_conflicting_devices(base, size, KBUILD_MODNAME,
> false);
> + return aperture_remove_conflicting_devices(base, size, false,
> KBUILD_MODNAME);
>  }
> 
>  static int radeonfb_pci_register(struct pci_dev *pdev,
> diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
> index a944a6620527..a0e1d70b90d7 100644
> --- a/drivers/video/fbdev/hyperv_fb.c
> +++ b/drivers/video/fbdev/hyperv_fb.c
> @@ -1077,7 +1077,7 @@ static int hvfb_getmem(struct hv_device *hdev, struct
> fb_info *info)
>  getmem_done:
>   aperture_remove_conflicting_devices(info->apertures->ranges[0].base,
>   info->apertures->ranges[0].size,
> - KBUILD_MODNAME, false);
> + false, KBUILD_MODNAME);
> 
>   if (gen2vm) {
>   /* framebuffer is reallocated, clear screen_info to avoid 
> misuse from
> kexec */
> --
> 2.36.1

For the Hyper-V frame buffer driver:

Reviewed-by: Michael Kelley 


RE: [PATCH 0/4] Remove support for Hyper-V 2008 and 2008R2/Win7

2022-05-08 Thread Michael Kelley (LINUX)
From: Pavel Machek  Sent: Wednesday, May 4, 2022 10:23 AM
> 
> Hi!
> 
> > Linux code for running as a Hyper-V guest includes special cases for the
> > first released versions of Hyper-V: 2008 and 2008R2/Windows 7. These
> > versions were very thinly used for running Linux guests when first
> > released more than 12 years ago, and they are now out of support
> > (except for extended security updates). As initial versions, they
> > lack the performance features needed for effective production usage
> > of Linux guests. In total, there's no need to continue to support
> > the latest Linux kernels on these versions of Hyper-V.
> >
> > Simplify the code for running on Hyper-V by removing the special
> > cases. This includes removing the negotiation of the VMbus protocol
> > versions for 2008 and 2008R2, and the special case code based on
> > those VMbus protocol versions. Changes are in the core VMbus code and
> > several drivers for synthetic VMbus devices.
> 
> > 2008 and 2008R2, so if the broader Linux kernel community surfaces
> > a reason why this clean-up should not be done now, we can wait.
> > But I think we want to eventually stop carrying around this extra
> > baggage, and based on discussions with the Hyper-V team within
> > Microsoft, we're already past the point that it has any value.
> 
> The normal way to do such deprecations is to put printks in first, then hide it
> under a config option no one sets, and wait a year or so to see if anyone complains.
> 

Are there any examples of doing these deprecation steps that you can
point me to?  I did not see anything in the Documentation directory
covering the deprecation process you describe.

I'd also make the case that we are already well down the deprecation
path.  For at least the last 5 years, the public Microsoft documentation
for Linux guests has listed Hyper-V 2012 R2 as the earliest supported
Hyper-V version.  Other current and new Microsoft products aren't
supported on Hyper-V 2008/Win7 either -- the usual Word/Excel/
PowerPoint, etc. fall into this category as well as Windows 10 and Windows
11 as guests.  So for a rare user who might still be using Hyper-V
2008/Win7, there's no reasonable expectation of being able to run
the latest upstream Linux kernel on Hyper-V 2008/Win7.  Other
current software doesn't.

Given that running Linux guests on Hyper-V sort of implicitly
combines Microsoft commercial thinking and Linux open source
thinking about version support, I could see putting the old Hyper-V
version support under a config option that defaults to "no", with a 
deprecation comment, and seeing if that garners any complaints.
But given the broader situation with Hyper-V 2008/Win7, in my
judgment even that is more cautious than we need to be.

Michael

> We can't really remove code that is in use.
> 
> Best regards,
>   Pavel


RE: [PATCH v2] hv: account for packet descriptor in maximum packet size

2022-01-27 Thread Michael Kelley (LINUX)
From: Yanming Liu  Sent: Wednesday, January 19, 2022 12:14 PM
> 
> On Thu, Jan 20, 2022 at 2:12 AM Michael Kelley (LINUX)
>  wrote:
> >
> > From: Wei Liu  Sent: Friday, January 14, 2022 11:13 AM
> > >
> > > On Mon, Jan 10, 2022 at 01:44:19AM +0100, Andrea Parri wrote:
> > > > (Extending Cc: list,)
> > > >
> > > > On Sun, Jan 09, 2022 at 05:55:16PM +0800, Yanming Liu wrote:
> >
> > The VSS driver in hv_snapshot.c allocates a receive buffer of 8 Kbytes
> > and sets max_pkt_size to 8 Kbytes.  But the received messages are
> > all fixed size and small.  I don't know why the driver uses an 8 Kbyte
> > receive buffer instead of 4 Kbytes, but the current settings are
> > more than sufficient.
> >
> 
> Well, I'm not sure. In August 2021 there was a patch changing
> max_pkt_size to 8 KiB for the VSS driver:
> https://lore.kernel.org/linux-hyperv/20210825190217.qh2c6yq5qr3ntum5@liuwe-devbox-debian-v2/T/
> 
> The patch mentioned a 6304-byte VSS message, which is part of the
> reason I tried to address the more "general" problem of potentially
> mismatched buffer sizes.
> 

This is certainly interesting.   The Linux driver is not processing
all those bytes, so I'm not sure what Hyper-V is passing to the
guest.  I'll check with the Hyper-V team to be sure.

Michael


RE: [PATCH 1/1] video: hyperv_fb: Fix validation of screen resolution

2022-01-23 Thread Michael Kelley (LINUX)
From: Wei Liu  Sent: Sunday, January 23, 2022 1:56 PM
> 
> On Sun, Jan 16, 2022 at 09:53:06PM +0000, Haiyang Zhang wrote:
> >
> >
> > > -Original Message-
> > > From: Michael Kelley (LINUX) 
> > > Sent: Sunday, January 16, 2022 2:19 PM
> > > To: KY Srinivasan ; Haiyang Zhang
> ; Stephen
> > > Hemminger ; wei@kernel.org; Wei Hu
> ; Dexuan
> > > Cui ; drawat.fl...@gmail.com; hhei ;
> linux-
> > > ker...@vger.kernel.org; linux-hyp...@vger.kernel.org; linux-
> fb...@vger.kernel.org; dri-
> > > de...@lists.freedesktop.org
> > > Cc: Michael Kelley (LINUX) 
> > > Subject: [PATCH 1/1] video: hyperv_fb: Fix validation of screen resolution
> > >
> > > In the WIN10 version of the Synthetic Video protocol with Hyper-V,
> > > Hyper-V reports a list of supported resolutions as part of the protocol
> > > negotiation. The driver calculates the maximum width and height from
> > > the list of resolutions, and uses those maximums to validate any screen
> > > resolution specified in the video= option on the kernel boot line.
> > >
> > > This method of validation is incorrect. For example, the list of
> > > supported resolutions could contain 1600x1200 and 1920x1080, both of
> > > which fit in an 8 Mbyte frame buffer.  But calculating the max width
> > > and height yields 1920 and 1200, and 1920x1200 resolution does not fit
> > > in an 8 Mbyte frame buffer.  Unfortunately, this resolution is accepted,
> > > causing a kernel fault when the driver accesses memory outside the
> > > frame buffer.
> > >
> > > Instead, validate the specified screen resolution by calculating
> > > its size, and comparing against the frame buffer size.  Delete the
> > > code for calculating the max width and height from the list of
> > > resolutions, since these max values have no use.  Also add the
> > > frame buffer size to the info message to aid in understanding why
> > > a resolution might be rejected.
> > >
> > > Fixes: 67e7cdb4829d ("video: hyperv: hyperv_fb: Obtain screen resolution 
> > > from Hyper-V
> > > host")
> > > Signed-off-by: Michael Kelley 
> [...]
> >
> > Reviewed-by: Haiyang Zhang 
> >
> 
> Applied to hyperv-fixes. Thanks.

This fix got pulled into the fbdev/for-next tree by a new maintainer, Helge Deller.
See
https://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev.git/commit/?h=for-next&id=bcc48f8d980b12e66a3d59dfa1041667db971d86

Michael
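
The arithmetic behind the example in the quoted commit message can be checked with a few lines of C. This is a sketch, not the driver code: `frame_bytes()` and `resolution_fits()` are hypothetical helper names, and a color depth of 32 bits per pixel is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

#define FB_SIZE_BYTES (8u * 1024u * 1024u) /* 8 Mbyte frame buffer */
#define DEPTH_BITS    32u                  /* assumed color depth */

/* Bytes needed for one full frame at the given resolution. */
static uint64_t frame_bytes(uint32_t width, uint32_t height)
{
    return (uint64_t)width * height * (DEPTH_BITS / 8u);
}

/* Validate a resolution by its total size, not by per-axis maximums. */
static bool resolution_fits(uint32_t width, uint32_t height)
{
    return frame_bytes(width, height) <= FB_SIZE_BYTES;
}
```

Under these assumptions 1600x1200 needs 7,680,000 bytes and 1920x1080 needs 8,294,400 bytes, both below the 8,388,608 bytes available, while 1920x1200 needs 9,216,000 bytes and does not fit even though each axis is within the per-axis maximum.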


RE: [PATCH v2] hv: account for packet descriptor in maximum packet size

2022-01-19 Thread Michael Kelley (LINUX)
From: Wei Liu  Sent: Friday, January 14, 2022 11:13 AM
> 
> On Mon, Jan 10, 2022 at 01:44:19AM +0100, Andrea Parri wrote:
> > (Extending Cc: list,)
> >
> > On Sun, Jan 09, 2022 at 05:55:16PM +0800, Yanming Liu wrote:
> > > Commit adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V
> > > out of the ring buffer") introduced a notion of maximum packet size in
> > > vmbus channel and used that size to initialize a buffer holding all
> > > incoming packet along with their vmbus packet header. Currently, some
> > > vmbus drivers set max_pkt_size to the size of their receive buffer
> > > passed to vmbus_recvpacket, however vmbus_open expects this size to also
> > > include vmbus packet header. This leads to corruption of the ring buffer
> > > state when receiving a maximum sized packet.
> > >
> > > Specifically, in hv_balloon I have observed a dm_unballoon_request
> > > message of 4096 bytes being truncated to 4080 bytes. When the driver
> > > tries to read next packet it starts from a wrong read_index, receives
> > > garbage and prints a lot of "Unhandled message: type: " in
> > > dmesg.
> > >
> > > The same mismatch also happens in hv_fcopy, hv_kvp, hv_snapshot,
> > > hv_util, hyperv_drm and hyperv_fb, though bad cases are not observed
> > > yet.
> > >
> > > Allocate the buffer with HV_HYP_PAGE_SIZE more bytes to make room for
> > > the descriptor, assuming the vmbus packet header will never be larger
> > > than HV_HYP_PAGE_SIZE. This is essentially free compared to just adding
> > > 'sizeof(struct vmpacket_descriptor)' because these buffers are all more
> > > than HV_HYP_PAGE_SIZE bytes so kmalloc rounds them up anyway.
> > >
> > > Fixes: adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V 
> > > out of the ring buffer")
> > > Suggested-by: Andrea Parri (Microsoft) 
> > > Signed-off-by: Yanming Liu 
> >
> > Thanks for sorting this out; the patch looks good to me:
> >
> > Reviewed-by: Andrea Parri (Microsoft) 
> >
> 
> Thanks. I will pick this up after 5.17-rc1 is out.
> 
> Wei.

I'm NACK'ing this set of changes.  I've spent some further time investigating,
so let me explain.

I'm good with the overall approach of fixing individual drivers to set the
max_pkt_size to account for the VMbus packet header, as this is an
important aspect that was missed in the original coding.   But interestingly,
all but one of the miscellaneous VMbus drivers allocate significantly more
receive buffer space than is actually needed, and the max_pkt_size matching
that receive buffer size is already bigger than needed.  In all these
cases, there is already plenty of space for the VMbus packet header.

The hv_util.c drivers allocate a receive buffer 4 Kbytes in size, and all
receive only small fixed-size packets:  heartbeat, shutdown, timesync.
I don't think any changes are needed for these drivers because the default
max_pkt_size value of 4 Kbytes is plenty of space even when
accounting for the VMbus packet header.

The VSS driver in hv_snapshot.c allocates a receive buffer of 8 Kbytes
and sets max_pkt_size to 8 Kbytes.  But the received messages are
all fixed size and small.  I don't know why the driver uses an 8 Kbyte
receive buffer instead of 4 Kbytes, but the current settings are
more than sufficient.

The FCOPY driver in hv_fcopy.c allocates a receive buffer of 8 Kbytes
and sets max_pkt_size to 8 Kbytes.  The received messages have
some header overhead plus up to 6 Kbytes of data, so the 8 Kbyte
receive buffer is definitely needed.  And while this one is a little
closer to filling up the available receive space than the previous
ones, there's still plenty of room for the VMbus packet header.  I
don't think any changes are needed.

The KVP driver in hv_kvp.c allocates a receive buffer of 16 Kbytes
and sets max_pkt_size to 16 Kbytes.  From what I can tell, the
received messages max out at close to 4 Kbytes.   Key exchange
messages have 512 bytes of key name and 2048 bytes of key
value, plus some header overhead.   ipaddr_value messages
are the largest, with 3 IP addresses @ 1024 bytes each, plus
a gateway with 512 bytes, and an adapter ID with 128 bytes.
But altogether, that is still less than 4096.  I don't know why
the receive buffer is 16 Kbytes, but it is plenty big and no
changes are needed.
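
As a quick check of the KVP sizes quoted above, the field sizes can be summed in C. This is only a worked sum of the numbers in this message: the enum names are hypothetical and the "some header overhead" is not included.

```c
#include <stddef.h>

/* Field sizes as quoted in the message above; the names are illustrative. */
enum {
    KEY_NAME_BYTES   = 512,
    KEY_VALUE_BYTES  = 2048,
    IP_FIELD_BYTES   = 1024,
    IP_FIELD_COUNT   = 3,
    GATEWAY_BYTES    = 512,
    ADAPTER_ID_BYTES = 128
};

/* Payload of a key exchange message, excluding header overhead. */
static size_t key_exchange_bytes(void)
{
    return KEY_NAME_BYTES + KEY_VALUE_BYTES;
}

/* Payload of an ipaddr_value message, excluding header overhead. */
static size_t ipaddr_value_bytes(void)
{
    return (size_t)IP_FIELD_COUNT * IP_FIELD_BYTES +
           GATEWAY_BYTES + ADAPTER_ID_BYTES;
}
```

The sums come to 2560 and 3712 bytes respectively, which leaves 384 bytes of a 4096-byte budget for headers in the larger case.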

The two frame buffer drivers also use 16 Kbyte receive buffers
and set max_pkt_size to 16 Kbytes.  Again, this looks to be overkill
as the messages received are mostly fixed size.  One message
returns a variable size list of supported screen resolutions, but
each entry in the list is only 4 bytes, and we're looking at a few
tens of resolutions at the very most.  Again, no changes are
needed.

After all this analysis, the balloon driver is the only one that
needs changing.   It uses a 4 Kbyte receive buffer, and indeed
Hyper-V may fill that receive buffer in the case of unballoon
messages.   And that's where the original problem was observed.
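
To make the balloon case concrete, here is a small sketch of the size accounting. It assumes a 16-byte packet header, which matches the 4096 to 4080 byte truncation reported in the quoted patch description; `struct pkt_header` and both helpers are hypothetical names, not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte packet header that precedes every payload. */
struct pkt_header {
    uint16_t type;
    uint16_t offset8;
    uint16_t len8;
    uint16_t flags;
    uint64_t trans_id;
};

/* Payload bytes that fit when the header shares a 'max_pkt_size' buffer. */
static size_t payload_capacity(size_t max_pkt_size)
{
    if (max_pkt_size <= sizeof(struct pkt_header))
        return 0;
    return max_pkt_size - sizeof(struct pkt_header);
}

/* Buffer size needed to hold the header plus 'payload' bytes. */
static size_t required_pkt_size(size_t payload)
{
    return payload + sizeof(struct pkt_header);
}
```

With max_pkt_size set to 4096, only 4080 payload bytes fit; a full 4096-byte message needs 4112 bytes once the header is counted.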

Two other aspects for completeness.  First, all these drivers

Re: [PATCH] component: Move host device to end of device lists on binding

2021-05-13 Thread Russell King - ARM Linux admin
On Mon, May 10, 2021 at 06:05:21PM +0200, Daniel Vetter wrote:
> Entirely aside, but an s/master/aggregate/ or similar over the entire
> component.c codebase would help a pile in making it easier to understand
> which part does what. Or at least I'm always terribly confused about which
> bind binds what and all that, so maybe an additional review whether we
> have a clear split into aggregate and individual components after that
> initial fix is needed.

I'm not entirely sure what you mean by "which bind binds what".

The component helper solves this problem:

We have a master or aggregate device representing a collection of
individual devices. The aggregate and individual devices may be probed
by the device model in any order. The aggregate device is only complete
once all individual and aggregate devices have been successfully probed.

It does this by tracking which devices are present, and only when they
are all present does it call the bind() operation. Conversely, if one
happens to be removed, it calls the unbind() operation. To me, that's
very simple.
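
The tracking described above can be modeled in a few lines of user-space C. This is a toy sketch, not the kernel's component API: `struct aggregate`, `component_added()` and `component_removed()` are hypothetical names, and the bind/unbind operations are reduced to counters.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_COMPONENTS 8

/* Hypothetical aggregate tracker; 'needed' must not exceed MAX_COMPONENTS. */
struct aggregate {
    size_t needed;                /* number of components required */
    bool present[MAX_COMPONENTS]; /* which components have probed */
    bool bound;                   /* has bind() been called? */
    int bind_calls;
    int unbind_calls;
};

static bool all_present(const struct aggregate *a)
{
    for (size_t i = 0; i < a->needed; i++) {
        if (!a->present[i])
            return false;
    }
    return true;
}

/* A component probed: bind the aggregate once the set is complete. */
static void component_added(struct aggregate *a, size_t idx)
{
    if (idx >= a->needed)
        return;
    a->present[idx] = true;
    if (!a->bound && all_present(a)) {
        a->bound = true;
        a->bind_calls++;
    }
}

/* A component went away: unbind first, then forget the component. */
static void component_removed(struct aggregate *a, size_t idx)
{
    if (idx >= a->needed)
        return;
    if (a->bound) {
        a->bound = false;
        a->unbind_calls++;
    }
    a->present[idx] = false;
}
```

The probe order does not matter in this model: bind happens exactly when the last missing component arrives, and removing any one component unbinds the whole aggregate.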

When we start talking about PM, the original idea was for the aggregate
device to handle that. However, DRM/OF has pushed to change the model a
bit such that the aggregate device is created as a platform device when
we detect the presence of one of the individual devices.

I suspect what we actually want is something that, when the first
individual device gets notified of a transition to a lower power mode,
we want to place the system formed by all the devices into a low power
mode. Please realise that it may not be appropriate for every
individual device to be affected by that transition until it receives
its own PM call.

> One question I have: Why is the bridge component driver not correctly
> ordered wrt the i2c driver it needs? The idea is that the aggregate driver
> doesn't access any hw itself, but entirely relies on all its components.

As far as I'm aware, bridge was never converted to use any component
stuff, so I'm not sure what you're referring to.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH] component: Move host device to end of device lists on binding

2021-05-11 Thread Russell King - ARM Linux admin
On Sat, May 08, 2021 at 12:41:18AM -0700, Stephen Boyd wrote:
> Within the component device framework this usually isn't that bad
> because the real driver work is done at bind time via
> component{,master}_ops::bind(). It becomes a problem when the driver
> core, or host driver, wants to operate on the component device outside
> of the bind/unbind functions, e.g. via 'remove' or 'shutdown'. The
> driver core doesn't understand the relationship between the host device
> and the component devices and could possibly try to operate on component
> devices when they're already removed from the system or shut down.

You really are not supposed to be doing anything with component devices
once they have been unbound. You can do stuff with them only between the
bind() and the unbind() callbacks for the host device.

Access to the host devices outside of that is totally undefined and
should not be done.

The shutdown callback should be fine as long as the other devices are
still bound, but there will be implications if the shutdown order
matters.

However, randomly pulling devices around in the DPM list sounds to me
like a very bad idea. What happens if such re-orderings result in a
child device being shut down after a parent device has been shut down?
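
To make the concern concrete, here is a small user-space sketch in C.
It assumes a device list that is walked from tail to head at shutdown,
so that devices added later are shut down first; struct dev_list,
move_last() and shutdown_position() are hypothetical names for this
example, not the kernel's DPM list code.

```c
/* User-space model only; all names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVS 8

struct dev_list {
	const char *name[MAX_DEVS];	/* head at index 0, tail at count - 1 */
	size_t count;
};

static void dev_list_add_tail(struct dev_list *l, const char *name)
{
	assert(l->count < MAX_DEVS);
	l->name[l->count++] = name;
}

/* Move an existing entry to the tail, leaving everything else in order. */
static void move_last(struct dev_list *l, const char *name)
{
	for (size_t i = 0; i < l->count; i++) {
		if (strcmp(l->name[i], name) != 0)
			continue;
		const char *moved = l->name[i];
		memmove(&l->name[i], &l->name[i + 1],
			(l->count - i - 1) * sizeof(l->name[0]));
		l->name[l->count - 1] = moved;
		return;
	}
}

/*
 * Position in the shutdown sequence when walking tail to head:
 * 0 means "shut down first". Returns -1 if the device is not listed.
 */
static int shutdown_position(const struct dev_list *l, const char *name)
{
	for (size_t i = l->count; i-- > 0; )
		if (strcmp(l->name[i], name) == 0)
			return (int)(l->count - 1 - i);
	return -1;
}
```

With entries added as parent, child, component, the child is shut down
before the parent. After move_last() on the parent alone, the parent is
shut down first and the child only afterwards, which is exactly the
ordering the question above is worried about.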

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [GIT PULL] immutable branch for amba changes targeting v5.12-rc1

2021-02-04 Thread Russell King - ARM Linux admin
On Thu, Feb 04, 2021 at 05:56:50PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Feb 04, 2021 at 04:52:24PM +, Russell King - ARM Linux admin wrote:
> > On Tue, Feb 02, 2021 at 03:06:05PM +0100, Greg Kroah-Hartman wrote:
> > > I'm glad to take this through my char/misc tree, as that's where the
> > > other coresight changes flow through.  So if no one else objects, I will
> > > do so...
> > 
> > Greg, did you end up pulling this after all? If not, Uwe produced a v2.
> > I haven't merged v2 yet as I don't know what you've done.
> 
> I thought you merged this?

I took v1, and put it in a branch I've promised in the past not to
rebase/rewind. Uwe is now asking for me to take a v2 or apply a patch
on top.

The only reason to produce an "immutable" branch is if it's the basis
for some dependent work and you need that branch merged into other
people's trees... so the whole "let's produce a v2" is a really odd
workflow... I'm confused about what I should do, and who has to be
informed which option I take.

I'm rather lost here too.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [GIT PULL] immutable branch for amba changes targeting v5.12-rc1

2021-02-04 Thread Russell King - ARM Linux admin
On Tue, Feb 02, 2021 at 03:06:05PM +0100, Greg Kroah-Hartman wrote:
> I'm glad to take this through my char/misc tree, as that's where the
> other coresight changes flow through.  So if no one else objects, I will
> do so...

Greg, did you end up pulling this after all? If not, Uwe produced a v2.
I haven't merged v2 yet as I don't know what you've done.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH v3 0/5] amba: minor fix and various cleanups

2021-02-02 Thread Russell King - ARM Linux admin
On Tue, Jan 26, 2021 at 05:58:30PM +0100, Uwe Kleine-König wrote:
> From: Uwe Kleine-König  
> Hello,
> 
> Changes since v2 sent with Message-Id:
> 20201124133139.3072124-1-...@kleine-koenig.org:
> 
>  - Rebase to v5.11-rc1 (which resulted in a few conflicts in
>    drivers/hwtracing).
>  - Add various Acks.
>  - Send to more maintainers directly (which I think is one of the
>    reasons why there are so few Acks).
> 
> For my taste patch 4 needs some more acks (drivers/char/hw_random,
> drivers/dma, drivers/gpu/drm/pl111, drivers/i2c, drivers/mmc,
> drivers/vfio, drivers/watchdog and sound/arm have no maintainer feedback
> yet).
> 
> My suggestion is to let this series go in via Russell King (who cares
> for amba). Once enough Acks are there I can also provide a tag for
> merging into different trees. Just tell me if you prefer this solution.
> 
> Would be great if this could make it for v5.12, but I'm aware it's
> already late in the v5.11 cycle so it might have to wait for v5.13.

I think you need to have a 6th patch which moves the
probe/remove/shutdown methods into the bus_type - if you're setting
them for every struct device_driver, then there's no point doing that
and they may as well be in the bus_type.

Apart from that, it looks good.
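
To illustrate the bus_type point, here is a small user-space model in
C. It assumes a dispatch rule in which a bus-level probe hook takes
precedence over a per-driver one; all names (model_bus, model_driver,
model_call_probe) are hypothetical, and this is neither the driver core
nor the amba code.

```c
/* User-space model only; all names are hypothetical. */
#include <stddef.h>

struct model_device;

struct model_driver {
	int (*probe)(struct model_device *dev);	/* optional per-driver hook */
};

struct model_bus {
	int (*probe)(struct model_device *dev);	/* optional bus-wide hook */
};

struct model_device {
	const struct model_bus *bus;
	const struct model_driver *drv;
	int probed_via_bus;	/* set by the bus-wide hook, for inspection */
};

/* Prefer the bus hook; fall back to the driver hook if the bus has none. */
static int model_call_probe(struct model_device *dev)
{
	if (dev->bus->probe)
		return dev->bus->probe(dev);
	if (dev->drv->probe)
		return dev->drv->probe(dev);
	return 0;
}

/* One shared implementation, set once on the bus instead of per driver. */
static int model_bus_probe(struct model_device *dev)
{
	dev->probed_via_bus = 1;
	return 0;
}
```

If every driver on the bus would be given the same hook anyway, setting
it once in the bus structure reaches the same function without the
per-driver assignment, and the driver's own pointer can stay NULL.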

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH v3 4/5] amba: Make the remove callback return void

2021-01-26 Thread Russell King - ARM Linux admin
On Tue, Jan 26, 2021 at 06:56:52PM +0100, Uwe Kleine-König wrote:
> I'm surprised to see that the remove callback introduced in 2952ecf5df33
> ("coresight: etm4x: Refactor probing routine") has an __exit annotation.

In general, remove callbacks should not have an __exit annotation.
__exit _can_ be discarded at link time for built-in stuff.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH] drm: bridge: dw-hdmi: Avoid resetting force in the detect function

2020-11-08 Thread Russell King - ARM Linux admin
On Sun, Nov 08, 2020 at 10:53:22AM +0100, Sam Ravnborg wrote:
> Russell,
> 
> On Sat, Oct 31, 2020 at 07:17:47PM +1100, Jonathan Liu wrote:
> > It has been observed that resetting force in the detect function can
> > result in the PHY being powered down in response to hot-plug detect
> > being asserted, even when the HDMI connector is forced on.
> > 
> > Enabling debug messages and adding a call to dump_stack() in
> > dw_hdmi_phy_power_off() shows the following in dmesg:
> > [  160.637413] dwhdmi-rockchip ff94.hdmi: EVENT=plugin
> > [  160.637433] dwhdmi-rockchip ff94.hdmi: PHY powered down in 0 iterations
> > 
> > Call trace:
> > dw_hdmi_phy_power_off
> > dw_hdmi_phy_disable
> > dw_hdmi_update_power
> > dw_hdmi_detect
> > dw_hdmi_connector_detect
> > drm_helper_probe_detect_ctx
> > drm_helper_hpd_irq_event
> > dw_hdmi_irq
> > irq_thread_fn
> > irq_thread
> > kthread
> > ret_from_fork
> > 
> > Fixes: 381f05a7a842 ("drm: bridge/dw_hdmi: add connector mode forcing")
> > Signed-off-by: Jonathan Liu 
> 
> you are the original author of this code - any comments on this patch?

No further comments beyond what has already been discussed, and the
long and short of it is it's been so long that I don't remember why
that code was there. Given that, I'm not even in a position to ack
the change. Sorry.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH] gpu/drm/armada: fix unused parameter warning

2020-10-12 Thread Russell King - ARM Linux admin
On Mon, Oct 12, 2020 at 04:57:24AM -0700, Bernard Zhao wrote:
> Functions armada_drm_crtc_atomic_flush &
> armada_drm_crtc_atomic_enable don't use the second parameter.
> So we may get a warning like:
> warning: unused parameter ‘***’ [-Wunused-parameter].
> This change is to fix the compile warning with -Wunused-parameter.

Under what circumstances do we build the kernel with that warning
enabled?

> 
> Signed-off-by: Bernard Zhao 
> ---
>  drivers/gpu/drm/armada/armada_crtc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/armada/armada_crtc.c b/drivers/gpu/drm/armada/armada_crtc.c
> index 38dfaa46d306..fc8b922c3e44 100644
> --- a/drivers/gpu/drm/armada/armada_crtc.c
> +++ b/drivers/gpu/drm/armada/armada_crtc.c
> @@ -427,7 +427,7 @@ static int armada_drm_crtc_atomic_check(struct drm_crtc *crtc,
>  }
>  
>  static void armada_drm_crtc_atomic_begin(struct drm_crtc *crtc,
> -  struct drm_crtc_state *old_crtc_state)
> + struct drm_crtc_state __attribute__((unused)) *old_crtc_state)
>  {
>   struct armada_crtc *dcrtc = drm_to_armada_crtc(crtc);
>  
> @@ -441,7 +441,7 @@ static void armada_drm_crtc_atomic_begin(struct drm_crtc *crtc,
>  }
>  
>  static void armada_drm_crtc_atomic_flush(struct drm_crtc *crtc,
> -  struct drm_crtc_state *old_crtc_state)
> + struct drm_crtc_state __attribute__((unused)) *old_crtc_state)
>  {
>   struct armada_crtc *dcrtc = drm_to_armada_crtc(crtc);
>  
> -- 
> 2.28.0
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

