[PATCH 5.11 117/210] i40e: Fix kernel oops when i40e driver removes VFs

2021-04-12 Thread Greg Kroah-Hartman
From: Eryk Rybak 

[ Upstream commit 347b5650cd158d1d953487cc2bec567af5c5bf96 ]

Fix the cause of a kernel oops when the i40e driver removes VFs.
Add a new __I40E_VFS_RELEASING state to signal that the PF is
releasing its VFs, which makes it possible to exit the VF reset
procedure. Without this patch, the procedure that releases VF
resources can suspend a VF reset. The reset is then retried after
the timeout and works on freed VF memory, causing a kernel oops.

Fixes: d43d60e5eb95 ("i40e: ensure reset occurs when disabling VF")
Signed-off-by: Eryk Rybak 
Signed-off-by: Grzegorz Szczurek 
Reviewed-by: Aleksandr Loktionov 
Tested-by: Konrad Jankowski 
Signed-off-by: Tony Nguyen 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/intel/i40e/i40e.h | 1 +
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 9 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 118473dfdcbd..fe1258778cbc 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -142,6 +142,7 @@ enum i40e_state_t {
__I40E_VIRTCHNL_OP_PENDING,
__I40E_RECOVERY_MODE,
__I40E_VF_RESETS_DISABLED,  /* disable resets during i40e_remove */
+   __I40E_VFS_RELEASING,
/* This must be last as it determines the size of the BITMAP */
__I40E_STATE_SIZE__,
 };
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 1b6ec9be155a..5d301a466f5c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -137,6 +137,7 @@ void i40e_vc_notify_vf_reset(struct i40e_vf *vf)
  **/
 static inline void i40e_vc_disable_vf(struct i40e_vf *vf)
 {
+   struct i40e_pf *pf = vf->pf;
int i;
 
i40e_vc_notify_vf_reset(vf);
@@ -147,6 +148,11 @@ static inline void i40e_vc_disable_vf(struct i40e_vf *vf)
 * ensure a reset.
 */
for (i = 0; i < 20; i++) {
+   /* If PF is in VFs releasing state reset VF is impossible,
+* so leave it.
+*/
+   if (test_bit(__I40E_VFS_RELEASING, pf->state))
+   return;
if (i40e_reset_vf(vf, false))
return;
usleep_range(10000, 20000);
@@ -1574,6 +1580,8 @@ void i40e_free_vfs(struct i40e_pf *pf)
 
if (!pf->vf)
return;
+
+   set_bit(__I40E_VFS_RELEASING, pf->state);
while (test_and_set_bit(__I40E_VF_DISABLE, pf->state))
usleep_range(1000, 2000);
 
@@ -1631,6 +1639,7 @@ void i40e_free_vfs(struct i40e_pf *pf)
}
}
clear_bit(__I40E_VF_DISABLE, pf->state);
+   clear_bit(__I40E_VFS_RELEASING, pf->state);
 }
 
 #ifdef CONFIG_PCI_IOV
-- 
2.30.2
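
The early exit added by this patch can be modelled in user space. The
following is a minimal sketch, not driver code: it assumes C11 atomics in
place of the kernel's set_bit()/test_bit(), and every name in it
(struct pf_model, try_reset_vf, release_after, vf_demo) is hypothetical.
A reset that never succeeds stands in for a VF reset that is being held
off by the release path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PF state bitmap; only one bit is modelled. */
enum { VFS_RELEASING_BIT = 0 };

struct pf_model {
	atomic_ulong state;   /* bitmap, one bit per state flag */
	int reset_attempts;   /* number of times try_reset_vf() ran */
	int release_after;    /* teardown starts after this many attempts; 0 = never */
};

bool pf_test_bit(struct pf_model *pf, int bit)
{
	return (atomic_load(&pf->state) & (1UL << bit)) != 0;
}

void pf_set_bit(struct pf_model *pf, int bit)
{
	atomic_fetch_or(&pf->state, 1UL << bit);
}

/* Simulated reset: always reports failure, and flags teardown when asked to. */
bool try_reset_vf(struct pf_model *pf)
{
	pf->reset_attempts++;
	if (pf->reset_attempts == pf->release_after)
		pf_set_bit(pf, VFS_RELEASING_BIT);
	return false;
}

/* Retry loop with the early exit; returns the number of reset attempts made. */
int disable_vf(struct pf_model *pf)
{
	for (int i = 0; i < 20; i++) {
		if (pf_test_bit(pf, VFS_RELEASING_BIT))
			break;  /* VF memory may be going away: stop retrying */
		if (try_reset_vf(pf))
			break;  /* reset succeeded */
	}
	return pf->reset_attempts;
}

/* Self-check: returns 0 when both scenarios behave as described. */
int vf_demo(void)
{
	struct pf_model teardown = { .release_after = 3 };
	struct pf_model no_teardown = { .release_after = 0 };

	assert(disable_vf(&teardown) == 3);      /* stops once the bit is set */
	assert(disable_vf(&no_teardown) == 20);  /* otherwise uses every retry */
	return 0;
}
```

Calling vf_demo() from a small main() exercises both cases. The point of
the design is that the check sits inside the loop, before each retry, so
a release that starts between two attempts is noticed before the next
attempt touches the VF.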





[PATCH 5.10 107/188] i40e: Fix kernel oops when i40e driver removes VFs

2021-04-12 Thread Greg Kroah-Hartman
From: Eryk Rybak 

[ Upstream commit 347b5650cd158d1d953487cc2bec567af5c5bf96 ]

Fix the cause of a kernel oops when the i40e driver removes VFs.
Add a new __I40E_VFS_RELEASING state to signal that the PF is
releasing its VFs, which makes it possible to exit the VF reset
procedure. Without this patch, the procedure that releases VF
resources can suspend a VF reset. The reset is then retried after
the timeout and works on freed VF memory, causing a kernel oops.

Fixes: d43d60e5eb95 ("i40e: ensure reset occurs when disabling VF")
Signed-off-by: Eryk Rybak 
Signed-off-by: Grzegorz Szczurek 
Reviewed-by: Aleksandr Loktionov 
Tested-by: Konrad Jankowski 
Signed-off-by: Tony Nguyen 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/intel/i40e/i40e.h | 1 +
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 9 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 118473dfdcbd..fe1258778cbc 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -142,6 +142,7 @@ enum i40e_state_t {
__I40E_VIRTCHNL_OP_PENDING,
__I40E_RECOVERY_MODE,
__I40E_VF_RESETS_DISABLED,  /* disable resets during i40e_remove */
+   __I40E_VFS_RELEASING,
/* This must be last as it determines the size of the BITMAP */
__I40E_STATE_SIZE__,
 };
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 3b269c70dcfe..e4f13a49c3df 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -137,6 +137,7 @@ void i40e_vc_notify_vf_reset(struct i40e_vf *vf)
  **/
 static inline void i40e_vc_disable_vf(struct i40e_vf *vf)
 {
+   struct i40e_pf *pf = vf->pf;
int i;
 
i40e_vc_notify_vf_reset(vf);
@@ -147,6 +148,11 @@ static inline void i40e_vc_disable_vf(struct i40e_vf *vf)
 * ensure a reset.
 */
for (i = 0; i < 20; i++) {
+   /* If PF is in VFs releasing state reset VF is impossible,
+* so leave it.
+*/
+   if (test_bit(__I40E_VFS_RELEASING, pf->state))
+   return;
if (i40e_reset_vf(vf, false))
return;
usleep_range(10000, 20000);
@@ -1574,6 +1580,8 @@ void i40e_free_vfs(struct i40e_pf *pf)
 
if (!pf->vf)
return;
+
+   set_bit(__I40E_VFS_RELEASING, pf->state);
while (test_and_set_bit(__I40E_VF_DISABLE, pf->state))
usleep_range(1000, 2000);
 
@@ -1631,6 +1639,7 @@ void i40e_free_vfs(struct i40e_pf *pf)
}
}
clear_bit(__I40E_VF_DISABLE, pf->state);
+   clear_bit(__I40E_VFS_RELEASING, pf->state);
 }
 
 #ifdef CONFIG_PCI_IOV
-- 
2.30.2





[PATCH 5.4 054/111] i40e: Fix kernel oops when i40e driver removes VFs

2021-04-12 Thread Greg Kroah-Hartman
From: Eryk Rybak 

[ Upstream commit 347b5650cd158d1d953487cc2bec567af5c5bf96 ]

Fix the cause of a kernel oops when the i40e driver removes VFs.
Add a new __I40E_VFS_RELEASING state to signal that the PF is
releasing its VFs, which makes it possible to exit the VF reset
procedure. Without this patch, the procedure that releases VF
resources can suspend a VF reset. The reset is then retried after
the timeout and works on freed VF memory, causing a kernel oops.

Fixes: d43d60e5eb95 ("i40e: ensure reset occurs when disabling VF")
Signed-off-by: Eryk Rybak 
Signed-off-by: Grzegorz Szczurek 
Reviewed-by: Aleksandr Loktionov 
Tested-by: Konrad Jankowski 
Signed-off-by: Tony Nguyen 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/intel/i40e/i40e.h | 1 +
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 9 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 678e4190b8a8..e571c6116c4b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -152,6 +152,7 @@ enum i40e_state_t {
__I40E_VIRTCHNL_OP_PENDING,
__I40E_RECOVERY_MODE,
__I40E_VF_RESETS_DISABLED,  /* disable resets during i40e_remove */
+   __I40E_VFS_RELEASING,
/* This must be last as it determines the size of the BITMAP */
__I40E_STATE_SIZE__,
 };
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 5acd599d6b9a..e56107305486 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -137,6 +137,7 @@ void i40e_vc_notify_vf_reset(struct i40e_vf *vf)
  **/
 static inline void i40e_vc_disable_vf(struct i40e_vf *vf)
 {
+   struct i40e_pf *pf = vf->pf;
int i;
 
i40e_vc_notify_vf_reset(vf);
@@ -147,6 +148,11 @@ static inline void i40e_vc_disable_vf(struct i40e_vf *vf)
 * ensure a reset.
 */
for (i = 0; i < 20; i++) {
+   /* If PF is in VFs releasing state reset VF is impossible,
+* so leave it.
+*/
+   if (test_bit(__I40E_VFS_RELEASING, pf->state))
+   return;
if (i40e_reset_vf(vf, false))
return;
usleep_range(10000, 20000);
@@ -1506,6 +1512,8 @@ void i40e_free_vfs(struct i40e_pf *pf)
 
if (!pf->vf)
return;
+
+   set_bit(__I40E_VFS_RELEASING, pf->state);
while (test_and_set_bit(__I40E_VF_DISABLE, pf->state))
usleep_range(1000, 2000);
 
@@ -1563,6 +1571,7 @@ void i40e_free_vfs(struct i40e_pf *pf)
}
}
clear_bit(__I40E_VF_DISABLE, pf->state);
+   clear_bit(__I40E_VFS_RELEASING, pf->state);
 }
 
 #ifdef CONFIG_PCI_IOV
-- 
2.30.2





[PATCH 4.19 34/66] i40e: Fix kernel oops when i40e driver removes VFs

2021-04-12 Thread Greg Kroah-Hartman
From: Eryk Rybak 

[ Upstream commit 347b5650cd158d1d953487cc2bec567af5c5bf96 ]

Fix the cause of a kernel oops when the i40e driver removes VFs.
Add a new __I40E_VFS_RELEASING state to signal that the PF is
releasing its VFs, which makes it possible to exit the VF reset
procedure. Without this patch, the procedure that releases VF
resources can suspend a VF reset. The reset is then retried after
the timeout and works on freed VF memory, causing a kernel oops.

Fixes: d43d60e5eb95 ("i40e: ensure reset occurs when disabling VF")
Signed-off-by: Eryk Rybak 
Signed-off-by: Grzegorz Szczurek 
Reviewed-by: Aleksandr Loktionov 
Tested-by: Konrad Jankowski 
Signed-off-by: Tony Nguyen 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/intel/i40e/i40e.h | 1 +
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 9 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 738acba7a9a3..3c921dfc2056 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -149,6 +149,7 @@ enum i40e_state_t {
__I40E_CLIENT_L2_CHANGE,
__I40E_CLIENT_RESET,
__I40E_VF_RESETS_DISABLED,  /* disable resets during i40e_remove */
+   __I40E_VFS_RELEASING,
/* This must be last as it determines the size of the BITMAP */
__I40E_STATE_SIZE__,
 };
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 5d782148d35f..3c1533c627fd 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -137,6 +137,7 @@ void i40e_vc_notify_vf_reset(struct i40e_vf *vf)
  **/
 static inline void i40e_vc_disable_vf(struct i40e_vf *vf)
 {
+   struct i40e_pf *pf = vf->pf;
int i;
 
i40e_vc_notify_vf_reset(vf);
@@ -147,6 +148,11 @@ static inline void i40e_vc_disable_vf(struct i40e_vf *vf)
 * ensure a reset.
 */
for (i = 0; i < 20; i++) {
+   /* If PF is in VFs releasing state reset VF is impossible,
+* so leave it.
+*/
+   if (test_bit(__I40E_VFS_RELEASING, pf->state))
+   return;
if (i40e_reset_vf(vf, false))
return;
usleep_range(10000, 20000);
@@ -1381,6 +1387,8 @@ void i40e_free_vfs(struct i40e_pf *pf)
 
if (!pf->vf)
return;
+
+   set_bit(__I40E_VFS_RELEASING, pf->state);
while (test_and_set_bit(__I40E_VF_DISABLE, pf->state))
usleep_range(1000, 2000);
 
@@ -1438,6 +1446,7 @@ void i40e_free_vfs(struct i40e_pf *pf)
}
}
clear_bit(__I40E_VF_DISABLE, pf->state);
+   clear_bit(__I40E_VFS_RELEASING, pf->state);
 }
 
 #ifdef CONFIG_PCI_IOV
-- 
2.30.2





[PATCH 4.14 01/18] Bluetooth: fix kernel oops in store_pending_adv_report

2020-10-16 Thread Greg Kroah-Hartman
From: Alain Michaud 

commit a2ec905d1e160a33b2e210e45ad30445ef26ce0e upstream.

Fix kernel oops observed when an ext adv data is larger than 31 bytes.

This can be reproduced by setting up an advertiser with advertisement
larger than 31 bytes.  The issue is not sensitive to the advertisement
content.  In particular, this was reproduced with an advertisement of
229 bytes filled with 'A'.  See stack trace below.

This is fixed by not caching ext_adv, as legacy adv are only cached to
be able to concatenate a scannable adv with its scan response before
sending it up through mgmt.

With ext_adv, this is no longer necessary.

  general protection fault:  [#1] SMP PTI
  CPU: 6 PID: 205 Comm: kworker/u17:0 Not tainted 5.4.0-37-generic #41-Ubuntu
  Hardware name: Dell Inc. XPS 15 7590/0CF6RR, BIOS 1.7.0 05/11/2020
  Workqueue: hci0 hci_rx_work [bluetooth]
  RIP: 0010:hci_bdaddr_list_lookup+0x1e/0x40 [bluetooth]
  Code: ff ff e9 26 ff ff ff 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 07 48 89 e5 
48 39 c7 75 0a eb 24 48 8b 00 48 39 f8 74 1c 44 8b 06 <44> 39 40 10 75 ef 44 0f 
b7 4e 04 66 44 39 48 14 75 e3 38 50 16 75
  RSP: 0018:bc6a40493c70 EFLAGS: 00010286
  RAX: 4141414141414141 RBX: 001b RCX: 
  RDX:  RSI: 9903e76c100f RDI: 9904289d4b28
  RBP: bc6a40493c70 R08: 93570362 R09: 
  R10:  R11: 9904344eae38 R12: 9904289d4000
  R13:  R14: ffa3 R15: 9903e76c100f
  FS: () GS:99043458() knlGS:
  CS: 0010 DS:  ES:  CR0: 80050033
  CR2: 7feed125a000 CR3: 0001b860a003 CR4: 003606e0
  Call Trace:
process_adv_report+0x12e/0x560 [bluetooth]
hci_le_meta_evt+0x7b2/0xba0 [bluetooth]
hci_event_packet+0x1c29/0x2a90 [bluetooth]
hci_rx_work+0x19b/0x360 [bluetooth]
process_one_work+0x1eb/0x3b0
worker_thread+0x4d/0x400
kthread+0x104/0x140

Fixes: c215e9397b00 ("Bluetooth: Process extended ADV report event")
Reported-by: Andy Nguyen 
Reported-by: Linus Torvalds 
Reported-by: Balakrishna Godavarthi 
Signed-off-by: Alain Michaud 
Tested-by: Sonny Sasaka 
Acked-by: Marcel Holtmann 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 net/bluetooth/hci_event.c |   22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -1133,6 +1133,9 @@ static void store_pending_adv_report(str
 {
struct discovery_state *d = &hdev->discovery;
 
+   if (len > HCI_MAX_AD_LENGTH)
+   return;
+
bacpy(&d->last_adv_addr, bdaddr);
d->last_adv_addr_type = bdaddr_type;
d->last_adv_rssi = rssi;
@@ -4779,6 +4782,11 @@ static void process_adv_report(struct hc
return;
}
 
+   if (len > HCI_MAX_AD_LENGTH) {
+   pr_err_ratelimited("legacy adv larger than 31 bytes");
+   return;
+   }
+
/* Find the end of the data in case the report contains padded zero
 * bytes at the end causing an invalid length value.
 *
@@ -4839,7 +4847,7 @@ static void process_adv_report(struct hc
 */
conn = check_pending_le_conn(hdev, bdaddr, bdaddr_type, type,
direct_addr);
-   if (conn && type == LE_ADV_IND) {
+   if (conn && type == LE_ADV_IND && len <= HCI_MAX_AD_LENGTH) {
/* Store report for later inclusion by
 * mgmt_device_connected
 */
@@ -4964,10 +4972,14 @@ static void hci_le_adv_report_evt(struct
struct hci_ev_le_advertising_info *ev = ptr;
s8 rssi;
 
-   rssi = ev->data[ev->length];
-   process_adv_report(hdev, ev->evt_type, &ev->bdaddr,
-  ev->bdaddr_type, NULL, 0, rssi,
-  ev->data, ev->length);
+   if (ev->length <= HCI_MAX_AD_LENGTH) {
+   rssi = ev->data[ev->length];
+   process_adv_report(hdev, ev->evt_type, &ev->bdaddr,
+  ev->bdaddr_type, NULL, 0, rssi,
+  ev->data, ev->length);
+   } else {
+   bt_dev_err(hdev, "Dropping invalid advertising data");
+   }
 
ptr += sizeof(*ev) + ev->length + 1;
}
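
The length checks in this backport follow a common pattern: refuse to copy
a report into a fixed-size buffer when it does not fit. In the trace above
RAX holds 0x41 bytes, i.e. 'A', which is consistent with the oversized
payload having overwritten memory next to such a buffer. Below is a minimal
user-space sketch of the guard. It is an illustration under stated
assumptions, not the kernel code: struct adv_cache, cache_adv_report(),
the guard field and adv_demo() are all hypothetical names, and only the
31-byte limit is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_AD_LENGTH 31	/* legacy advertising data limit, in bytes */

/* Hypothetical cache for the last report; not the kernel's discovery_state. */
struct adv_cache {
	uint8_t data[MAX_AD_LENGTH];
	uint8_t len;
	uint8_t guard;	/* stands in for whatever follows the buffer in memory */
};

/* Returns false, leaving the cache untouched, when the report does not fit. */
bool cache_adv_report(struct adv_cache *c, const uint8_t *data, size_t len)
{
	if (len > MAX_AD_LENGTH)
		return false;

	memcpy(c->data, data, len);
	c->len = (uint8_t)len;
	return true;
}

/* Self-check using the 229-byte payload of 'A' from the commit message. */
int adv_demo(void)
{
	struct adv_cache c = { .guard = 0x5a };
	uint8_t report[229];

	memset(report, 'A', sizeof(report));

	/* Oversized report is dropped and nothing is written. */
	assert(!cache_adv_report(&c, report, sizeof(report)));
	assert(c.len == 0 && c.guard == 0x5a);

	/* A report of exactly 31 bytes still fits. */
	assert(cache_adv_report(&c, report, MAX_AD_LENGTH));
	assert(c.len == MAX_AD_LENGTH && c.data[30] == 'A' && c.guard == 0x5a);
	return 0;
}
```

The check is done on the caller-supplied length before any write, so the
rejected case leaves both the cached length and the neighbouring field
unchanged.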




[PATCH 4.9 04/16] Bluetooth: fix kernel oops in store_pending_adv_report

2020-10-16 Thread Greg Kroah-Hartman
From: Alain Michaud 

commit a2ec905d1e160a33b2e210e45ad30445ef26ce0e upstream.

Fix kernel oops observed when an ext adv data is larger than 31 bytes.

This can be reproduced by setting up an advertiser with advertisement
larger than 31 bytes.  The issue is not sensitive to the advertisement
content.  In particular, this was reproduced with an advertisement of
229 bytes filled with 'A'.  See stack trace below.

This is fixed by not caching ext_adv, as legacy adv are only cached to
be able to concatenate a scannable adv with its scan response before
sending it up through mgmt.

With ext_adv, this is no longer necessary.

  general protection fault:  [#1] SMP PTI
  CPU: 6 PID: 205 Comm: kworker/u17:0 Not tainted 5.4.0-37-generic #41-Ubuntu
  Hardware name: Dell Inc. XPS 15 7590/0CF6RR, BIOS 1.7.0 05/11/2020
  Workqueue: hci0 hci_rx_work [bluetooth]
  RIP: 0010:hci_bdaddr_list_lookup+0x1e/0x40 [bluetooth]
  Code: ff ff e9 26 ff ff ff 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 07 48 89 e5 
48 39 c7 75 0a eb 24 48 8b 00 48 39 f8 74 1c 44 8b 06 <44> 39 40 10 75 ef 44 0f 
b7 4e 04 66 44 39 48 14 75 e3 38 50 16 75
  RSP: 0018:bc6a40493c70 EFLAGS: 00010286
  RAX: 4141414141414141 RBX: 001b RCX: 
  RDX:  RSI: 9903e76c100f RDI: 9904289d4b28
  RBP: bc6a40493c70 R08: 93570362 R09: 
  R10:  R11: 9904344eae38 R12: 9904289d4000
  R13:  R14: ffa3 R15: 9903e76c100f
  FS: () GS:99043458() knlGS:
  CS: 0010 DS:  ES:  CR0: 80050033
  CR2: 7feed125a000 CR3: 0001b860a003 CR4: 003606e0
  Call Trace:
process_adv_report+0x12e/0x560 [bluetooth]
hci_le_meta_evt+0x7b2/0xba0 [bluetooth]
hci_event_packet+0x1c29/0x2a90 [bluetooth]
hci_rx_work+0x19b/0x360 [bluetooth]
process_one_work+0x1eb/0x3b0
worker_thread+0x4d/0x400
kthread+0x104/0x140

Fixes: c215e9397b00 ("Bluetooth: Process extended ADV report event")
Reported-by: Andy Nguyen 
Reported-by: Linus Torvalds 
Reported-by: Balakrishna Godavarthi 
Signed-off-by: Alain Michaud 
Tested-by: Sonny Sasaka 
Acked-by: Marcel Holtmann 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 net/bluetooth/hci_event.c |   22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -1133,6 +1133,9 @@ static void store_pending_adv_report(str
 {
struct discovery_state *d = &hdev->discovery;
 
+   if (len > HCI_MAX_AD_LENGTH)
+   return;
+
bacpy(&d->last_adv_addr, bdaddr);
d->last_adv_addr_type = bdaddr_type;
d->last_adv_rssi = rssi;
@@ -4779,6 +4782,11 @@ static void process_adv_report(struct hc
return;
}
 
+   if (len > HCI_MAX_AD_LENGTH) {
+   pr_err_ratelimited("legacy adv larger than 31 bytes");
+   return;
+   }
+
/* Find the end of the data in case the report contains padded zero
 * bytes at the end causing an invalid length value.
 *
@@ -4839,7 +4847,7 @@ static void process_adv_report(struct hc
 */
conn = check_pending_le_conn(hdev, bdaddr, bdaddr_type, type,
direct_addr);
-   if (conn && type == LE_ADV_IND) {
+   if (conn && type == LE_ADV_IND && len <= HCI_MAX_AD_LENGTH) {
/* Store report for later inclusion by
 * mgmt_device_connected
 */
@@ -4964,10 +4972,14 @@ static void hci_le_adv_report_evt(struct
struct hci_ev_le_advertising_info *ev = ptr;
s8 rssi;
 
-   rssi = ev->data[ev->length];
-   process_adv_report(hdev, ev->evt_type, &ev->bdaddr,
-  ev->bdaddr_type, NULL, 0, rssi,
-  ev->data, ev->length);
+   if (ev->length <= HCI_MAX_AD_LENGTH) {
+   rssi = ev->data[ev->length];
+   process_adv_report(hdev, ev->evt_type, &ev->bdaddr,
+  ev->bdaddr_type, NULL, 0, rssi,
+  ev->data, ev->length);
+   } else {
+   bt_dev_err(hdev, "Dropping invalid advertising data");
+   }
 
ptr += sizeof(*ev) + ev->length + 1;
}




[PATCH 4.4 03/16] Bluetooth: fix kernel oops in store_pending_adv_report

2020-10-16 Thread Greg Kroah-Hartman
From: Alain Michaud 

commit a2ec905d1e160a33b2e210e45ad30445ef26ce0e upstream.

Fix kernel oops observed when an ext adv data is larger than 31 bytes.

This can be reproduced by setting up an advertiser with advertisement
larger than 31 bytes.  The issue is not sensitive to the advertisement
content.  In particular, this was reproduced with an advertisement of
229 bytes filled with 'A'.  See stack trace below.

This is fixed by not caching ext_adv, as legacy adv are only cached to
be able to concatenate a scannable adv with its scan response before
sending it up through mgmt.

With ext_adv, this is no longer necessary.

  general protection fault:  [#1] SMP PTI
  CPU: 6 PID: 205 Comm: kworker/u17:0 Not tainted 5.4.0-37-generic #41-Ubuntu
  Hardware name: Dell Inc. XPS 15 7590/0CF6RR, BIOS 1.7.0 05/11/2020
  Workqueue: hci0 hci_rx_work [bluetooth]
  RIP: 0010:hci_bdaddr_list_lookup+0x1e/0x40 [bluetooth]
  Code: ff ff e9 26 ff ff ff 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 07 48 89 e5 
48 39 c7 75 0a eb 24 48 8b 00 48 39 f8 74 1c 44 8b 06 <44> 39 40 10 75 ef 44 0f 
b7 4e 04 66 44 39 48 14 75 e3 38 50 16 75
  RSP: 0018:bc6a40493c70 EFLAGS: 00010286
  RAX: 4141414141414141 RBX: 001b RCX: 
  RDX:  RSI: 9903e76c100f RDI: 9904289d4b28
  RBP: bc6a40493c70 R08: 93570362 R09: 
  R10:  R11: 9904344eae38 R12: 9904289d4000
  R13:  R14: ffa3 R15: 9903e76c100f
  FS: () GS:99043458() knlGS:
  CS: 0010 DS:  ES:  CR0: 80050033
  CR2: 7feed125a000 CR3: 0001b860a003 CR4: 003606e0
  Call Trace:
process_adv_report+0x12e/0x560 [bluetooth]
hci_le_meta_evt+0x7b2/0xba0 [bluetooth]
hci_event_packet+0x1c29/0x2a90 [bluetooth]
hci_rx_work+0x19b/0x360 [bluetooth]
process_one_work+0x1eb/0x3b0
worker_thread+0x4d/0x400
kthread+0x104/0x140

Fixes: c215e9397b00 ("Bluetooth: Process extended ADV report event")
Reported-by: Andy Nguyen 
Reported-by: Linus Torvalds 
Reported-by: Balakrishna Godavarthi 
Signed-off-by: Alain Michaud 
Tested-by: Sonny Sasaka 
Acked-by: Marcel Holtmann 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 net/bluetooth/hci_event.c |   22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -1133,6 +1133,9 @@ static void store_pending_adv_report(str
 {
struct discovery_state *d = &hdev->discovery;
 
+   if (len > HCI_MAX_AD_LENGTH)
+   return;
+
bacpy(&d->last_adv_addr, bdaddr);
d->last_adv_addr_type = bdaddr_type;
d->last_adv_rssi = rssi;
@@ -4752,6 +4755,11 @@ static void process_adv_report(struct hc
u32 flags;
u8 *ptr, real_len;
 
+   if (len > HCI_MAX_AD_LENGTH) {
+   pr_err_ratelimited("legacy adv larger than 31 bytes");
+   return;
+   }
+
/* Find the end of the data in case the report contains padded zero
 * bytes at the end causing an invalid length value.
 *
@@ -4812,7 +4820,7 @@ static void process_adv_report(struct hc
 */
conn = check_pending_le_conn(hdev, bdaddr, bdaddr_type, type,
direct_addr);
-   if (conn && type == LE_ADV_IND) {
+   if (conn && type == LE_ADV_IND && len <= HCI_MAX_AD_LENGTH) {
/* Store report for later inclusion by
 * mgmt_device_connected
 */
@@ -4937,10 +4945,14 @@ static void hci_le_adv_report_evt(struct
struct hci_ev_le_advertising_info *ev = ptr;
s8 rssi;
 
-   rssi = ev->data[ev->length];
-   process_adv_report(hdev, ev->evt_type, &ev->bdaddr,
-  ev->bdaddr_type, NULL, 0, rssi,
-  ev->data, ev->length);
+   if (ev->length <= HCI_MAX_AD_LENGTH) {
+   rssi = ev->data[ev->length];
+   process_adv_report(hdev, ev->evt_type, &ev->bdaddr,
+  ev->bdaddr_type, NULL, 0, rssi,
+  ev->data, ev->length);
+   } else {
+   bt_dev_err(hdev, "Dropping invalid advertising data");
+   }
 
ptr += sizeof(*ev) + ev->length + 1;
}




[sparc64] kernel OOPS bisected from "lockdep: improve current->(hard|soft)irqs_enabled synchronisation with actual irq state"

2020-09-10 Thread Anatoly Pugachev
Hello!

The following git patch 044d0d6de9f50192f9697583504a382347ee95ca
(linux git master branch) introduced the following kernel OOPS upon
kernel boot on my sparc64 T5-2 ldom (VM):

$ uname -a
Linux ttip 5.9.0-rc2-00011-g044d0d6de9f5 #59 SMP Thu Sep 10 13:07:45
MSK 2020 sparc64 GNU/Linux

(OOPS is from the latest tag, but the same on commit above)
...
rcu: Hierarchical SRCU implementation.
smp: Bringing up secondary CPUs ...
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:4875 check_flags+0x9c/0x2c0
DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4 #36
Call Trace:
[<004727a8>] __warn+0xa8/0x120
[<00472c10>] warn_slowpath_fmt+0x64/0x74
[<004e859c>] check_flags+0x9c/0x2c0
[<00c17ca0>] lock_is_held_type+0x20/0x140
[<005095f4>] rcu_read_lock_sched_held+0x54/0xa0
[<004ed4c0>] lock_acquire+0x120/0x480
[<00c21610>] _raw_spin_lock+0x30/0x60
[<009b9bdc>] p1275_cmd_direct+0x1c/0x60
[<009b9ab0>] prom_startcpu_cpuid+0x30/0x40
[<004427e4>] __cpu_up+0x184/0x3a0
[<00474600>] bringup_cpu+0x20/0x120
[<0047378c>] cpuhp_invoke_callback+0xec/0x340
[<004753d4>] cpu_up+0x154/0x220
[<00475c60>] bringup_nonboot_cpus+0x60/0xa0
[<00fbc338>] smp_init+0x28/0xa0
[<00fad3b4>] kernel_init_freeable+0x18c/0x300
irq event stamp: 5135
hardirqs last  enabled at (5135): [<00c21a28>]
_raw_spin_unlock_irqrestore+0x28/0x60
hardirqs last disabled at (5134): [<00c217e0>]
_raw_spin_lock_irqsave+0x20/0x80
softirqs last  enabled at (1474): [<00c245a0>] __do_softirq+0x4e0/0x560
softirqs last disabled at (1467): [<0042d394>]
do_softirq_own_stack+0x34/0x60
random: get_random_bytes called from __warn+0xc8/0x120 with crng_init=0
---[ end trace 4cf960ae85148e2e ]---
possible reason: unannotated irqs-off.
irq event stamp: 5135
hardirqs last  enabled at (5135): [<00c21a28>]
_raw_spin_unlock_irqrestore+0x28/0x60
hardirqs last disabled at (5134): [<00c217e0>]
_raw_spin_lock_irqsave+0x20/0x80
softirqs last  enabled at (1474): [<00c245a0>] __do_softirq+0x4e0/0x560
softirqs last disabled at (1467): [<0042d394>]
do_softirq_own_stack+0x34/0x60
smp: Brought up 1 node, 32 CPUs
devtmpfs: initialized
...

full boot log in [1], kernel config in [2]

linux-2.6$ git bisect log
git bisect start
# good: [d012a7190fc1fd72ed48911e77ca97ba4521bccd] Linux 5.9-rc2
git bisect good d012a7190fc1fd72ed48911e77ca97ba4521bccd
# bad: [34d4ddd359dbcdf6c5fb3f85a179243d7a1cb7f8] Merge tag
'linux-kselftest-5.9-rc5' of
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
git bisect bad 34d4ddd359dbcdf6c5fb3f85a179243d7a1cb7f8
# bad: [e1d0126ca3a66c284a02b083a42e2b39558002cd] Merge tag
'xfs-5.9-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
git bisect bad e1d0126ca3a66c284a02b083a42e2b39558002cd
# good: [24148d8648e37f8c15bedddfa50d14a31a0582c5] Merge tag
'io_uring-5.9-2020-08-28' of git://git.kernel.dk/linux-block
git bisect good 24148d8648e37f8c15bedddfa50d14a31a0582c5
# bad: [b69bea8a657b681442765b06be92a2607b1bd875] Merge tag
'locking-urgent-2020-08-30' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad b69bea8a657b681442765b06be92a2607b1bd875
# good: [20934c0de13b49a072fb1e0ca79fe0fe0e40eae5] usb: storage: Add
unusual_uas entry for Sony PSZ drives
git bisect good 20934c0de13b49a072fb1e0ca79fe0fe0e40eae5
# good: [c4011283a7d5d64a50991dd3baa9acdf3d49092c] Merge tag
'dma-mapping-5.9-2' of git://git.infradead.org/users/hch/dma-mapping
git bisect good c4011283a7d5d64a50991dd3baa9acdf3d49092c
# good: [8bb5021cc2ee5d5dd129a9f2f5ad2bb76eea297d] Merge tag
'powerpc-5.9-4' of
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
git bisect good 8bb5021cc2ee5d5dd129a9f2f5ad2bb76eea297d
# good: [00b0ed2d4997af6d0a93edef820386951fd66d94] locking/lockdep: Cleanup
git bisect good 00b0ed2d4997af6d0a93edef820386951fd66d94
# bad: [044d0d6de9f50192f9697583504a382347ee95ca] lockdep: Only trace IRQ edges
git bisect bad 044d0d6de9f50192f9697583504a382347ee95ca
# good: [021c109330ebc1f54b546c63a078ea3c31356ecb] arm64: Implement
arch_irqs_disabled()
git bisect good 021c109330ebc1f54b546c63a078ea3c31356ecb
# good: [99dc56feb7932020502d40107a712fa302b32082] mips: Implement
arch_irqs_disabled()
git bisect good 99dc56feb7932020502d40107a712fa302b32082
# first bad commit: [044d0d6de9f50192f9697583504a382347ee95ca]
lockdep: Only trace IRQ edges


1. https://github.com/mator/sparc64-dmesg/blob/master/dmesg-5.9.0-rc4
2. https://github.com/mator/sparc64-dmesg/blob/master/config-5.9.0-rc4.gz


Re: [sparc64] kernel OOPS bisected from "lockdep: improve current->(hard|soft)irqs_enabled synchronisation with actual irq state"

2020-09-10 Thread Anatoly Pugachev
On Thu, Sep 10, 2020 at 4:40 PM  wrote:
>
> On Thu, Sep 10, 2020 at 02:43:13PM +0300, Anatoly Pugachev wrote:
> > Hello!
> >
> > The following git patch 044d0d6de9f50192f9697583504a382347ee95ca
> > (linux git master branch) introduced the following kernel OOPS upon
> > kernel boot on my sparc64 T5-2 ldom (VM):
>
> https://lkml.kernel.org/r/20200908154157.gv1362...@hirez.programming.kicks-ass.net

Peter, thanks!

That fixes the issue for me.


Re: [sparc64] kernel OOPS bisected from "lockdep: improve current->(hard|soft)irqs_enabled synchronisation with actual irq state"

2020-09-10 Thread peterz
On Thu, Sep 10, 2020 at 02:43:13PM +0300, Anatoly Pugachev wrote:
> Hello!
> 
> The following git patch 044d0d6de9f50192f9697583504a382347ee95ca
> (linux git master branch) introduced the following kernel OOPS upon
> kernel boot on my sparc64 T5-2 ldom (VM):

https://lkml.kernel.org/r/20200908154157.gv1362...@hirez.programming.kicks-ass.net


[PATCH 5.4 073/214] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-09-01 Thread Greg Kroah-Hartman
From: Marc Zyngier 

[ Upstream commit 63ef91f24f9bfc70b6446319f6cabfd094481372 ]

Booting a recent kernel on a rk3399-based system (nanopc-t4),
equipped with a recent u-boot and ATF results in an Oops due
to a NULL pointer dereference.

This turns out to be due to the rk3399-dmc driver looking for
an *undocumented* property (rockchip,pmu), and happily using
a NULL pointer when the property isn't there.

Instead, make most of what was brought in with 9173c5ceb035
("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
to TF-A.") conditioned on finding this property in the device-tree,
preventing the driver from exploding.

Cc: sta...@vger.kernel.org
Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters to TF-A.")
Signed-off-by: Marc Zyngier 
Signed-off-by: Chanwoo Choi 
Signed-off-by: Sasha Levin 
---
 drivers/devfreq/rk3399_dmc.c | 42 
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
index 24f04f78285b7..027769e39f9b8 100644
--- a/drivers/devfreq/rk3399_dmc.c
+++ b/drivers/devfreq/rk3399_dmc.c
@@ -95,18 +95,20 @@ static int rk3399_dmcfreq_target(struct device *dev, unsigned long *freq,
 
mutex_lock(&dmcfreq->lock);
 
-   if (target_rate >= dmcfreq->odt_dis_freq)
-   odt_enable = true;
-
-   /*
-* This makes a SMC call to the TF-A to set the DDR PD (power-down)
-* timings and to enable or disable the ODT (on-die termination)
-* resistors.
-*/
-   arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
- dmcfreq->odt_pd_arg1,
- ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
- odt_enable, 0, 0, 0, &res);
+   if (dmcfreq->regmap_pmu) {
+   if (target_rate >= dmcfreq->odt_dis_freq)
+   odt_enable = true;
+
+   /*
+* This makes a SMC call to the TF-A to set the DDR PD
+* (power-down) timings and to enable or disable the
+* ODT (on-die termination) resistors.
+*/
+   arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
+ dmcfreq->odt_pd_arg1,
+ ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
+ odt_enable, 0, 0, 0, &res);
+   }
 
/*
 * If frequency scaling from low to high, adjust voltage first.
@@ -371,13 +373,14 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev)
}
 
node = of_parse_phandle(np, "rockchip,pmu", 0);
-   if (node) {
-   data->regmap_pmu = syscon_node_to_regmap(node);
-   of_node_put(node);
-   if (IS_ERR(data->regmap_pmu)) {
-   ret = PTR_ERR(data->regmap_pmu);
-   goto err_edev;
-   }
+   if (!node)
+   goto no_pmu;
+
+   data->regmap_pmu = syscon_node_to_regmap(node);
+   of_node_put(node);
+   if (IS_ERR(data->regmap_pmu)) {
+   ret = PTR_ERR(data->regmap_pmu);
+   goto err_edev;
}
 
regmap_read(data->regmap_pmu, RK3399_PMUGRF_OS_REG2, &val);
@@ -399,6 +402,7 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev)
goto err_edev;
};
 
+no_pmu:
arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, 0, 0,
  ROCKCHIP_SIP_CONFIG_DRAM_INIT,
  0, 0, 0, 0, &res);
-- 
2.25.1
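
The shape of this fix, treating a device-tree resource as optional and
guarding every later use of it, can be sketched outside the kernel. The
code below is a rough user-space model under stated assumptions, not the
driver: struct pmu_regs, struct dmc_model, dmc_probe(), dmc_target() and
dmc_demo() are hypothetical names, the register decode is illustrative
only, and the frequencies are arbitrary values chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the driver's real types. */
struct pmu_regs {
	unsigned int os_reg2;
};

struct dmc_model {
	struct pmu_regs *pmu;        /* NULL when the optional property is absent */
	unsigned int ddr_type;       /* only meaningful when pmu != NULL */
	unsigned long odt_dis_freq;
	unsigned long rate;
	int odt_calls;               /* counts the PMU-dependent firmware calls */
	bool dram_init_done;
};

/* Probe: the PMU-dependent setup is skipped as a whole when pmu is NULL. */
int dmc_probe(struct dmc_model *d, struct pmu_regs *pmu)
{
	*d = (struct dmc_model){ .odt_dis_freq = 333000000UL };

	if (!pmu)
		goto no_pmu;

	d->pmu = pmu;
	d->ddr_type = pmu->os_reg2 & 0x7;  /* illustrative decode only */

no_pmu:
	d->dram_init_done = true;  /* common initialisation runs either way */
	return 0;
}

/* Frequency change: the ODT step runs only if probe found the PMU. */
void dmc_target(struct dmc_model *d, unsigned long rate)
{
	if (d->pmu) {
		bool odt_enable = rate >= d->odt_dis_freq;

		(void)odt_enable;  /* a real driver would pass this to firmware */
		d->odt_calls++;
	}
	d->rate = rate;
}

/* Self-check: returns 0 when both configurations behave as described. */
int dmc_demo(void)
{
	struct pmu_regs regs = { .os_reg2 = 0x3 };
	struct dmc_model with_pmu, without_pmu;

	assert(dmc_probe(&with_pmu, &regs) == 0);
	assert(dmc_probe(&without_pmu, NULL) == 0);

	dmc_target(&with_pmu, 400000000UL);
	dmc_target(&without_pmu, 400000000UL);

	assert(with_pmu.odt_calls == 1 && with_pmu.ddr_type == 3);
	assert(without_pmu.odt_calls == 0 && without_pmu.rate == 400000000UL);
	assert(without_pmu.dram_init_done);
	return 0;
}
```

Both halves are needed: skipping the setup in probe avoids reading through
a NULL pointer there, and the check in the frequency-change path keeps the
code that depends on that setup from running later.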





[PATCH 5.7 363/393] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-08-17 Thread Greg Kroah-Hartman
From: Marc Zyngier 

commit 63ef91f24f9bfc70b6446319f6cabfd094481372 upstream.

Booting a recent kernel on a rk3399-based system (nanopc-t4),
equipped with a recent u-boot and ATF results in an Oops due
to a NULL pointer dereference.

This turns out to be due to the rk3399-dmc driver looking for
an *undocumented* property (rockchip,pmu), and happily using
a NULL pointer when the property isn't there.

Instead, make most of what was brought in with 9173c5ceb035
("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
to TF-A.") conditioned on finding this property in the device-tree,
preventing the driver from exploding.

Cc: sta...@vger.kernel.org
Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters to TF-A.")
Signed-off-by: Marc Zyngier 
Signed-off-by: Chanwoo Choi 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/devfreq/rk3399_dmc.c |   42 +++---
 1 file changed, 23 insertions(+), 19 deletions(-)

--- a/drivers/devfreq/rk3399_dmc.c
+++ b/drivers/devfreq/rk3399_dmc.c
@@ -95,18 +95,20 @@ static int rk3399_dmcfreq_target(struct
 
mutex_lock(&dmcfreq->lock);
 
-   if (target_rate >= dmcfreq->odt_dis_freq)
-   odt_enable = true;
-
-   /*
-* This makes a SMC call to the TF-A to set the DDR PD (power-down)
-* timings and to enable or disable the ODT (on-die termination)
-* resistors.
-*/
-   arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
- dmcfreq->odt_pd_arg1,
- ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
- odt_enable, 0, 0, 0, &res);
+   if (dmcfreq->regmap_pmu) {
+   if (target_rate >= dmcfreq->odt_dis_freq)
+   odt_enable = true;
+
+   /*
+* This makes a SMC call to the TF-A to set the DDR PD
+* (power-down) timings and to enable or disable the
+* ODT (on-die termination) resistors.
+*/
+   arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
+ dmcfreq->odt_pd_arg1,
+ ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
+ odt_enable, 0, 0, 0, &res);
+   }
 
/*
 * If frequency scaling from low to high, adjust voltage first.
@@ -371,13 +373,14 @@ static int rk3399_dmcfreq_probe(struct p
}
 
node = of_parse_phandle(np, "rockchip,pmu", 0);
-   if (node) {
-   data->regmap_pmu = syscon_node_to_regmap(node);
-   of_node_put(node);
-   if (IS_ERR(data->regmap_pmu)) {
-   ret = PTR_ERR(data->regmap_pmu);
-   goto err_edev;
-   }
+   if (!node)
+   goto no_pmu;
+
+   data->regmap_pmu = syscon_node_to_regmap(node);
+   of_node_put(node);
+   if (IS_ERR(data->regmap_pmu)) {
+   ret = PTR_ERR(data->regmap_pmu);
+   goto err_edev;
}
 
regmap_read(data->regmap_pmu, RK3399_PMUGRF_OS_REG2, &val);
@@ -399,6 +402,7 @@ static int rk3399_dmcfreq_probe(struct p
goto err_edev;
};
 
+no_pmu:
arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, 0, 0,
  ROCKCHIP_SIP_CONFIG_DRAM_INIT,
  0, 0, 0, 0, &res);
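
The pattern in the patch above (treat the lookup result as optional, and skip every code path that depends on it) can be illustrated outside the kernel. The following is only a minimal user-space sketch: struct dmc_data, lookup_pmu(), dmc_probe() and dmc_target() are hypothetical stand-ins invented for illustration, not the driver's real types or functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's private data. */
struct dmc_data {
	const int *pmu;   /* NULL when the optional property is absent */
	int odt_calls;    /* counts how often the PMU-dependent path ran */
};

/* Hypothetical lookup: returns NULL when the property is missing. */
static const int *lookup_pmu(bool present)
{
	static const int fake_regmap = 42;

	return present ? &fake_regmap : NULL;
}

/* Probe: skip all PMU-dependent setup when the lookup fails. */
static int dmc_probe(struct dmc_data *data, bool pmu_present)
{
	data->odt_calls = 0;
	data->pmu = lookup_pmu(pmu_present);
	if (!data->pmu)
		goto no_pmu;

	/* PMU-dependent configuration would be read here. */

no_pmu:
	/* Common initialisation that does not need the PMU. */
	return 0;
}

/* Target: only take the PMU-dependent path when the handle exists. */
static int dmc_target(struct dmc_data *data)
{
	if (data->pmu)
		data->odt_calls++;
	return 0;
}
```

With the property absent, dmc_probe() still succeeds and dmc_target() never touches the missing handle, which is the behaviour the commit message describes.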




[PATCH 5.8 434/464] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-08-17 Thread Greg Kroah-Hartman
From: Marc Zyngier 

commit 63ef91f24f9bfc70b6446319f6cabfd094481372 upstream.

Booting a recent kernel on a rk3399-based system (nanopc-t4),
equipped with a recent u-boot and ATF results in an Oops due
to a NULL pointer dereference.

This turns out to be due to the rk3399-dmc driver looking for
an *undocumented* property (rockchip,pmu), and happily using
a NULL pointer when the property isn't there.

Instead, make most of what was brought in with 9173c5ceb035
("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
to TF-A.") conditioned on finding this property in the device-tree,
preventing the driver from exploding.

Cc: sta...@vger.kernel.org
Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down 
parameters to TF-A.")
Signed-off-by: Marc Zyngier 
Signed-off-by: Chanwoo Choi 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/devfreq/rk3399_dmc.c |   42 +++---
 1 file changed, 23 insertions(+), 19 deletions(-)

--- a/drivers/devfreq/rk3399_dmc.c
+++ b/drivers/devfreq/rk3399_dmc.c
@@ -95,18 +95,20 @@ static int rk3399_dmcfreq_target(struct
 
mutex_lock(&dmcfreq->lock);
 
-   if (target_rate >= dmcfreq->odt_dis_freq)
-   odt_enable = true;
-
-   /*
-* This makes a SMC call to the TF-A to set the DDR PD (power-down)
-* timings and to enable or disable the ODT (on-die termination)
-* resistors.
-*/
-   arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
- dmcfreq->odt_pd_arg1,
- ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
- odt_enable, 0, 0, 0, &res);
+   if (dmcfreq->regmap_pmu) {
+   if (target_rate >= dmcfreq->odt_dis_freq)
+   odt_enable = true;
+
+   /*
+* This makes a SMC call to the TF-A to set the DDR PD
+* (power-down) timings and to enable or disable the
+* ODT (on-die termination) resistors.
+*/
+   arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
+ dmcfreq->odt_pd_arg1,
+ ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
+ odt_enable, 0, 0, 0, &res);
+   }
 
/*
 * If frequency scaling from low to high, adjust voltage first.
@@ -371,13 +373,14 @@ static int rk3399_dmcfreq_probe(struct p
}
 
node = of_parse_phandle(np, "rockchip,pmu", 0);
-   if (node) {
-   data->regmap_pmu = syscon_node_to_regmap(node);
-   of_node_put(node);
-   if (IS_ERR(data->regmap_pmu)) {
-   ret = PTR_ERR(data->regmap_pmu);
-   goto err_edev;
-   }
+   if (!node)
+   goto no_pmu;
+
+   data->regmap_pmu = syscon_node_to_regmap(node);
+   of_node_put(node);
+   if (IS_ERR(data->regmap_pmu)) {
+   ret = PTR_ERR(data->regmap_pmu);
+   goto err_edev;
}
 
regmap_read(data->regmap_pmu, RK3399_PMUGRF_OS_REG2, &val);
@@ -399,6 +402,7 @@ static int rk3399_dmcfreq_probe(struct p
goto err_edev;
};
 
+no_pmu:
arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, 0, 0,
  ROCKCHIP_SIP_CONFIG_DRAM_INIT,
  0, 0, 0, 0, &res);




[PATCH 4.19 43/56] Bluetooth: fix kernel oops in store_pending_adv_report

2020-08-03 Thread Greg Kroah-Hartman
From: Alain Michaud 

[ Upstream commit a2ec905d1e160a33b2e210e45ad30445ef26ce0e ]

Fix kernel oops observed when an ext adv data is larger than 31 bytes.

This can be reproduced by setting up an advertiser with advertisement
larger than 31 bytes.  The issue is not sensitive to the advertisement
content.  In particular, this was reproduced with an advertisement of
229 bytes filled with 'A'.  See stack trace below.

This is fixed by not caching ext_adv, as legacy adv are only cached to
be able to concatenate a scannable adv with its scan response before
sending it up through mgmt.

With ext_adv, this is no longer necessary.

  general protection fault:  [#1] SMP PTI
  CPU: 6 PID: 205 Comm: kworker/u17:0 Not tainted 5.4.0-37-generic #41-Ubuntu
  Hardware name: Dell Inc. XPS 15 7590/0CF6RR, BIOS 1.7.0 05/11/2020
  Workqueue: hci0 hci_rx_work [bluetooth]
  RIP: 0010:hci_bdaddr_list_lookup+0x1e/0x40 [bluetooth]
  Code: ff ff e9 26 ff ff ff 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 07 48 89 e5 
48 39 c7 75 0a eb 24 48 8b 00 48 39 f8 74 1c 44 8b 06 <44> 39 40 10 75 ef 44 0f 
b7 4e 04 66 44 39 48 14 75 e3 38 50 16 75
  RSP: 0018:bc6a40493c70 EFLAGS: 00010286
  RAX: 4141414141414141 RBX: 001b RCX: 
  RDX:  RSI: 9903e76c100f RDI: 9904289d4b28
  RBP: bc6a40493c70 R08: 93570362 R09: 
  R10:  R11: 9904344eae38 R12: 9904289d4000
  R13:  R14: ffa3 R15: 9903e76c100f
  FS: () GS:99043458() knlGS:
  CS: 0010 DS:  ES:  CR0: 80050033
  CR2: 7feed125a000 CR3: 0001b860a003 CR4: 003606e0
  Call Trace:
process_adv_report+0x12e/0x560 [bluetooth]
hci_le_meta_evt+0x7b2/0xba0 [bluetooth]
hci_event_packet+0x1c29/0x2a90 [bluetooth]
hci_rx_work+0x19b/0x360 [bluetooth]
process_one_work+0x1eb/0x3b0
worker_thread+0x4d/0x400
kthread+0x104/0x140

Fixes: c215e9397b00 ("Bluetooth: Process extended ADV report event")
Reported-by: Andy Nguyen 
Reported-by: Linus Torvalds 
Reported-by: Balakrishna Godavarthi 
Signed-off-by: Alain Michaud 
Tested-by: Sonny Sasaka 
Acked-by: Marcel Holtmann 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 net/bluetooth/hci_event.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index a044e6bb12b84..cdb92b129906f 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -1229,6 +1229,9 @@ static void store_pending_adv_report(struct hci_dev 
*hdev, bdaddr_t *bdaddr,
 {
struct discovery_state *d = &hdev->discovery;
 
+   if (len > HCI_MAX_AD_LENGTH)
+   return;
+
bacpy(&d->last_adv_addr, bdaddr);
d->last_adv_addr_type = bdaddr_type;
d->last_adv_rssi = rssi;
@@ -5116,7 +5119,8 @@ static struct hci_conn *check_pending_le_conn(struct 
hci_dev *hdev,
 
 static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr,
   u8 bdaddr_type, bdaddr_t *direct_addr,
-  u8 direct_addr_type, s8 rssi, u8 *data, u8 len)
+  u8 direct_addr_type, s8 rssi, u8 *data, u8 len,
+  bool ext_adv)
 {
struct discovery_state *d = &hdev->discovery;
struct smp_irk *irk;
@@ -5138,6 +5142,11 @@ static void process_adv_report(struct hci_dev *hdev, u8 
type, bdaddr_t *bdaddr,
return;
}
 
+   if (!ext_adv && len > HCI_MAX_AD_LENGTH) {
+   bt_dev_err_ratelimited(hdev, "legacy adv larger than 31 bytes");
+   return;
+   }
+
/* Find the end of the data in case the report contains padded zero
 * bytes at the end causing an invalid length value.
 *
@@ -5197,7 +5206,7 @@ static void process_adv_report(struct hci_dev *hdev, u8 
type, bdaddr_t *bdaddr,
 */
conn = check_pending_le_conn(hdev, bdaddr, bdaddr_type, type,
direct_addr);
-   if (conn && type == LE_ADV_IND) {
+   if (!ext_adv && conn && type == LE_ADV_IND && len <= HCI_MAX_AD_LENGTH) {
/* Store report for later inclusion by
 * mgmt_device_connected
 */
@@ -5251,7 +5260,7 @@ static void process_adv_report(struct hci_dev *hdev, u8 
type, bdaddr_t *bdaddr,
 * event or send an immediate device found event if the data
 * should not be stored for later.
 */
-   if (!has_pending_adv_report(hdev)) {
+   if (!ext_adv && !has_pending_adv_report(hdev)) {
/* If the report will trigger a SCAN_REQ store it for
 * later merging.
 */
@@ -5286,7 +5295,8 @@ static void process
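
The first hunk of the patch above rejects oversized reports before they are copied into the fixed-size pending buffer. A minimal user-space sketch of that length guard follows; struct pending_report, store_pending() and MAX_AD_LENGTH are hypothetical names used only for illustration, with the 31-byte limit taken from the commit text.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_AD_LENGTH 31	/* legacy advertising data limit, per the commit text */

/* Hypothetical stand-in for the cached "pending" advertising report. */
struct pending_report {
	uint8_t data[MAX_AD_LENGTH];
	uint8_t len;
};

/*
 * Store a report for later merging. Returns true if it was stored and
 * false if it was too large for the buffer and therefore ignored.
 */
static bool store_pending(struct pending_report *p, const uint8_t *data,
			  uint8_t len)
{
	if (len > MAX_AD_LENGTH)
		return false;	/* would overflow p->data; leave cache untouched */

	memcpy(p->data, data, len);
	p->len = len;
	return true;
}
```

Without the guard, a 229-byte report like the one in the reproduction would be copied past the end of the 31-byte array and overwrite whatever follows it, which would be consistent with the 0x4141414141414141 value in RAX in the trace.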

[PATCH 5.4 64/90] Bluetooth: fix kernel oops in store_pending_adv_report

2020-08-03 Thread Greg Kroah-Hartman
From: Alain Michaud 

[ Upstream commit a2ec905d1e160a33b2e210e45ad30445ef26ce0e ]

Fix kernel oops observed when an ext adv data is larger than 31 bytes.

This can be reproduced by setting up an advertiser with advertisement
larger than 31 bytes.  The issue is not sensitive to the advertisement
content.  In particular, this was reproduced with an advertisement of
229 bytes filled with 'A'.  See stack trace below.

This is fixed by not caching ext_adv, as legacy adv are only cached to
be able to concatenate a scannable adv with its scan response before
sending it up through mgmt.

With ext_adv, this is no longer necessary.

  general protection fault:  [#1] SMP PTI
  CPU: 6 PID: 205 Comm: kworker/u17:0 Not tainted 5.4.0-37-generic #41-Ubuntu
  Hardware name: Dell Inc. XPS 15 7590/0CF6RR, BIOS 1.7.0 05/11/2020
  Workqueue: hci0 hci_rx_work [bluetooth]
  RIP: 0010:hci_bdaddr_list_lookup+0x1e/0x40 [bluetooth]
  Code: ff ff e9 26 ff ff ff 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 07 48 89 e5 
48 39 c7 75 0a eb 24 48 8b 00 48 39 f8 74 1c 44 8b 06 <44> 39 40 10 75 ef 44 0f 
b7 4e 04 66 44 39 48 14 75 e3 38 50 16 75
  RSP: 0018:bc6a40493c70 EFLAGS: 00010286
  RAX: 4141414141414141 RBX: 001b RCX: 
  RDX:  RSI: 9903e76c100f RDI: 9904289d4b28
  RBP: bc6a40493c70 R08: 93570362 R09: 
  R10:  R11: 9904344eae38 R12: 9904289d4000
  R13:  R14: ffa3 R15: 9903e76c100f
  FS: () GS:99043458() knlGS:
  CS: 0010 DS:  ES:  CR0: 80050033
  CR2: 7feed125a000 CR3: 0001b860a003 CR4: 003606e0
  Call Trace:
process_adv_report+0x12e/0x560 [bluetooth]
hci_le_meta_evt+0x7b2/0xba0 [bluetooth]
hci_event_packet+0x1c29/0x2a90 [bluetooth]
hci_rx_work+0x19b/0x360 [bluetooth]
process_one_work+0x1eb/0x3b0
worker_thread+0x4d/0x400
kthread+0x104/0x140

Fixes: c215e9397b00 ("Bluetooth: Process extended ADV report event")
Reported-by: Andy Nguyen 
Reported-by: Linus Torvalds 
Reported-by: Balakrishna Godavarthi 
Signed-off-by: Alain Michaud 
Tested-by: Sonny Sasaka 
Acked-by: Marcel Holtmann 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 net/bluetooth/hci_event.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index 88cd410e57289..44385252d7b6a 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -1274,6 +1274,9 @@ static void store_pending_adv_report(struct hci_dev 
*hdev, bdaddr_t *bdaddr,
 {
struct discovery_state *d = &hdev->discovery;
 
+   if (len > HCI_MAX_AD_LENGTH)
+   return;
+
bacpy(&d->last_adv_addr, bdaddr);
d->last_adv_addr_type = bdaddr_type;
d->last_adv_rssi = rssi;
@@ -5231,7 +5234,8 @@ static struct hci_conn *check_pending_le_conn(struct 
hci_dev *hdev,
 
 static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr,
   u8 bdaddr_type, bdaddr_t *direct_addr,
-  u8 direct_addr_type, s8 rssi, u8 *data, u8 len)
+  u8 direct_addr_type, s8 rssi, u8 *data, u8 len,
+  bool ext_adv)
 {
struct discovery_state *d = &hdev->discovery;
struct smp_irk *irk;
@@ -5253,6 +5257,11 @@ static void process_adv_report(struct hci_dev *hdev, u8 
type, bdaddr_t *bdaddr,
return;
}
 
+   if (!ext_adv && len > HCI_MAX_AD_LENGTH) {
+   bt_dev_err_ratelimited(hdev, "legacy adv larger than 31 bytes");
+   return;
+   }
+
/* Find the end of the data in case the report contains padded zero
 * bytes at the end causing an invalid length value.
 *
@@ -5312,7 +5321,7 @@ static void process_adv_report(struct hci_dev *hdev, u8 
type, bdaddr_t *bdaddr,
 */
conn = check_pending_le_conn(hdev, bdaddr, bdaddr_type, type,
direct_addr);
-   if (conn && type == LE_ADV_IND) {
+   if (!ext_adv && conn && type == LE_ADV_IND && len <= HCI_MAX_AD_LENGTH) {
/* Store report for later inclusion by
 * mgmt_device_connected
 */
@@ -5366,7 +5375,7 @@ static void process_adv_report(struct hci_dev *hdev, u8 
type, bdaddr_t *bdaddr,
 * event or send an immediate device found event if the data
 * should not be stored for later.
 */
-   if (!has_pending_adv_report(hdev)) {
+   if (!ext_adv && !has_pending_adv_report(hdev)) {
/* If the report will trigger a SCAN_REQ store it for
 * later merging.
 */
@@ -5401,7 +5410,8 @@ static void process

[PATCH 5.7 084/120] Bluetooth: fix kernel oops in store_pending_adv_report

2020-08-03 Thread Greg Kroah-Hartman
From: Alain Michaud 

[ Upstream commit a2ec905d1e160a33b2e210e45ad30445ef26ce0e ]

Fix kernel oops observed when an ext adv data is larger than 31 bytes.

This can be reproduced by setting up an advertiser with advertisement
larger than 31 bytes.  The issue is not sensitive to the advertisement
content.  In particular, this was reproduced with an advertisement of
229 bytes filled with 'A'.  See stack trace below.

This is fixed by not caching ext_adv, as legacy adv are only cached to
be able to concatenate a scannable adv with its scan response before
sending it up through mgmt.

With ext_adv, this is no longer necessary.

  general protection fault:  [#1] SMP PTI
  CPU: 6 PID: 205 Comm: kworker/u17:0 Not tainted 5.4.0-37-generic #41-Ubuntu
  Hardware name: Dell Inc. XPS 15 7590/0CF6RR, BIOS 1.7.0 05/11/2020
  Workqueue: hci0 hci_rx_work [bluetooth]
  RIP: 0010:hci_bdaddr_list_lookup+0x1e/0x40 [bluetooth]
  Code: ff ff e9 26 ff ff ff 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 07 48 89 e5 
48 39 c7 75 0a eb 24 48 8b 00 48 39 f8 74 1c 44 8b 06 <44> 39 40 10 75 ef 44 0f 
b7 4e 04 66 44 39 48 14 75 e3 38 50 16 75
  RSP: 0018:bc6a40493c70 EFLAGS: 00010286
  RAX: 4141414141414141 RBX: 001b RCX: 
  RDX:  RSI: 9903e76c100f RDI: 9904289d4b28
  RBP: bc6a40493c70 R08: 93570362 R09: 
  R10:  R11: 9904344eae38 R12: 9904289d4000
  R13:  R14: ffa3 R15: 9903e76c100f
  FS: () GS:99043458() knlGS:
  CS: 0010 DS:  ES:  CR0: 80050033
  CR2: 7feed125a000 CR3: 0001b860a003 CR4: 003606e0
  Call Trace:
process_adv_report+0x12e/0x560 [bluetooth]
hci_le_meta_evt+0x7b2/0xba0 [bluetooth]
hci_event_packet+0x1c29/0x2a90 [bluetooth]
hci_rx_work+0x19b/0x360 [bluetooth]
process_one_work+0x1eb/0x3b0
worker_thread+0x4d/0x400
kthread+0x104/0x140

Fixes: c215e9397b00 ("Bluetooth: Process extended ADV report event")
Reported-by: Andy Nguyen 
Reported-by: Linus Torvalds 
Reported-by: Balakrishna Godavarthi 
Signed-off-by: Alain Michaud 
Tested-by: Sonny Sasaka 
Acked-by: Marcel Holtmann 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 net/bluetooth/hci_event.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index b11f8d391ad82..fe75f435171ce 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -1305,6 +1305,9 @@ static void store_pending_adv_report(struct hci_dev 
*hdev, bdaddr_t *bdaddr,
 {
struct discovery_state *d = &hdev->discovery;
 
+   if (len > HCI_MAX_AD_LENGTH)
+   return;
+
bacpy(&d->last_adv_addr, bdaddr);
d->last_adv_addr_type = bdaddr_type;
d->last_adv_rssi = rssi;
@@ -5317,7 +5320,8 @@ static struct hci_conn *check_pending_le_conn(struct 
hci_dev *hdev,
 
 static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr,
   u8 bdaddr_type, bdaddr_t *direct_addr,
-  u8 direct_addr_type, s8 rssi, u8 *data, u8 len)
+  u8 direct_addr_type, s8 rssi, u8 *data, u8 len,
+  bool ext_adv)
 {
struct discovery_state *d = &hdev->discovery;
struct smp_irk *irk;
@@ -5339,6 +5343,11 @@ static void process_adv_report(struct hci_dev *hdev, u8 
type, bdaddr_t *bdaddr,
return;
}
 
+   if (!ext_adv && len > HCI_MAX_AD_LENGTH) {
+   bt_dev_err_ratelimited(hdev, "legacy adv larger than 31 bytes");
+   return;
+   }
+
/* Find the end of the data in case the report contains padded zero
 * bytes at the end causing an invalid length value.
 *
@@ -5398,7 +5407,7 @@ static void process_adv_report(struct hci_dev *hdev, u8 
type, bdaddr_t *bdaddr,
 */
conn = check_pending_le_conn(hdev, bdaddr, bdaddr_type, type,
direct_addr);
-   if (conn && type == LE_ADV_IND) {
+   if (!ext_adv && conn && type == LE_ADV_IND && len <= HCI_MAX_AD_LENGTH) {
/* Store report for later inclusion by
 * mgmt_device_connected
 */
@@ -5452,7 +5461,7 @@ static void process_adv_report(struct hci_dev *hdev, u8 
type, bdaddr_t *bdaddr,
 * event or send an immediate device found event if the data
 * should not be stored for later.
 */
-   if (!has_pending_adv_report(hdev)) {
+   if (!ext_adv && !has_pending_adv_report(hdev)) {
/* If the report will trigger a SCAN_REQ store it for
 * later merging.
 */
@@ -5487,7 +5496,8 @@ static void process

[PATCH 5.4 134/138] ASoC: topology: fix kernel oops on route addition error

2020-07-27 Thread Greg Kroah-Hartman
From: Pierre-Louis Bossart 

commit 6f0307df83f2aa6bdf656c2219c89ce96502d20e upstream.

When errors happen while loading graph components, the kernel oopses
while trying to remove all topology components. This can be
root-caused to a list pointing to memory that was already freed on
error.

remove_route() is already called on errors and will perform the
required cleanups so there's no need to free the route memory in
soc_tplg_dapm_graph_elems_load() if the route was added to the
list. We do however want to free the routes allocated but not added to
the list.

Fixes: 7df04ea7a31ea ('ASoC: topology: modify dapm route loading routine and 
add dapm route unloading')
Signed-off-by: Pierre-Louis Bossart 
Reviewed-by: Ranjani Sridharan 
Reviewed-by: Kai Vehmanen 
Link: 
https://lore.kernel.org/r/20200707203749.113883-2-pierre-louis.boss...@linux.intel.com
Signed-off-by: Mark Brown 
Signed-off-by: Greg Kroah-Hartman 

---
 sound/soc/soc-topology.c |   22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

--- a/sound/soc/soc-topology.c
+++ b/sound/soc/soc-topology.c
@@ -1284,17 +1284,29 @@ static int soc_tplg_dapm_graph_elems_loa
list_add(&routes[i]->dobj.list, &tplg->comp->dobj_list);
 
ret = soc_tplg_add_route(tplg, routes[i]);
-   if (ret < 0)
+   if (ret < 0) {
+   /*
+* this route was added to the list, it will
+* be freed in remove_route() so increment the
+* counter to skip it in the error handling
+* below.
+*/
+   i++;
break;
+   }
 
/* add route, but keep going if some fail */
snd_soc_dapm_add_routes(dapm, routes[i], 1);
}
 
-   /* free memory allocated for all dapm routes in case of error */
-   if (ret < 0)
-   for (i = 0; i < count ; i++)
-   kfree(routes[i]);
+   /*
+* free memory allocated for all dapm routes not added to the
+* list in case of error
+*/
+   if (ret < 0) {
+   while (i < count)
+   kfree(routes[i++]);
+   }
 
/*
 * free pointer to array of dapm routes as this is no longer needed.
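
The ownership rule the patch above enforces (entries already handed to the list are freed by the list's own cleanup, and only the remaining ones are freed locally) can be sketched in user space. Everything here is hypothetical and for illustration only: struct owner_list plays the role of the component's object list, owner_list_release() the role of remove_route(), and fail_at simulates the failing soc_tplg_add_route() call. The sketch assumes count does not exceed MAX_ROUTES.

```c
#include <stdlib.h>

#define MAX_ROUTES 8

/* Hypothetical owner list standing in for the component's object list. */
struct owner_list {
	int *items[MAX_ROUTES];
	int count;
};

/* Frees everything the list owns; called once during teardown. */
static void owner_list_release(struct owner_list *l)
{
	for (int i = 0; i < l->count; i++)
		free(l->items[i]);
	l->count = 0;
}

/*
 * Hands routes[0..count) to the list one by one; an error is simulated
 * at index fail_at (pass -1 for no error). Returns the number of routes
 * freed locally, which is 0 on success. Assumes count <= MAX_ROUTES.
 */
static int load_routes(struct owner_list *l, int **routes, int count,
		       int fail_at)
{
	int i, ret = 0, freed = 0;

	for (i = 0; i < count; i++) {
		l->items[l->count++] = routes[i];	/* list now owns routes[i] */
		if (i == fail_at) {
			ret = -1;
			i++;	/* skip the entry the list already owns */
			break;
		}
	}

	/* Free only the routes that were never handed to the list. */
	if (ret < 0) {
		while (i < count) {
			free(routes[i++]);
			freed++;
		}
	}
	return freed;
}
```

Freeing from index 0 on error, as the old code did, would release memory the list still points to, and the later teardown would then free it a second time.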




[PATCH 5.7 174/179] ASoC: topology: fix kernel oops on route addition error

2020-07-27 Thread Greg Kroah-Hartman
From: Pierre-Louis Bossart 

commit 6f0307df83f2aa6bdf656c2219c89ce96502d20e upstream.

When errors happen while loading graph components, the kernel oopses
while trying to remove all topology components. This can be
root-caused to a list pointing to memory that was already freed on
error.

remove_route() is already called on errors and will perform the
required cleanups so there's no need to free the route memory in
soc_tplg_dapm_graph_elems_load() if the route was added to the
list. We do however want to free the routes allocated but not added to
the list.

Fixes: 7df04ea7a31ea ('ASoC: topology: modify dapm route loading routine and 
add dapm route unloading')
Signed-off-by: Pierre-Louis Bossart 
Reviewed-by: Ranjani Sridharan 
Reviewed-by: Kai Vehmanen 
Link: 
https://lore.kernel.org/r/20200707203749.113883-2-pierre-louis.boss...@linux.intel.com
Signed-off-by: Mark Brown 
Signed-off-by: Greg Kroah-Hartman 

---
 sound/soc/soc-topology.c |   22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

--- a/sound/soc/soc-topology.c
+++ b/sound/soc/soc-topology.c
@@ -1285,17 +1285,29 @@ static int soc_tplg_dapm_graph_elems_loa
list_add(&routes[i]->dobj.list, &tplg->comp->dobj_list);
 
ret = soc_tplg_add_route(tplg, routes[i]);
-   if (ret < 0)
+   if (ret < 0) {
+   /*
+* this route was added to the list, it will
+* be freed in remove_route() so increment the
+* counter to skip it in the error handling
+* below.
+*/
+   i++;
break;
+   }
 
/* add route, but keep going if some fail */
snd_soc_dapm_add_routes(dapm, routes[i], 1);
}
 
-   /* free memory allocated for all dapm routes in case of error */
-   if (ret < 0)
-   for (i = 0; i < count ; i++)
-   kfree(routes[i]);
+   /*
+* free memory allocated for all dapm routes not added to the
+* list in case of error
+*/
+   if (ret < 0) {
+   while (i < count)
+   kfree(routes[i++]);
+   }
 
/*
 * free pointer to array of dapm routes as this is no longer needed.




Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'

2020-07-16 Thread Andy Shevchenko
On Thu, Jul 16, 2020 at 09:22:11PM +0300, Maxim Levitsky wrote:
> On Thu, 2020-07-16 at 21:21 +0300, Andy Shevchenko wrote:
> > On Thu, Jul 16, 2020 at 09:00:00PM +0300, Maxim Levitsky wrote:
> > > On Thu, 2020-07-16 at 18:47 +0300, Andy Shevchenko wrote:

...

> > > It works (no more oops)
> > 
> > Thanks for testing. I'm about to send formal patch, can you give your 
> > Tested-by tag there then?
> 
> Of course.
> 
> Tested-by: Maxim Levitsky 

Thanks, I meant there [1] :-)

[1]: 
https://lore.kernel.org/lkml/20200716182747.54929-1-andriy.shevche...@linux.intel.com/T/#u

-- 
With Best Regards,
Andy Shevchenko




Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'

2020-07-16 Thread Andy Shevchenko
On Thu, Jul 16, 2020 at 09:00:00PM +0300, Maxim Levitsky wrote:
> On Thu, 2020-07-16 at 18:47 +0300, Andy Shevchenko wrote:
> > On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> > > Hi!
> > > 
> > > Few days ago I bisected a regression on 5.8 kernel:
> > > 
> > > I have nvidia rtx 2070s and its USB type C port driver (which is open 
> > > source)
> > > started to crash on load:
> > 
> > ...
> > 
> > > Reverting the commit helped fix this oops.
> > > 
> > > My .config attached.
> > > If any more info is needed I'll be happy to provide it,
> > > and of course test patches.
> > 
> > Can you test below?
> > 
> > diff --git a/drivers/base/property.c b/drivers/base/property.c
> > index 1e6d75e65938..d58aa98fe964 100644
> > --- a/drivers/base/property.c
> > +++ b/drivers/base/property.c
> > @@ -721,7 +721,7 @@ struct fwnode_handle *device_get_next_child_node(struct 
> > device *dev,
> > return next;
> >  
> > /* When no more children in primary, continue with secondary */
> > -   if (!IS_ERR_OR_NULL(fwnode->secondary))
> > +   if (fwnode && !IS_ERR_OR_NULL(fwnode->secondary))
> > next = fwnode_get_next_child_node(fwnode->secondary, child);
> >  
> > return next;
> 
> It works (no more oops)

Thanks for testing. I'm about to send formal patch, can you give your Tested-by 
tag there then?

-- 
With Best Regards,
Andy Shevchenko




Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'

2020-07-16 Thread Maxim Levitsky
On Thu, 2020-07-16 at 21:21 +0300, Andy Shevchenko wrote:
> On Thu, Jul 16, 2020 at 09:00:00PM +0300, Maxim Levitsky wrote:
> > On Thu, 2020-07-16 at 18:47 +0300, Andy Shevchenko wrote:
> > > On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> > > > Hi!
> > > > 
> > > > Few days ago I bisected a regression on 5.8 kernel:
> > > > 
> > > > I have nvidia rtx 2070s and its USB type C port driver (which is open 
> > > > source)
> > > > started to crash on load:
> > > 
> > > ...
> > > 
> > > > Reverting the commit helped fix this oops.
> > > > 
> > > > My .config attached.
> > > > If any more info is needed I'll be happy to provide it,
> > > > and of course test patches.
> > > 
> > > Can you test below?
> > > 
> > > diff --git a/drivers/base/property.c b/drivers/base/property.c
> > > index 1e6d75e65938..d58aa98fe964 100644
> > > --- a/drivers/base/property.c
> > > +++ b/drivers/base/property.c
> > > @@ -721,7 +721,7 @@ struct fwnode_handle 
> > > *device_get_next_child_node(struct device *dev,
> > >   return next;
> > >  
> > >   /* When no more children in primary, continue with secondary */
> > > - if (!IS_ERR_OR_NULL(fwnode->secondary))
> > > + if (fwnode && !IS_ERR_OR_NULL(fwnode->secondary))
> > >   next = fwnode_get_next_child_node(fwnode->secondary, child);
> > >  
> > >   return next;
> > 
> > It works (no more oops)
> 
> Thanks for testing. I'm about to send formal patch, can you give your 
> Tested-by tag there then?

Of course.

Tested-by: Maxim Levitsky 

Best regards,
Maxim Levitsky
> 




Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'

2020-07-16 Thread Maxim Levitsky
On Thu, 2020-07-16 at 18:47 +0300, Andy Shevchenko wrote:
> On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> > Hi!
> > 
> > Few days ago I bisected a regression on 5.8 kernel:
> > 
> > I have nvidia rtx 2070s and its USB type C port driver (which is open 
> > source)
> > started to crash on load:
> 
> ...
> 
> > Reverting the commit helped fix this oops.
> > 
> > My .config attached.
> > If any more info is needed I'll be happy to provide it,
> > and of course test patches.
> 
> Can you test below?
> 
> diff --git a/drivers/base/property.c b/drivers/base/property.c
> index 1e6d75e65938..d58aa98fe964 100644
> --- a/drivers/base/property.c
> +++ b/drivers/base/property.c
> @@ -721,7 +721,7 @@ struct fwnode_handle *device_get_next_child_node(struct 
> device *dev,
>   return next;
>  
>   /* When no more children in primary, continue with secondary */
> - if (!IS_ERR_OR_NULL(fwnode->secondary))
> + if (fwnode && !IS_ERR_OR_NULL(fwnode->secondary))
>   next = fwnode_get_next_child_node(fwnode->secondary, child);
>  
>   return next;

It works (no more oops)

Best regards,
Maxim Levitsky



Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'

2020-07-16 Thread Maxim Levitsky
23
> > # good: [081096d98bb23946f16215357b141c5616b234bf] Merge tag 'tty-5.8-rc1' 
> > of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
> > git bisect good 081096d98bb23946f16215357b141c5616b234bf
> > # bad: [3a2a8751742133a7bbc49b9d1bcbd52e212edff6] Merge tag 'for-v5.8' of 
> > git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
> > git bisect bad 3a2a8751742133a7bbc49b9d1bcbd52e212edff6
> > # bad: [a1e81f9654eef650d3ee35c94a8cab00b5cd379c] m68k: implement 
> > flush_icache_user_range
> > git bisect bad a1e81f9654eef650d3ee35c94a8cab00b5cd379c
> > # good: [c336c022503d1be719ca06f2526c211709e3d2d3] staging: wfx: remove 
> > false positive warning
> > git bisect good c336c022503d1be719ca06f2526c211709e3d2d3
> > # good: [05c8a4fc44a916dd897769ca69b42381f9177ec4] habanalabs: correctly 
> > cast u64 to void*
> > git bisect good 05c8a4fc44a916dd897769ca69b42381f9177ec4
> > # good: [a3975dea1696b7c81319dc4b66e3c378dd47ccfb] Merge tag 'iio-for-5.8c' 
> > of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
> > git bisect good a3975dea1696b7c81319dc4b66e3c378dd47ccfb
> > # bad: [f558b8364e19f9222e7976c64e9367f66bab02cc] Merge tag 
> > 'driver-core-5.8-rc1' of 
> > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
> > git bisect bad f558b8364e19f9222e7976c64e9367f66bab02cc
> > # good: [b6d90ef9a439b4ef73a350789bf766a1339a703d] staging: vchi: Get rid 
> > of not implemented function declarations
> > git bisect good b6d90ef9a439b4ef73a350789bf766a1339a703d
> > # good: [93d2e4322aa74c1ad1e8c2160608eb9a960d69ff] of: platform: Batch 
> > fwnode parsing when adding all top level devices
> > git bisect good 93d2e4322aa74c1ad1e8c2160608eb9a960d69ff
> > # bad: [c2c076166b5880eabe068ce1cab30bf6edeeea1a] firmware_loader: change 
> > enum fw_opt to u32
> > git bisect bad c2c076166b5880eabe068ce1cab30bf6edeeea1a
> > # bad: [2cd38fd15e4ebcfe917a443734820269f8b5ba2b] driver core: Remove 
> > unnecessary is_fwnode_dev variable in device_add()
> > git bisect bad 2cd38fd15e4ebcfe917a443734820269f8b5ba2b
> > # good: [c82c83c330654c5639960ebc3dabbae53c43f79e] driver core: platform: 
> > Fix spelling errors in platform.c
> > git bisect good c82c83c330654c5639960ebc3dabbae53c43f79e
> > # bad: [114dbb4fa7c4053a51964d112e2851e818e085c6] drivers property: When no 
> > children in primary, try secondary
> > git bisect bad 114dbb4fa7c4053a51964d112e2851e818e085c6
> > # first bad commit: [114dbb4fa7c4053a51964d112e2851e818e085c6] drivers 
> > property: When no children in primary, try secondary
> > 
> > 
> > Reverting the commit helped fix this oops.
> > 
> > My .config attached.
> > If any more info is needed I'll be happy to provide it,
> > and of course test patches.
> > 
> > Best regards,
> > Maxim Levitsky
> 
> 


It turns out that the kernel has decode_stacktrace.sh. I have always decoded the
symbols manually.
I will send the decoded trace in bug reports from now on.

IMHO it would be useful to include a pointer to it in the kernel oops report,
since many people like me don't know about this nice script.

[mlevitsk@starship 
~/UPSTREAM/linux-kernel/work_area/ucsi_crash]$../../src/scripts/decode_stacktrace.sh
 ../../src/vmlinux ../../src/ ../../src/ < ./stacktrace.txt 
[  +0.43] CPU: 19 PID: 31281 Comm: kworker/19:1 Tainted: PW  O  
5.8.0-rc3.stable #133
[  +0.45] Hardware name: Gigabyte Technology Co., Ltd. TRX40 
DESIGNARE/TRX40 DESIGNARE, BIOS F4c 03/05/2020
[  +0.30] Workqueue: events_long ucsi_init_work [typec_ucsi]
[   +0.48] RIP: 0010:device_get_next_child_node 
(/home/mlevitsk/UPSTREAM/linux-kernel/src/drivers/base/property.c:715) 
[ +0.24] Code: 18 48 85 db 74 24 48 8b 43 08 48 85 c0 74 1b 48 8b 40 50 48 
85 c0 74 12 48 89 ee 48 89 df ff d0 48 85 c0 74 05 5b 5d 41 5c c3 <48> 8b 03 48 
85 c0 74 f3 48>
All code

   0:   18 48 85sbb%cl,-0x7b(%rax)
   3:   db 74 24 48 (bad)  0x48(%rsp)
   7:   8b 43 08mov0x8(%rbx),%eax
   a:   48 85 c0test   %rax,%rax
   d:   74 1b   je 0x2a
   f:   48 8b 40 50 mov0x50(%rax),%rax
  13:   48 85 c0test   %rax,%rax
  16:   74 12   je 0x2a
  18:   48 89 eemov%rbp,%rsi
  1b:   48 89 dfmov%rbx,%rdi
  1e:   ff d0   callq  *%rax
  20:   48 85 c0test   %rax,%rax
  23:   74 05   je 0x2a
  25:   5b  pop%rbx
  26:   5d  pop%rbp
  27:   41 5c   pop%r12
  29:   c3  retq   
  2a:*  48 8b 03 

Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'

2020-07-16 Thread Andy Shevchenko
On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> Hi!
> 
> Few days ago I bisected a regression on 5.8 kernel:
> 
> I have nvidia rtx 2070s and its USB type C port driver (which is open source)
> started to crash on load:

...

> Reverting the commit helped fix this oops.
> 
> My .config attached.
> If any more info is needed I'll be happy to provide it,
> and of course test patches.

Can you test below?

diff --git a/drivers/base/property.c b/drivers/base/property.c
index 1e6d75e65938..d58aa98fe964 100644
--- a/drivers/base/property.c
+++ b/drivers/base/property.c
@@ -721,7 +721,7 @@ struct fwnode_handle *device_get_next_child_node(struct 
device *dev,
return next;
 
/* When no more children in primary, continue with secondary */
-   if (!IS_ERR_OR_NULL(fwnode->secondary))
+   if (fwnode && !IS_ERR_OR_NULL(fwnode->secondary))
next = fwnode_get_next_child_node(fwnode->secondary, child);
 
return next;
-- 
With Best Regards,
Andy Shevchenko
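
Illustrative note on the guard in the patch above. This is a minimal
user-space sketch, not the kernel code: struct fwnode, struct device,
next_child() and device_next_child() are hypothetical, simplified stand-ins,
and the model only covers a NULL fwnode (the kernel check also rejects error
pointers via IS_ERR_OR_NULL). It shows why the secondary lookup must first
test that the device has a firmware node at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
struct fwnode {
	struct fwnode *secondary;   /* optional fallback node, may be NULL */
	struct fwnode *first_child; /* toy model: at most one child per node */
};

struct device {
	struct fwnode *fwnode;      /* NULL when no firmware node is attached */
};

/* Toy iterator: yields the node's only child on the first call, then NULL. */
struct fwnode *next_child(const struct fwnode *node, struct fwnode *child)
{
	if (!node || child)
		return NULL;
	return node->first_child;
}

/* Same control flow as the patched function, on the toy types above. */
struct fwnode *device_next_child(const struct device *dev, struct fwnode *child)
{
	const struct fwnode *fwnode;
	struct fwnode *next;

	assert(dev != NULL);
	fwnode = dev->fwnode;

	/* Try the primary node first. */
	next = next_child(fwnode, child);
	if (next)
		return next;

	/*
	 * The guard: without the "fwnode &&" test, the next line would read
	 * fwnode->secondary through a NULL pointer whenever the device has
	 * no firmware node.
	 */
	if (fwnode && fwnode->secondary)
		next = next_child(fwnode->secondary, child);

	return next;
}
```

With a device whose fwnode is NULL the function returns NULL instead of
faulting; with a primary node that has no children it falls through to the
secondary node.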




Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'

2020-07-16 Thread Andy Shevchenko
On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> Hi!
> 
> A few days ago I bisected a regression on the 5.8 kernel:
> 
> I have nvidia rtx 2070s and its USB type C port driver (which is open source)
> started to crash on load:

I'm looking at this, but I have questions:
- any pointers to the device tree excerpt which this tries to iterate over
- can you provide full Code: line?

The only way I can see this happening is that fwnode is not initialized
properly somewhere (meaning it has garbage in the secondary pointer).

> [  +0.43] CPU: 19 PID: 31281 Comm: kworker/19:1 Tainted: PW  O
>   5.8.0-rc3.stable #133
> [  +0.45] Hardware name: Gigabyte Technology Co., Ltd. TRX40 
> DESIGNARE/TRX40 DESIGNARE, BIOS F4c 03/05/2020
> [  +0.30] Workqueue: events_long ucsi_init_work [typec_ucsi]
> [  +0.48] RIP: 0010:device_get_next_child_node+0x5b/0xb0
> [  +0.24] Code: 18 48 85 db 74 24 48 8b 43 08 48 85 c0 74 1b 48 8b 40 50 
> 48 85 c0 74 12 48 89 ee 48 89 df ff d0 48 85 c0 74 05 5b 5d 41 5c c3 <48> 8b 
> 03 48 85 c0 74 f3 48>
> [  +0.65] RSP: 0018:c900038d7e08 EFLAGS: 00010246
> [  +0.44] RAX: 889fb6b62f00 RBX:  RCX: 
> 0001
> [  +0.27] RDX: 889fb6fd4a70 RSI:  RDI: 
> 889fb6b63608
> [  +0.46] RBP:  R08: 0001 R09: 
> 7fff
> [  +0.24] R10: 2075ce282580 R11: 0062de3e R12: 
> 889fb6b63608
> [  +0.43] R13: 0001 R14: 889fb6b63018 R15: 
> 0001
> [  +0.44] FS:  () GS:889fbe4c() 
> knlGS:
> [  +0.24] CS:  0010 DS:  ES:  CR0: 80050033
> [  +0.42] CR2:  CR3: 00175621b000 CR4: 
> 00340ea0
> [  +0.46] Call Trace:
> [  +0.30]  ucsi_init+0x213/0x530 [typec_ucsi]
> [  +0.28]  ucsi_init_work+0x12/0x20 [typec_ucsi]
> [  +0.49]  process_one_work+0x1d2/0x390
> [  +0.27]  worker_thread+0x4a/0x3b0
> [  +0.25]  ? process_one_work+0x390/0x390
> [  +0.49]  kthread+0xf9/0x130
> [  +0.26]  ? kthread_park+0x90/0x90
> [  +0.28]  ret_from_fork+0x1f/0x30
> [  +0.48] Modules linked in: ucsi_ccg typec_ucsi typec hfsplus cdrom ntfs 
> msdos vfio_pci vfio_virqfd vfio_iommu_type1 vfio vhost_net vhost vhost_iotlb 
> tap xfs rfcomm xt_M>
> [  +0.39]  usb_storage ext4 mbcache jbd2 amdgpu gpu_sched ttm 
> drm_kms_helper syscopyarea sysfillrect ahci sysimgblt fb_sys_fops 
> crc32_pclmul libahci crc32c_intel igb ccp >
> [  +0.000289] CR2: 
> [  +0.26] ---[ end trace 38ebb9aebd55fbff ]---
> [  +0.014201] RIP: 0010:device_get_next_child_node+0x5b/0xb0
> [  +0.30] Code: 18 48 85 db 74 24 48 8b 43 08 48 85 c0 74 1b 48 8b 40 50 
> 48 85 c0 74 12 48 89 ee 48 89 df ff d0 48 85 c0 74 05 5b 5d 41 5c c3 <48> 8b 
> 03 48 85 c0 74 f3 48>
> [  +0.75] RSP: 0018:c900038d7e08 EFLAGS: 00010246
> [  +0.27] RAX: 889fb6b62f00 RBX:  RCX: 
> 0001
> [  +0.48] RDX: 889fb6fd4a70 RSI:  RDI: 
> 889fb6b63608
> [  +0.49] RBP:  R08: 0001 R09: 
> 7fff
> [  +0.27] R10: 2075ce282580 R11: 0062de3e R12: 
> 889fb6b63608
> [  +0.49] R13: 0001 R14: 889fb6b63018 R15: 
> 0001
> [  +0.50] FS:  () GS:889fbe4c() 
> knlGS:
> [  +0.27] CS:  0010 DS:  ES:  CR0: 80050033
> [  +0.50] CR2:  CR3: 00175621b000 CR4: 
> 00340ea0
> 
> I bisected this, while passing the UCSI controller to a VM, and this
> is the result:
> 
> git bisect start
> # good: [3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162] Linux 5.7
> git bisect good 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
> # bad: [48778464bb7d346b47157d21ffde2af6b2d39110] Linux 5.8-rc2
> git bisect bad 48778464bb7d346b47157d21ffde2af6b2d39110
> # good: [a98f670e41a99f53acb1fb33cee9c6abbb2e6f23] Merge tag 'media/v5.8-1' 
> of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
> git bisect good a98f670e41a99f53acb1fb33cee9c6abbb2e6f23
> # good: [081096d98bb23946f16215357b141c5616b234bf] Merge tag 'tty-5.8-rc1' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
> git bisect good 081096d98bb23946f16215357b141c5616b234bf
> # bad: [3a2a8751742133a7bbc49b9d1bcbd52e212edff6] Merge tag 'for-v5.8' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
> git bisect bad 3a2a8751742133a7bbc49b9d1bcbd52e212edff6
> # bad: [a1e81f9654eef650d3ee35c94a8cab00b5cd379c] m68k: implement 
> flush_icache_user_range
> git bisect bad a1e81f9654eef650d3ee35c94a8cab00b5cd379c
> # good: [c336c022503d1be719ca06f2526c211709e3d2d3] staging: wfx: remove false 
> positive warning
> git bisect good c336c022503d1be719ca06f2526c211709e3d2d3
> # good: [05c8a4fc44a916dd897769ca69b42381f9177ec4] habanalabs: correctly cast 
> u64 

Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'

2020-07-16 Thread Maxim Levitsky
On Thu, 2020-07-16 at 10:28 +0200, Greg KH wrote:
> On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> > Hi!
> > 
> > A few days ago I bisected a regression on the 5.8 kernel:
> > 
> > I have nvidia rtx 2070s and its USB type C port driver (which is open 
> > source)
> 
> Is that driver merged into the tree?  If not, do you have a pointer to
> it somewhere?
> 
> thanks,
> 
> greg k-h
> 
It is in the tree.

CONFIG_TYPEC_UCSI selects the generic UCSI driver.
CONFIG_UCSI_CCG selects the hardware driver,
which is an i2c driver that binds to an i2c device (I think with address 0x8)
on an i2c controller, which is exposed by function 3 of the NVIDIA card and
uses the CONFIG_I2C_NVIDIA_GPU driver.

We also have CONFIG_TYPEC_NVIDIA_ALTMODE; I haven't researched
what it does.

Best regards,
Maxim Levitsky



Re: kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'

2020-07-16 Thread Greg KH
On Thu, Jul 16, 2020 at 11:17:03AM +0300, Maxim Levitsky wrote:
> Hi!
> 
> A few days ago I bisected a regression on the 5.8 kernel:
> 
> I have nvidia rtx 2070s and its USB type C port driver (which is open source)

Is that driver merged into the tree?  If not, do you have a pointer to
it somewhere?

thanks,

greg k-h


kernel oops in 'typec_ucsi' due to commit 'drivers property: When no children in primary, try secondary'

2020-07-16 Thread Maxim Levitsky
Hi!

A few days ago I bisected a regression on the 5.8 kernel:

I have nvidia rtx 2070s and its USB type C port driver (which is open source)
started to crash on load:

[  +0.43] CPU: 19 PID: 31281 Comm: kworker/19:1 Tainted: PW  O  
5.8.0-rc3.stable #133
[  +0.45] Hardware name: Gigabyte Technology Co., Ltd. TRX40 
DESIGNARE/TRX40 DESIGNARE, BIOS F4c 03/05/2020
[  +0.30] Workqueue: events_long ucsi_init_work [typec_ucsi]
[  +0.48] RIP: 0010:device_get_next_child_node+0x5b/0xb0
[  +0.24] Code: 18 48 85 db 74 24 48 8b 43 08 48 85 c0 74 1b 48 8b 40 50 48 
85 c0 74 12 48 89 ee 48 89 df ff d0 48 85 c0 74 05 5b 5d 41 5c c3 <48> 8b 03 48 
85 c0 74 f3 48>
[  +0.65] RSP: 0018:c900038d7e08 EFLAGS: 00010246
[  +0.44] RAX: 889fb6b62f00 RBX:  RCX: 0001
[  +0.27] RDX: 889fb6fd4a70 RSI:  RDI: 889fb6b63608
[  +0.46] RBP:  R08: 0001 R09: 7fff
[  +0.24] R10: 2075ce282580 R11: 0062de3e R12: 889fb6b63608
[  +0.43] R13: 0001 R14: 889fb6b63018 R15: 0001
[  +0.44] FS:  () GS:889fbe4c() 
knlGS:
[  +0.24] CS:  0010 DS:  ES:  CR0: 80050033
[  +0.42] CR2:  CR3: 00175621b000 CR4: 00340ea0
[  +0.46] Call Trace:
[  +0.30]  ucsi_init+0x213/0x530 [typec_ucsi]
[  +0.28]  ucsi_init_work+0x12/0x20 [typec_ucsi]
[  +0.49]  process_one_work+0x1d2/0x390
[  +0.27]  worker_thread+0x4a/0x3b0
[  +0.25]  ? process_one_work+0x390/0x390
[  +0.49]  kthread+0xf9/0x130
[  +0.26]  ? kthread_park+0x90/0x90
[  +0.28]  ret_from_fork+0x1f/0x30
[  +0.48] Modules linked in: ucsi_ccg typec_ucsi typec hfsplus cdrom ntfs 
msdos vfio_pci vfio_virqfd vfio_iommu_type1 vfio vhost_net vhost vhost_iotlb 
tap xfs rfcomm xt_M>
[  +0.39]  usb_storage ext4 mbcache jbd2 amdgpu gpu_sched ttm 
drm_kms_helper syscopyarea sysfillrect ahci sysimgblt fb_sys_fops crc32_pclmul 
libahci crc32c_intel igb ccp >
[  +0.000289] CR2: 
[  +0.26] ---[ end trace 38ebb9aebd55fbff ]---
[  +0.014201] RIP: 0010:device_get_next_child_node+0x5b/0xb0
[  +0.30] Code: 18 48 85 db 74 24 48 8b 43 08 48 85 c0 74 1b 48 8b 40 50 48 
85 c0 74 12 48 89 ee 48 89 df ff d0 48 85 c0 74 05 5b 5d 41 5c c3 <48> 8b 03 48 
85 c0 74 f3 48>
[  +0.75] RSP: 0018:c900038d7e08 EFLAGS: 00010246
[  +0.27] RAX: 889fb6b62f00 RBX:  RCX: 0001
[  +0.48] RDX: 889fb6fd4a70 RSI:  RDI: 889fb6b63608
[  +0.49] RBP:  R08: 0001 R09: 7fff
[  +0.27] R10: 2075ce282580 R11: 0062de3e R12: 889fb6b63608
[  +0.49] R13: 0001 R14: 889fb6b63018 R15: 0001
[  +0.50] FS:  () GS:889fbe4c() 
knlGS:
[  +0.27] CS:  0010 DS:  ES:  CR0: 80050033
[  +0.50] CR2:  CR3: 00175621b000 CR4: 00340ea0

I bisected this, while passing the UCSI controller to a VM, and this
is the result:

git bisect start
# good: [3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162] Linux 5.7
git bisect good 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
# bad: [48778464bb7d346b47157d21ffde2af6b2d39110] Linux 5.8-rc2
git bisect bad 48778464bb7d346b47157d21ffde2af6b2d39110
# good: [a98f670e41a99f53acb1fb33cee9c6abbb2e6f23] Merge tag 'media/v5.8-1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
git bisect good a98f670e41a99f53acb1fb33cee9c6abbb2e6f23
# good: [081096d98bb23946f16215357b141c5616b234bf] Merge tag 'tty-5.8-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
git bisect good 081096d98bb23946f16215357b141c5616b234bf
# bad: [3a2a8751742133a7bbc49b9d1bcbd52e212edff6] Merge tag 'for-v5.8' of 
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
git bisect bad 3a2a8751742133a7bbc49b9d1bcbd52e212edff6
# bad: [a1e81f9654eef650d3ee35c94a8cab00b5cd379c] m68k: implement 
flush_icache_user_range
git bisect bad a1e81f9654eef650d3ee35c94a8cab00b5cd379c
# good: [c336c022503d1be719ca06f2526c211709e3d2d3] staging: wfx: remove false 
positive warning
git bisect good c336c022503d1be719ca06f2526c211709e3d2d3
# good: [05c8a4fc44a916dd897769ca69b42381f9177ec4] habanalabs: correctly cast 
u64 to void*
git bisect good 05c8a4fc44a916dd897769ca69b42381f9177ec4
# good: [a3975dea1696b7c81319dc4b66e3c378dd47ccfb] Merge tag 'iio-for-5.8c' of 
git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
git bisect good a3975dea1696b7c81319dc4b66e3c378dd47ccfb
# bad: [f558b8364e19f9222e7976c64e9367f66bab02cc] Merge tag 
'driver-core-5.8-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
git bisect bad f558b8364e19f9222e7976c64e9367f66bab02cc
# good: [b6d90ef9a439b4ef73a350789bf766a1339a703d] staging: vchi: 

Re: [PATCH v1] Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()

2020-07-12 Thread Pavel Machek
On Tue 2020-07-07 17:38:46, Marcel Holtmann wrote:
> Hi Miao-chen,
> 
> > This fixes the kernel oops by removing unnecessary background scan
> > update from hci_adv_monitors_clear() which shouldn't invoke any work
> > queue.
> > 
> > The following test was performed.
> > - Run "rmmod btusb" and verify that no kernel oops is triggered.
> > 
> > Signed-off-by: Miao-chen Chou 
> > Reviewed-by: Abhishek Pandit-Subedi 
> > Reviewed-by: Alain Michaud 
> > ---
> > 
> > net/bluetooth/hci_core.c | 2 --
> > 1 file changed, 2 deletions(-)
> 
> patch has been applied to bluetooth-next tree.

Bluetooth no longer seems to oops for me... but there's a different
showstopper in next (graphics -- i915 -- related). Oh well :-(.

Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




Re: [PATCH v1] Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()

2020-07-07 Thread Marcel Holtmann
Hi Miao-chen,

> This fixes the kernel oops by removing unnecessary background scan
> update from hci_adv_monitors_clear() which shouldn't invoke any work
> queue.
> 
> The following test was performed.
> - Run "rmmod btusb" and verify that no kernel oops is triggered.
> 
> Signed-off-by: Miao-chen Chou 
> Reviewed-by: Abhishek Pandit-Subedi 
> Reviewed-by: Alain Michaud 
> ---
> 
> net/bluetooth/hci_core.c | 2 --
> 1 file changed, 2 deletions(-)

patch has been applied to bluetooth-next tree.

Regards

Marcel



Re: [PATCH v1] Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()

2020-07-06 Thread Miao-chen Chou
Hi Marcel,

In case you missed this thread, my suggestion is to revert the
previous patch and apply this patch. Please see my earlier email for
the reason. Thanks.

Regards,
Miao

On Tue, Jun 30, 2020 at 2:55 PM Miao-chen Chou  wrote:
>
> Hi Marcel,
>
> hci_unregister_dev() is invoked when the controller is intended to be
> removed by btusb driver. In other words, there should not be any
> activity on hdev's workqueue, so the destruction of the workqueue
> should be the first thing to do to prevent the clear helpers from
> issuing any work. So my suggestion is to revert the patch re-arranging
> the workqueue and apply this instead.
> I should have uploaded this earlier, but I encountered some troubles
> while verifying the changes. Sorry for the inconvenience.
>
> Regards,
> Miao
>
> On Mon, Jun 29, 2020 at 11:51 PM Marcel Holtmann  wrote:
> >
> > Hi Miao-chen,
> >
> > > This fixes the kernel oops by removing unnecessary background scan
> > > update from hci_adv_monitors_clear() which shouldn't invoke any work
> > > queue.
> > >
> > > The following test was performed.
> > > - Run "rmmod btusb" and verify that no kernel oops is triggered.
> > >
> > > Signed-off-by: Miao-chen Chou 
> > > Reviewed-by: Abhishek Pandit-Subedi 
> > > Reviewed-by: Alain Michaud 
> > > ---
> > >
> > > net/bluetooth/hci_core.c | 2 --
> > > 1 file changed, 2 deletions(-)
> > >
> > > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> > > index 5577cf9e2c7cd..77615161c7d72 100644
> > > --- a/net/bluetooth/hci_core.c
> > > +++ b/net/bluetooth/hci_core.c
> > > @@ -3005,8 +3005,6 @@ void hci_adv_monitors_clear(struct hci_dev *hdev)
> > >   hci_free_adv_monitor(monitor);
> > >
> > >   idr_destroy(>adv_monitors_idr);
> > > -
> > > - hci_update_background_scan(hdev);
> > > }
> >
> > I am happy to apply this as well, but I also applied another patch
> > re-arranging the workqueue destroy handling. Can you check which you
> > prefer, or if we should include both patches?
> >
> > Regards
> >
> > Marcel
> >


Re: [PATCH v3] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-30 Thread Chanwoo Choi
Hi Marc,

On 6/30/20 7:05 PM, Marc Zyngier wrote:
> Booting a recent kernel on a rk3399-based system (nanopc-t4),
> equipped with a recent u-boot and ATF results in an Oops due
> to a NULL pointer dereference.
> 
> This turns out to be due to the rk3399-dmc driver looking for
> an *undocumented* property (rockchip,pmu), and happily using
> a NULL pointer when the property isn't there.
> 
> Instead, make most of what was brought in with 9173c5ceb035
> ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
> to TF-A.") conditioned on finding this property in the device-tree,
> preventing the driver from exploding.
> 
> Cc: sta...@vger.kernel.org
> Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down 
> parameters to TF-A.")
> Signed-off-by: Marc Zyngier 
> ---
> * From v2:
>   - Trimmed down commit message
>   - Cc stable
> 
>  drivers/devfreq/rk3399_dmc.c | 42 
>  1 file changed, 23 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
> index 24f04f78285b..027769e39f9b 100644
> --- a/drivers/devfreq/rk3399_dmc.c
> +++ b/drivers/devfreq/rk3399_dmc.c
> @@ -95,18 +95,20 @@ static int rk3399_dmcfreq_target(struct device *dev, 
> unsigned long *freq,
>  
>   mutex_lock(>lock);
>  
> - if (target_rate >= dmcfreq->odt_dis_freq)
> - odt_enable = true;
> -
> - /*
> -  * This makes a SMC call to the TF-A to set the DDR PD (power-down)
> -  * timings and to enable or disable the ODT (on-die termination)
> -  * resistors.
> -  */
> - arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
> -   dmcfreq->odt_pd_arg1,
> -   ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
> -   odt_enable, 0, 0, 0, );
> + if (dmcfreq->regmap_pmu) {
> + if (target_rate >= dmcfreq->odt_dis_freq)
> + odt_enable = true;
> +
> + /*
> +  * This makes a SMC call to the TF-A to set the DDR PD
> +  * (power-down) timings and to enable or disable the
> +  * ODT (on-die termination) resistors.
> +  */
> + arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
> +   dmcfreq->odt_pd_arg1,
> +   ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
> +   odt_enable, 0, 0, 0, );
> + }
>  
>   /*
>* If frequency scaling from low to high, adjust voltage first.
> @@ -371,13 +373,14 @@ static int rk3399_dmcfreq_probe(struct platform_device 
> *pdev)
>   }
>  
>   node = of_parse_phandle(np, "rockchip,pmu", 0);
> - if (node) {
> - data->regmap_pmu = syscon_node_to_regmap(node);
> - of_node_put(node);
> - if (IS_ERR(data->regmap_pmu)) {
> - ret = PTR_ERR(data->regmap_pmu);
> - goto err_edev;
> - }
> + if (!node)
> + goto no_pmu;
> +
> + data->regmap_pmu = syscon_node_to_regmap(node);
> + of_node_put(node);
> + if (IS_ERR(data->regmap_pmu)) {
> + ret = PTR_ERR(data->regmap_pmu);
> + goto err_edev;
>   }
>  
>   regmap_read(data->regmap_pmu, RK3399_PMUGRF_OS_REG2, );
> @@ -399,6 +402,7 @@ static int rk3399_dmcfreq_probe(struct platform_device 
> *pdev)
>   goto err_edev;
>   };
>  
> +no_pmu:
>   arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, 0, 0,
> ROCKCHIP_SIP_CONFIG_DRAM_INIT,
> 0, 0, 0, 0, );
> 

Applied it. Thanks.

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [PATCH v1] Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()

2020-06-30 Thread Miao-chen Chou
Hi Marcel,

hci_unregister_dev() is invoked when the controller is intended to be
removed by btusb driver. In other words, there should not be any
activity on hdev's workqueue, so the destruction of the workqueue
should be the first thing to do to prevent the clear helpers from
issuing any work. So my suggestion is to revert the patch re-arranging
the workqueue and apply this instead.
I should have uploaded this earlier, but I encountered some troubles
while verifying the changes. Sorry for the inconvenience.

Regards,
Miao

On Mon, Jun 29, 2020 at 11:51 PM Marcel Holtmann  wrote:
>
> Hi Miao-chen,
>
> > This fixes the kernel oops by removing unnecessary background scan
> > update from hci_adv_monitors_clear() which shouldn't invoke any work
> > queue.
> >
> > The following test was performed.
> > - Run "rmmod btusb" and verify that no kernel oops is triggered.
> >
> > Signed-off-by: Miao-chen Chou 
> > Reviewed-by: Abhishek Pandit-Subedi 
> > Reviewed-by: Alain Michaud 
> > ---
> >
> > net/bluetooth/hci_core.c | 2 --
> > 1 file changed, 2 deletions(-)
> >
> > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> > index 5577cf9e2c7cd..77615161c7d72 100644
> > --- a/net/bluetooth/hci_core.c
> > +++ b/net/bluetooth/hci_core.c
> > @@ -3005,8 +3005,6 @@ void hci_adv_monitors_clear(struct hci_dev *hdev)
> >   hci_free_adv_monitor(monitor);
> >
> >   idr_destroy(>adv_monitors_idr);
> > -
> > - hci_update_background_scan(hdev);
> > }
>
> I am happy to apply this as well, but I also applied another patch
> re-arranging the workqueue destroy handling. Can you check which you
> prefer, or if we should include both patches?
>
> Regards
>
> Marcel
>


[PATCH v3] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-30 Thread Marc Zyngier
Booting a recent kernel on a rk3399-based system (nanopc-t4),
equipped with a recent u-boot and ATF results in an Oops due
to a NULL pointer dereference.

This turns out to be due to the rk3399-dmc driver looking for
an *undocumented* property (rockchip,pmu), and happily using
a NULL pointer when the property isn't there.

Instead, make most of what was brought in with 9173c5ceb035
("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
to TF-A.") conditioned on finding this property in the device-tree,
preventing the driver from exploding.

Cc: sta...@vger.kernel.org
Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down 
parameters to TF-A.")
Signed-off-by: Marc Zyngier 
---
* From v2:
  - Trimmed down commit message
  - Cc stable

 drivers/devfreq/rk3399_dmc.c | 42 
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
index 24f04f78285b..027769e39f9b 100644
--- a/drivers/devfreq/rk3399_dmc.c
+++ b/drivers/devfreq/rk3399_dmc.c
@@ -95,18 +95,20 @@ static int rk3399_dmcfreq_target(struct device *dev, 
unsigned long *freq,
 
mutex_lock(>lock);
 
-   if (target_rate >= dmcfreq->odt_dis_freq)
-   odt_enable = true;
-
-   /*
-* This makes a SMC call to the TF-A to set the DDR PD (power-down)
-* timings and to enable or disable the ODT (on-die termination)
-* resistors.
-*/
-   arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
- dmcfreq->odt_pd_arg1,
- ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
- odt_enable, 0, 0, 0, );
+   if (dmcfreq->regmap_pmu) {
+   if (target_rate >= dmcfreq->odt_dis_freq)
+   odt_enable = true;
+
+   /*
+* This makes a SMC call to the TF-A to set the DDR PD
+* (power-down) timings and to enable or disable the
+* ODT (on-die termination) resistors.
+*/
+   arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
+ dmcfreq->odt_pd_arg1,
+ ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
+ odt_enable, 0, 0, 0, );
+   }
 
/*
 * If frequency scaling from low to high, adjust voltage first.
@@ -371,13 +373,14 @@ static int rk3399_dmcfreq_probe(struct platform_device 
*pdev)
}
 
node = of_parse_phandle(np, "rockchip,pmu", 0);
-   if (node) {
-   data->regmap_pmu = syscon_node_to_regmap(node);
-   of_node_put(node);
-   if (IS_ERR(data->regmap_pmu)) {
-   ret = PTR_ERR(data->regmap_pmu);
-   goto err_edev;
-   }
+   if (!node)
+   goto no_pmu;
+
+   data->regmap_pmu = syscon_node_to_regmap(node);
+   of_node_put(node);
+   if (IS_ERR(data->regmap_pmu)) {
+   ret = PTR_ERR(data->regmap_pmu);
+   goto err_edev;
}
 
regmap_read(data->regmap_pmu, RK3399_PMUGRF_OS_REG2, );
@@ -399,6 +402,7 @@ static int rk3399_dmcfreq_probe(struct platform_device 
*pdev)
goto err_edev;
};
 
+no_pmu:
arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, 0, 0,
  ROCKCHIP_SIP_CONFIG_DRAM_INIT,
  0, 0, 0, 0, );
-- 
2.27.0
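
Illustrative note on the pattern used by the patch above. This is a minimal
user-space sketch under stated assumptions: struct regmap, struct dmc,
dmc_probe() and dmc_target() are hypothetical stand-ins, the firmware (SMC)
call is modeled as a counter, and the threshold value is simply taken from
the fake regmap. It only shows the shape of the fix: probe leaves the PMU
pointer NULL when the optional property is missing, and the frequency path
touches the PMU-dependent state only when that pointer is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the syscon regmap behind "rockchip,pmu". */
struct regmap {
	unsigned int os_reg2;
};

struct dmc {
	struct regmap *regmap_pmu;  /* NULL when "rockchip,pmu" is absent */
	unsigned int odt_dis_freq;  /* only meaningful when regmap_pmu is set */
	int odt_calls;              /* counts the modeled firmware calls */
};

/* pmu is the result of the property lookup, or NULL if it was not found. */
int dmc_probe(struct dmc *d, struct regmap *pmu)
{
	assert(d != NULL);

	d->regmap_pmu = NULL;
	d->odt_dis_freq = 0;
	d->odt_calls = 0;

	/* Optional property: skip all PMU-dependent setup, but still succeed. */
	if (!pmu)
		return 0;

	d->regmap_pmu = pmu;
	d->odt_dis_freq = pmu->os_reg2; /* assumption: stand-in for real setup */
	return 0;
}

/* Returns whether ODT would be enabled for this target rate. */
bool dmc_target(struct dmc *d, unsigned int target_rate)
{
	bool odt_enable = false;

	/* Everything that depends on the PMU is conditional on its presence. */
	if (d->regmap_pmu) {
		if (target_rate >= d->odt_dis_freq)
			odt_enable = true;
		d->odt_calls++; /* models the call into the firmware */
	}

	return odt_enable;
}
```

The design choice mirrors the discussion in this thread: firmware that does
not expose the property already exists, so the driver has to treat it as
optional rather than fail or dereference NULL.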



Re: [PATCH v1] Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()

2020-06-30 Thread Marcel Holtmann
Hi Miao-chen,

> This fixes the kernel oops by removing unnecessary background scan
> update from hci_adv_monitors_clear() which shouldn't invoke any work
> queue.
> 
> The following test was performed.
> - Run "rmmod btusb" and verify that no kernel oops is triggered.
> 
> Signed-off-by: Miao-chen Chou 
> Reviewed-by: Abhishek Pandit-Subedi 
> Reviewed-by: Alain Michaud 
> ---
> 
> net/bluetooth/hci_core.c | 2 --
> 1 file changed, 2 deletions(-)
> 
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index 5577cf9e2c7cd..77615161c7d72 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -3005,8 +3005,6 @@ void hci_adv_monitors_clear(struct hci_dev *hdev)
>   hci_free_adv_monitor(monitor);
> 
>   idr_destroy(>adv_monitors_idr);
> -
> - hci_update_background_scan(hdev);
> }

I am happy to apply this as well, but I also applied another patch re-arranging
the workqueue destroy handling. Can you check which you prefer, or if we should
include both patches?

Regards

Marcel



[PATCH v1] Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()

2020-06-29 Thread Miao-chen Chou
This fixes the kernel oops by removing unnecessary background scan
update from hci_adv_monitors_clear() which shouldn't invoke any work
queue.

The following test was performed.
- Run "rmmod btusb" and verify that no kernel oops is triggered.

Signed-off-by: Miao-chen Chou 
Reviewed-by: Abhishek Pandit-Subedi 
Reviewed-by: Alain Michaud 
---

 net/bluetooth/hci_core.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 5577cf9e2c7cd..77615161c7d72 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -3005,8 +3005,6 @@ void hci_adv_monitors_clear(struct hci_dev *hdev)
hci_free_adv_monitor(monitor);
 
idr_destroy(>adv_monitors_idr);
-
-   hci_update_background_scan(hdev);
 }
 
 void hci_free_adv_monitor(struct adv_monitor *monitor)
-- 
2.26.2
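
Illustrative note on the ordering issue this patch addresses. This is a
minimal user-space model, not the Bluetooth core: struct hdev_model and the
model_* functions are hypothetical, and the workqueue is reduced to a flag
plus counters. It sketches the idea that a teardown helper which runs after
the workqueue is gone must not submit new work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a controller with a workqueue. */
struct hdev_model {
	bool wq_alive;        /* false once the workqueue has been destroyed */
	int monitors;         /* number of registered monitors */
	int queued;           /* work items accepted by a live workqueue */
	int late_submissions; /* work submitted after destruction (the bug) */
};

/* Models a helper that schedules a background scan update. */
void model_queue_scan_update(struct hdev_model *h)
{
	if (!h->wq_alive) {
		h->late_submissions++; /* in the kernel this is where it oopses */
		return;
	}
	h->queued++;
}

/* Clear helper after the fix: it only releases state. */
void model_monitors_clear(struct hdev_model *h)
{
	h->monitors = 0;
	/* Deliberately no model_queue_scan_update(h) here. */
}

/* Unregister path: the workqueue goes away first, then state is cleared. */
void model_unregister_dev(struct hdev_model *h)
{
	assert(h != NULL);
	h->wq_alive = false;
	model_monitors_clear(h);
}
```

In this model, unregistering leaves late_submissions at zero because the
clear helper no longer schedules anything; calling the scan update after
teardown is what would increment it.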



Re: [PATCH v2] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-29 Thread Chanwoo Choi
Hi Marc,

On 6/29/20 10:22 PM, Marc Zyngier wrote:
> On 2020-06-29 12:29, Chanwoo Choi wrote:
>> Hi Enric and Marc,
>>
>> On 6/29/20 8:05 PM, Enric Balletbo i Serra wrote:
>>> Hi Chanwoo and Marc,
>>>
>>> On 29/6/20 13:09, Chanwoo Choi wrote:
>>>> Hi Enric,
>>>>
>>>> Could you check this issue? Your patch[1] causes this issue.
>>>> As Marc mentioned, although rk3399-dmc.c handled 'rockchip,pmu'
>>>> as the mandatory property, your patch[1] didn't add the 'rockchip,pmu'
>>>> property to the documentation.
>>>>
>>>
>>> I think the problem is that the DT binding patch, for some reason, was missed
>>> and didn't land. The patch seems to have all the required reviews and acks.
>>>
>>>   https://patchwork.kernel.org/patch/10901593/
>>>
>>> Sorry because I didn't notice this issue when 9173c5ceb035 landed. And thanks
>>> for fixing the issue.
>>
>> If the 'rockchip,pmu' property is mandatory, instead of Marc's patch,
>> we had better ask the DT maintainer to merge patch[1].
> 
> It is way too late. Firmware exists (mainline u-boot, for one) that
> do not expose the new property, and you can't demand that people
> upgrade. This is an ABI bug, and we now have to live with it.

As you commented, it is proper that rk3399-dmc.c treats the 'rockchip,pmu'
property as optional. Could you send v3 with an edited patch description
and the stable mailing list added to Cc?

> 
> So, yes to fixing the DT, and no to *only* fixing the DT.

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [PATCH v2] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-29 Thread Enric Balletbo i Serra
Hi Chanwoo,

On 29/6/20 13:29, Chanwoo Choi wrote:
> Hi Enric and Marc,
> 
> On 6/29/20 8:05 PM, Enric Balletbo i Serra wrote:
>> Hi Chanwoo and Marc,
>>
>> On 29/6/20 13:09, Chanwoo Choi wrote:
>>> Hi Enric,
>>>
>>> Could you check this issue? Your patch[1] causes this issue.
>>> As Marc mentioned, although rk3399-dmc.c handled 'rockchip,pmu'
>>> as the mandatory property, your patch[1] didn't add the 'rockchip,pmu'
>>> property to the documentation. 
>>>
>>
>> I think the problem is that the DT binding patch, for some reason, was missed
>> and didn't land. The patch seems to have all the required reviews and acks.
>>
>>   https://patchwork.kernel.org/patch/10901593/
>>
>> Sorry because I didn't notice this issue when 9173c5ceb035 landed. And thanks
>> for fixing the issue.
> 
> If the 'rockchip,pmu' property is mandatory, instead of Marc's patch,
> we had better ask the DT maintainer to merge patch[1].
> 
> [1] https://patchwork.kernel.org/patch/10901593/
> 

Give me some time to double check, because I think that, at this point, it is
needed on some devices with old firmware but not now. It's been a while since I
worked on this, but I suspect that being optional is the right way.

Maybe Heiko, who IIRC worked on TF-A has a more clear thought on this?

Thanks,
 Enric

>>
>> Best regards,
>>  Enric
>>
>>> [1] 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT
>>> and auto power down parameters to TF-A.")
>>>
>>>
>>> On 6/29/20 5:18 PM, Marc Zyngier wrote:
>>>> Hi Chanwoo,
>>>>
>>>> On Mon, 29 Jun 2020 03:43:37 +0100,
>>>> Chanwoo Choi  wrote:
>>>>>
>>>>> Hi Marc,
>>>>>
>>>>> On 6/23/20 12:28 AM, Marc Zyngier wrote:
>>>>
>>>> [...]
>>>>
>>>>> It looks good to me. But, I think that it is not necessary
>>>>> fully kernel panic log about NULL pointer. It is enough
>>>>> just mentioning the NULL pointer issue without full kernel panic log.
>>>>
>>>> I personally find the backtrace useful as it allows people with the
>>>> same issue to trawl the kernel log and find whether it has already been
>>>> fixed upstream. But it's only me, and I'm not attached to it.
>>>>
>>>>> So, how about editing the patch description as following or others simply?
>>>>> and we need to add 'sta...@vger.kernel.org' to Cc list for applying it
>>>>> to stable branch.
>>>>
>>>> Looks good to me.
>>>>
>>>>>
>>>>>
>>>>>   PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent
>>>>>
>>>>> Booting a recent kernel on a rk3399-based system (nanopc-t4),
>>>>> equipped with a recent u-boot and ATF results in the kernel panic
>>>>> about NULL pointer issue.
>>>>
>>>> nit: "results in a kernel panic on dereferencing a NULL pointer".
>>>>
>>>>>
>>>>> This turns out to be due to the rk3399-dmc driver looking for
>>>>> an *undocumented* property (rockchip,pmu), and happily using
>>>>> a NULL pointer when the property isn't there.
>>>>>
>>>>> Instead, make most of what was brought in with 9173c5ceb035
>>>>> ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
>>>>> to TF-A.") conditioned on finding this property in the device-tree,
>>>>> preventing the driver from exploding.
>>>>>
>>>>> Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto 
>>>>> power down parameters to TF-A.")
>>>>> Signed-off-by: Marc Zyngier 
>>>>> Signed-off-by: Chanwoo Choi 
>>>>
>>>>
>>>> Note that the biggest issue is still there: the driver is using an
>>>> undocumented property, and this patch is just papering over it.
>>>> Since I expect this property to be useful for something, it would be
>>>> good for whoever knows what it does to document it.
>>>
>>> Hi Marc,
>>>
>>> You are right. We have to do two steps:
>>> 1. Add missing explanation of 'rockchip,pmu' property to dt-binding document
>>> 2. If possible, add 'rockchip,pmu' property node to rk3399_dmc dt node.
>>>
>>> When I tried to 

Re: [PATCH v2] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-29 Thread Chanwoo Choi
Hi Enric and Marc,

On 6/29/20 8:05 PM, Enric Balletbo i Serra wrote:
> Hi Chanwoo and Marc,
> 
> On 29/6/20 13:09, Chanwoo Choi wrote:
>> Hi Enric,
>>
>> Could you check this issue? Your patch[1] causes this issue.
>> As Marc mentioned, although rk3399-dmc.c handled 'rockchip,pmu'
>> as the mandatory property, your patch[1] didn't add the 'rockchip,pmu'
>> property to the documentation. 
>>
> 
> I think the problem is that the DT binding patch, for some reason, was missed
> and didn't land. The patch seems to have all the required reviews and acks.
> 
>   https://patchwork.kernel.org/patch/10901593/
> 
> Sorry because I didn't notice this issue when 9173c5ceb035 landed. And thanks
> for fixing the issue.

If the 'rockchip,pmu' property is mandatory, instead of Marc's patch,
we had better ask the DT maintainer to merge patch[1].

[1] https://patchwork.kernel.org/patch/10901593/

> 
> Best regards,
>  Enric
> 
>> [1] 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT
>> and auto power down parameters to TF-A.")
>>
>>
>> On 6/29/20 5:18 PM, Marc Zyngier wrote:
>>> Hi Chanwoo,
>>>
>>> On Mon, 29 Jun 2020 03:43:37 +0100,
>>> Chanwoo Choi  wrote:
>>>>
>>>> Hi Marc,
>>>>
>>>> On 6/23/20 12:28 AM, Marc Zyngier wrote:
>>>
>>> [...]
>>>
>>>> It looks good to me. But, I think that it is not necessary
>>>> fully kernel panic log about NULL pointer. It is enough
>>>> just mentioning the NULL pointer issue without full kernel panic log.
>>>
>>> I personally find the backtrace useful as it allows people with the
>>> same issue to trawl the kernel log and find whether it has already been
>>> fixed upstream. But it's only me, and I'm not attached to it.
>>>
>>>> So, how about editing the patch description as following or others simply?
>>>> and we need to add 'sta...@vger.kernel.org' to Cc list for applying it
>>>> to stable branch.
>>>
>>> Looks good to me.
>>>
>>>>
>>>>
>>>>   PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent
>>>>
>>>> Booting a recent kernel on a rk3399-based system (nanopc-t4),
>>>> equipped with a recent u-boot and ATF results in the kernel panic
>>>> about NULL pointer issue.
>>>
>>> nit: "results in a kernel panic on dereferencing a NULL pointer".
>>>
>>>>
>>>> This turns out to be due to the rk3399-dmc driver looking for
>>>> an *undocumented* property (rockchip,pmu), and happily using
>>>> a NULL pointer when the property isn't there.
>>>>
>>>> Instead, make most of what was brought in with 9173c5ceb035
>>>> ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
>>>> to TF-A.") conditioned on finding this property in the device-tree,
>>>> preventing the driver from exploding.
>>>>
>>>> Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto 
>>>> power down parameters to TF-A.")
>>>> Signed-off-by: Marc Zyngier 
>>>> Signed-off-by: Chanwoo Choi 
>>>
>>>
>>> Note that the biggest issue is still there: the driver is using an
>>> undocumented property, and this patch is just papering over it.
>>> Since I expect this property to be useful for something, it would be
>>> good for whoever knows what it does to document it.
>>
>> Hi Marc,
>>
>> You are right. We have to do two step:
>> 1. Add missing explanation of 'rockchip,pmu' property to dt-binding document
>> 2. If possible, add 'rockchip,pmu' property node to rk3399_dmc dt node.
>>
>> When I tried to find usage example of 'rockchip,pmu' property,
>> I found them as following: The 'rockchip,pmu' property[2] indicates
>> 'PMU (Power Management Unit)'. 
>>
>> $ grep -rn "rockchip,pmu" arch/arm64/boot/dts/
>> arch/arm64/boot/dts/rockchip/px30.dtsi:1211: rockchip,pmu = 
>> <>;
>> arch/arm64/boot/dts/rockchip/rk3399.dtsi:1909:   rockchip,pmu = 
>> <>;
>> arch/arm64/boot/dts/rockchip/rk3368.dtsi:807:rockchip,pmu = 
>> <>;
>>
>> [2] the description of 'rockchip,pmu' property
>> - 
>> https://protect2.fireeye.com/url?k=e55f0ba3-b8384f85-e55e80ec-0cc47a31384a-d9c5f6b28aba9be6=1=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv5.7.2%2Fs

Re: [PATCH v2] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-29 Thread Chanwoo Choi
Hi Enric,

Could you check this issue? It is caused by your patch[1].
As Marc mentioned, although rk3399_dmc.c treats 'rockchip,pmu'
as a mandatory property, your patch[1] didn't add the 'rockchip,pmu'
property to the documentation.

[1] 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT
and auto power down parameters to TF-A.")


On 6/29/20 5:18 PM, Marc Zyngier wrote:
> Hi Chanwoo,
> 
> On Mon, 29 Jun 2020 03:43:37 +0100,
> Chanwoo Choi  wrote:
>>
>> Hi Marc,
>>
>> On 6/23/20 12:28 AM, Marc Zyngier wrote:
> 
> [...]
> 
>> It looks good to me. But, I think that it is not necessary
>> fully kernel panic log about NULL pointer. It is enough
>> just mentioning the NULL pointer issue without full kernel panic log.
> 
> I personally find the backtrace useful as it allows people with the
> same issue to trawl the kernel log and find whether it has already be
> fixed upstream. But it's only me, and I'm not attached to it.
> 
>> So, how about editing the patch description as following or others simply?
>> and we need to add 'sta...@vger.kernel.org' to Cc list for applying it
>> to stable branch.
> 
> Looks good to me.
> 
>>
>>
>>   PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent
>>
>> Booting a recent kernel on a rk3399-based system (nanopc-t4),
>> equipped with a recent u-boot and ATF results in the kernel panic
>> about NULL pointer issue.
> 
> nit: "results in a kernel panic on dereferencing a NULL pointer".
> 
>>
>> This turns out to be due to the rk3399-dmc driver looking for
>> an *undocumented* property (rockchip,pmu), and happily using
>> a NULL pointer when the property isn't there.
>>
>> Instead, make most of what was brought in with 9173c5ceb035
>> ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
>> to TF-A.") conditioned on finding this property in the device-tree,
>> preventing the driver from exploding.
>>
>> Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto power 
>> down parameters to TF-A.")
>> Signed-off-by: Marc Zyngier 
>> Signed-off-by: Chanwoo Choi 
> 
> 
> Note that the biggest issue is still there: the driver is using an
> undocumented property, and this patch is just papering over it.
> Since I expect this property to be useful for something, it would be
> good for whoever knows what it does to document it.

Hi Marc,

You are right. We have to take two steps:
1. Add the missing description of the 'rockchip,pmu' property to the dt-binding document.
2. If possible, add the 'rockchip,pmu' property to the rk3399_dmc dt node.

When I looked for usage examples of the 'rockchip,pmu' property,
I found the following. The 'rockchip,pmu' property[2] refers to the
'PMU (Power Management Unit)'.

$ grep -rn "rockchip,pmu" arch/arm64/boot/dts/
arch/arm64/boot/dts/rockchip/px30.dtsi:1211:rockchip,pmu = 
<>;
arch/arm64/boot/dts/rockchip/rk3399.dtsi:1909:  rockchip,pmu = 
<>;
arch/arm64/boot/dts/rockchip/rk3368.dtsi:807:   rockchip,pmu = 
<>;

[2] the description of 'rockchip,pmu' property
- 
https://elixir.bootlin.com/linux/v5.7.2/source/Documentation/devicetree/bindings/pinctrl/rockchip,pinctrl.txt#L40


If I don't receive any reply, I'll add the following:

cwchoi00@chan-linux-pc:~/kernel/git.kernel/linux.chanwoo$ d
diff --git a/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
index 0ec68141f85a..161e60ea874b 100644
--- a/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
+++ b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
@@ -18,6 +18,8 @@ Optional properties:
 format depends on the interrupt controller.
 It should be a DCF interrupt. When DDR DVFS finishes
 a DCF interrupt is triggered.
+- rockchip,pmu: Phandle to the syscon managing the "pmu general
+register files".
 
 Following properties relate to DDR timing:
 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [PATCH v2] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-29 Thread Enric Balletbo i Serra
Hi Chanwoo and Marc,

On 29/6/20 13:09, Chanwoo Choi wrote:
> Hi Enric,
> 
> Could you check this issue? Your patch[1] causes this issue.
> As Marc mentioned, although rk3399-dmc.c handled 'rockchip,pmu'
> as the mandatory property, your patch[1] didn't add the 'rockchip,pmu'
> property to the documentation. 
> 

I think the problem is that the DT binding patch, for some reason, was missed
and didn't land. The patch seems to have all the required reviews and acks.

  https://patchwork.kernel.org/patch/10901593/

Sorry that I didn't notice this issue when 9173c5ceb035 landed, and thanks
for fixing it.

Best regards,
 Enric

> [1] 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT
> and auto power down parameters to TF-A.")
> 
> 
> On 6/29/20 5:18 PM, Marc Zyngier wrote:
>> Hi Chanwoo,
>>
>> On Mon, 29 Jun 2020 03:43:37 +0100,
>> Chanwoo Choi  wrote:
>>>
>>> Hi Marc,
>>>
>>> On 6/23/20 12:28 AM, Marc Zyngier wrote:
>>
>> [...]
>>
>>> It looks good to me. But, I think that it is not necessary
>>> fully kernel panic log about NULL pointer. It is enough
>>> just mentioning the NULL pointer issue without full kernel panic log.
>>
>> I personally find the backtrace useful as it allows people with the
>> same issue to trawl the kernel log and find whether it has already be
>> fixed upstream. But it's only me, and I'm not attached to it.
>>
>>> So, how about editing the patch description as following or others simply?
>>> and we need to add 'sta...@vger.kernel.org' to Cc list for applying it
>>> to stable branch.
>>
>> Looks good to me.
>>
>>>
>>>
>>>   PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent
>>>
>>> Booting a recent kernel on a rk3399-based system (nanopc-t4),
>>> equipped with a recent u-boot and ATF results in the kernel panic
>>> about NULL pointer issue.
>>
>> nit: "results in a kernel panic on dereferencing a NULL pointer".
>>
>>>
>>> This turns out to be due to the rk3399-dmc driver looking for
>>> an *undocumented* property (rockchip,pmu), and happily using
>>> a NULL pointer when the property isn't there.
>>>
>>> Instead, make most of what was brought in with 9173c5ceb035
>>> ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
>>> to TF-A.") conditioned on finding this property in the device-tree,
>>> preventing the driver from exploding.
>>>
>>> Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto power 
>>> down parameters to TF-A.")
>>> Signed-off-by: Marc Zyngier 
>>> Signed-off-by: Chanwoo Choi 
>>
>>
>> Note that the biggest issue is still there: the driver is using an
>> undocumented property, and this patch is just papering over it.
>> Since I expect this property to be useful for something, it would be
>> good for whoever knows what it does to document it.
> 
> Hi Marc,
> 
> You are right. We have to do two step:
> 1. Add missing explanation of 'rockchip,pmu' property to dt-binding document
> 2. If possible, add 'rockchip,pmu' property node to rk3399_dmc dt node.
> 
> When I tried to find usage example of 'rockchip,pmu' property,
> I found them as following: The 'rockchip,pmu' property[2] indicates
> 'PMU (Power Management Unit)'. 
> 
> $ grep -rn "rockchip,pmu" arch/arm64/boot/dts/
> arch/arm64/boot/dts/rockchip/px30.dtsi:1211:  rockchip,pmu = 
> <>;
> arch/arm64/boot/dts/rockchip/rk3399.dtsi:1909:rockchip,pmu = 
> <>;
> arch/arm64/boot/dts/rockchip/rk3368.dtsi:807: rockchip,pmu = 
> <>;
> 
> [2] the description of 'rockchip,pmu' property
> - 
> https://elixir.bootlin.com/linux/v5.7.2/source/Documentation/devicetree/bindings/pinctrl/rockchip,pinctrl.txt#L40
> 
> 
> If don't receive the any reply, I'll add as following:
> 
> cwchoi00@chan-linux-pc:~/kernel/git.kernel/linux.chanwoo$ d
> diff --git a/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> index 0ec68141f85a..161e60ea874b 100644
> --- a/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> +++ b/Documentation/devicetree/bindings/devfreq/rk3399_dmc.txt
> @@ -18,6 +18,8 @@ Optional properties:
>  format depends on the interrupt controller.
>  It should be a DCF interrupt. When DDR DVFS finishes
>  a DCF interrupt is triggered.
> +- rockchip,pmu: Phandle to the syscon managing the "pmu general
> +register files".
>  
>  Following properties relate to DDR timing:
>  
> 
> 


Re: [PATCH v2] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-29 Thread Chanwoo Choi
Hi Enric,

On 6/29/20 8:26 PM, Enric Balletbo i Serra wrote:
> Hi Chanwoo,
> 
> On 29/6/20 13:29, Chanwoo Choi wrote:
>> Hi Enric and Mark,
>>
>> On 6/29/20 8:05 PM, Enric Balletbo i Serra wrote:
>>> Hi Chanwoo and Marc,
>>>
>>> On 29/6/20 13:09, Chanwoo Choi wrote:
>>>> Hi Enric,
>>>>
>>>> Could you check this issue? Your patch[1] causes this issue.
>>>> As Marc mentioned, although rk3399-dmc.c handled 'rockchip,pmu'
>>>> as the mandatory property, your patch[1] didn't add the 'rockchip,pmu'
>>>> property to the documentation. 
>>>>
>>>
>>> I think the problem is that the DT binding patch, for some reason, was missed
>>> and didn't land. The patch seems to have all the required reviews and acks.
>>>
>>>   https://patchwork.kernel.org/patch/10901593/
>>>
>>> Sorry because I didn't notice this issue when 9173c5ceb035 landed. And thanks
>>> for fixing the issue.
>>
>> If the 'rockchip,pmu' propery is mandatory, instead of Mark's patch,
>> we better to require the merge of patch[1] to DT maintainer.
>>
>> [1] https://patchwork.kernel.org/patch/10901593/
>>
> 
> Give me some time to double check, because I think that, at this point, it is
> needed on some devices with old firmware but not on current ones. It's been a
> while since I worked on this, but I suspect that making it optional is the
> right way.

OK. Thanks for your reply.

> 
> Maybe Heiko, who IIRC worked on TF-A has a more clear thought on this?
> 
> Thanks,
>  Enric
> 
>>>
>>> Best regards,
>>>  Enric
>>>
>>>> [1] 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT
>>>> and auto power down parameters to TF-A.")
>>>>
>>>>
>>>> On 6/29/20 5:18 PM, Marc Zyngier wrote:
>>>>> Hi Chanwoo,
>>>>>
>>>>> On Mon, 29 Jun 2020 03:43:37 +0100,
>>>>> Chanwoo Choi  wrote:
>>>>>>
>>>>>> Hi Marc,
>>>>>>
>>>>>> On 6/23/20 12:28 AM, Marc Zyngier wrote:
>>>>>
>>>>> [...]
>>>>>
>>>>>> It looks good to me. But, I think that it is not necessary
>>>>>> fully kernel panic log about NULL pointer. It is enough
>>>>>> just mentioning the NULL pointer issue without full kernel panic log.
>>>>>
>>>>> I personally find the backtrace useful as it allows people with the
>>>>> same issue to trawl the kernel log and find whether it has already be
>>>>> fixed upstream. But it's only me, and I'm not attached to it.
>>>>>
>>>>>> So, how about editing the patch description as following or others simply?
>>>>>> and we need to add 'sta...@vger.kernel.org' to Cc list for applying it
>>>>>> to stable branch.
>>>>>
>>>>> Looks good to me.
>>>>>
>>>>>>
>>>>>>
>>>>>>   PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent
>>>>>>
>>>>>> Booting a recent kernel on a rk3399-based system (nanopc-t4),
>>>>>> equipped with a recent u-boot and ATF results in the kernel panic
>>>>>> about NULL pointer issue.
>>>>>
>>>>> nit: "results in a kernel panic on dereferencing a NULL pointer".
>>>>>
>>>>>>
>>>>>> This turns out to be due to the rk3399-dmc driver looking for
>>>>>> an *undocumented* property (rockchip,pmu), and happily using
>>>>>> a NULL pointer when the property isn't there.
>>>>>>
>>>>>> Instead, make most of what was brought in with 9173c5ceb035
>>>>>> ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
>>>>>> to TF-A.") conditioned on finding this property in the device-tree,
>>>>>> preventing the driver from exploding.
>>>>>>
>>>>>> Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto 
>>>>>> power down parameters to TF-A.")
>>>>>> Signed-off-by: Marc Zyngier 
>>>>>> Signed-off-by: Chanwoo Choi 
>>>>>
>>>>>
>>>>> Note that the biggest issue is still there: the driver 

Re: [PATCH v2] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-29 Thread Marc Zyngier
Hi Chanwoo,

On Mon, 29 Jun 2020 03:43:37 +0100,
Chanwoo Choi  wrote:
> 
> Hi Marc,
> 
> On 6/23/20 12:28 AM, Marc Zyngier wrote:

[...]

> It looks good to me. But, I think that it is not necessary
> fully kernel panic log about NULL pointer. It is enough
> just mentioning the NULL pointer issue without full kernel panic log.

I personally find the backtrace useful as it allows people with the
same issue to trawl the kernel log and find whether it has already been
fixed upstream. But it's only me, and I'm not attached to it.

> So, how about editing the patch description as following or others simply?
> and we need to add 'sta...@vger.kernel.org' to Cc list for applying it
> to stable branch.

Looks good to me.

> 
> 
>   PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent
> 
> Booting a recent kernel on a rk3399-based system (nanopc-t4),
> equipped with a recent u-boot and ATF results in the kernel panic
> about NULL pointer issue.

nit: "results in a kernel panic on dereferencing a NULL pointer".

> 
> This turns out to be due to the rk3399-dmc driver looking for
> an *undocumented* property (rockchip,pmu), and happily using
> a NULL pointer when the property isn't there.
>
> Instead, make most of what was brought in with 9173c5ceb035
> ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
> to TF-A.") conditioned on finding this property in the device-tree,
> preventing the driver from exploding.
> 
> Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto power 
> down parameters to TF-A.")
> Signed-off-by: Marc Zyngier 
> Signed-off-by: Chanwoo Choi 


Note that the biggest issue is still there: the driver is using an
undocumented property, and this patch is just papering over it.
Since I expect this property to be useful for something, it would be
good for whoever knows what it does to document it.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v2] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-29 Thread Marc Zyngier

On 2020-06-29 12:29, Chanwoo Choi wrote:
> Hi Enric and Mark,
>
> On 6/29/20 8:05 PM, Enric Balletbo i Serra wrote:
>> Hi Chanwoo and Marc,
>>
>> On 29/6/20 13:09, Chanwoo Choi wrote:
>>> Hi Enric,
>>>
>>> Could you check this issue? Your patch[1] causes this issue.
>>> As Marc mentioned, although rk3399-dmc.c handled 'rockchip,pmu'
>>> as the mandatory property, your patch[1] didn't add the 'rockchip,pmu'
>>> property to the documentation.
>>
>> I think the problem is that the DT binding patch, for some reason, was missed
>> and didn't land. The patch seems to have all the required reviews and acks.
>>
>>   https://patchwork.kernel.org/patch/10901593/
>>
>> Sorry because I didn't notice this issue when 9173c5ceb035 landed. And thanks
>> for fixing the issue.
>
> If the 'rockchip,pmu' propery is mandatory, instead of Mark's patch,
> we better to require the merge of patch[1] to DT maintainer.

It is way too late. Firmware exists (mainline u-boot, for one) that
does not expose the new property, and you can't demand that people
upgrade. This is an ABI bug, and we now have to live with it.

So, yes to fixing the DT, and no to *only* fixing the DT.

Thanks,

M.
--
Jazz is not dead. It just smells funny...
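
Marc's point above (fix the DT, but also keep the driver working with firmware that does not expose 'rockchip,pmu') amounts to treating the property as optional at probe time. The following is only a userspace sketch of that idea, not the driver code: every type and function here (fake_regmap, fake_node, fake_lookup, fake_probe, the ddr_type field) is a hypothetical stand-in invented for illustration, and none of it is the real kernel API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a syscon regmap; not the kernel type. */
struct fake_regmap {
	unsigned int os_reg2;	/* pretend register holding the DDR type */
};

/* Hypothetical device-tree node carrying at most one phandle property. */
struct fake_node {
	const char *prop_name;		/* NULL when the node has no property */
	struct fake_regmap *target;	/* what the property points at */
};

/* Hypothetical driver state. */
struct fake_dmcfreq {
	struct fake_regmap *regmap_pmu;	/* NULL when "rockchip,pmu" is absent */
	unsigned int ddr_type;
	bool dram_init_done;
};

/* Return the regmap behind the named property, or NULL if it is missing. */
static struct fake_regmap *fake_lookup(const struct fake_node *np,
				       const char *name)
{
	if (np->prop_name && strcmp(np->prop_name, name) == 0)
		return np->target;
	return NULL;
}

/* Probe that tolerates a missing property instead of dereferencing NULL. */
int fake_probe(struct fake_dmcfreq *data, const struct fake_node *np)
{
	data->regmap_pmu = fake_lookup(np, "rockchip,pmu");

	/* Only read PMU registers when the property was actually found. */
	if (data->regmap_pmu)
		data->ddr_type = data->regmap_pmu->os_reg2;

	/* Initialisation that does not need the PMU runs in both cases. */
	data->dram_init_done = true;
	return 0;
}
```

With a node that lacks the property, fake_probe() still succeeds and simply leaves regmap_pmu as NULL, which is the behaviour old firmware needs.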


Re: [PATCH v2] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-28 Thread Chanwoo Choi
e.
> 
> Instead, make most of what was brought in with 9173c5ceb035
> ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
> to TF-A.") conditioned on finding this property in the device-tree,
> preventing the driver from exploding.
> 
> Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down 
> parameters to TF-A.")
> Signed-off-by: Marc Zyngier 
> ---
>  drivers/devfreq/rk3399_dmc.c | 42 
>  1 file changed, 23 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
> index 24f04f78285b..027769e39f9b 100644
> --- a/drivers/devfreq/rk3399_dmc.c
> +++ b/drivers/devfreq/rk3399_dmc.c
> @@ -95,18 +95,20 @@ static int rk3399_dmcfreq_target(struct device *dev, 
> unsigned long *freq,
>  
>   mutex_lock(>lock);
>  
> - if (target_rate >= dmcfreq->odt_dis_freq)
> - odt_enable = true;
> -
> - /*
> -  * This makes a SMC call to the TF-A to set the DDR PD (power-down)
> -  * timings and to enable or disable the ODT (on-die termination)
> -  * resistors.
> -  */
> - arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
> -   dmcfreq->odt_pd_arg1,
> -   ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
> -   odt_enable, 0, 0, 0, );
> + if (dmcfreq->regmap_pmu) {
> + if (target_rate >= dmcfreq->odt_dis_freq)
> + odt_enable = true;
> +
> + /*
> +  * This makes a SMC call to the TF-A to set the DDR PD
> +  * (power-down) timings and to enable or disable the
> +  * ODT (on-die termination) resistors.
> +  */
> + arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, dmcfreq->odt_pd_arg0,
> +   dmcfreq->odt_pd_arg1,
> +   ROCKCHIP_SIP_CONFIG_DRAM_SET_ODT_PD,
> +   odt_enable, 0, 0, 0, );
> + }
>  
>   /*
>* If frequency scaling from low to high, adjust voltage first.
> @@ -371,13 +373,14 @@ static int rk3399_dmcfreq_probe(struct platform_device 
> *pdev)
>   }
>  
>   node = of_parse_phandle(np, "rockchip,pmu", 0);
> - if (node) {
> - data->regmap_pmu = syscon_node_to_regmap(node);
> - of_node_put(node);
> - if (IS_ERR(data->regmap_pmu)) {
> - ret = PTR_ERR(data->regmap_pmu);
> - goto err_edev;
> - }
> + if (!node)
> + goto no_pmu;
> +
> + data->regmap_pmu = syscon_node_to_regmap(node);
> + of_node_put(node);
> + if (IS_ERR(data->regmap_pmu)) {
> + ret = PTR_ERR(data->regmap_pmu);
> + goto err_edev;
>   }
>  
>   regmap_read(data->regmap_pmu, RK3399_PMUGRF_OS_REG2, );
> @@ -399,6 +402,7 @@ static int rk3399_dmcfreq_probe(struct platform_device 
> *pdev)
>   goto err_edev;
>   };
>  
> +no_pmu:
>   arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, 0, 0,
> ROCKCHIP_SIP_CONFIG_DRAM_INIT,
> 0, 0, 0, 0, );
> 

It looks good to me. But I don't think the full kernel panic log
about the NULL pointer is necessary. It is enough to just mention
the NULL pointer issue without the full kernel panic log.

So, how about editing the patch description as follows, or something
similarly simple? We also need to add 'sta...@vger.kernel.org' to the
Cc list so that it is applied to the stable branch.


  PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

Booting a recent kernel on a rk3399-based system (nanopc-t4),
equipped with a recent u-boot and ATF results in the kernel panic
about NULL pointer issue.

This turns out to be due to the rk3399-dmc driver looking for
an *undocumented* property (rockchip,pmu), and happily using
a NULL pointer when the property isn't there.

Instead, make most of what was brought in with 9173c5ceb035
("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
to TF-A.") conditioned on finding this property in the device-tree,
preventing the driver from exploding.

Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto power 
down parameters to TF-A.")
Signed-off-by: Marc Zyngier 
Signed-off-by: Chanwoo Choi 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics
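
The first hunk of the quoted patch wraps the ODT call in rk3399_dmcfreq_target() in a check on dmcfreq->regmap_pmu. A minimal userspace sketch of that guard follows, under loud assumptions: sketch_dmcfreq, sketch_set_odt and sketch_target are hypothetical names, the SMC call is replaced by a counter, and the frequency values used with it are made up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical driver state; only the fields the guard needs. */
struct sketch_dmcfreq {
	const void *regmap_pmu;		/* NULL when "rockchip,pmu" was absent */
	unsigned long odt_dis_freq;	/* threshold compared against the target rate */
	unsigned long rate;		/* last rate that was requested */
	int odt_calls;			/* how often the ODT path ran */
	bool last_odt_enable;		/* argument of the last ODT call */
};

/* Stand-in for the firmware call that configures ODT; it only records. */
static void sketch_set_odt(struct sketch_dmcfreq *d, bool enable)
{
	d->odt_calls++;
	d->last_odt_enable = enable;
}

/* Frequency change that skips the ODT setup when no PMU regmap exists. */
int sketch_target(struct sketch_dmcfreq *d, unsigned long target_rate)
{
	bool odt_enable = false;

	if (d->regmap_pmu) {
		if (target_rate >= d->odt_dis_freq)
			odt_enable = true;
		sketch_set_odt(d, odt_enable);
	}

	/* The rate change itself does not depend on the PMU. */
	d->rate = target_rate;
	return 0;
}
```

The point of the shape is that the optional feature is skipped as a whole, while the frequency change still goes ahead.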


Re: [PATCH] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-23 Thread Marc Zyngier

On 2020-06-23 09:55, Heiko Stübner wrote:
> Am Montag, 22. Juni 2020, 17:07:52 CEST schrieb Marc Zyngier:

[...]

>> maz@fine-girl:~$ sudo dtc -I dtb /sys/firmware/fdt 2>/dev/null | grep -A 5 dmc
>> dmc {
>> u-boot,dm-pre-reloc;
>> compatible = "rockchip,rk3399-dmc";
>> devfreq-events = <0xc8>;
>>
>> [followed by a ton of timings...]
>>
>> It is definitely coming from u-boot (I don't provide any DTB otherwise,
>> and you can find the corresponding node and timings in the u-boot tree).
>
> which is probably the source of the problem :-) .
>
> I'm pretty sure the "reviewed" binding in the kernel doesn't match the
> dt-nodes used in uboot.

and the driver doesn't match the binding either. Frankly, this is badly
messed up.

> While u-boot these days syncs the main devicetrees from Linux, the memory
> setup stuff is pretty specific to uboot (and lives in separate dtsi files).
>
> And I guess you're the only one feeding uboot's dtb to Linux directly, hence
> nobody else did encounter this before ;-) .

I'm not "feeding" it directly. I'm using the expected DT distribution
mechanism, which is the boot firmware. Nobody should ever have to provide
their own DT to the kernel.
Thanks,

M. (starting to like ACPI more and more every day)
--
Jazz is not dead. It just smells funny...


Re: [PATCH] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-23 Thread Heiko Stübner
Am Montag, 22. Juni 2020, 17:07:52 CEST schrieb Marc Zyngier:
> Hi Heiko,
> 
> On 2020-06-22 14:54, Heiko Stübner wrote:
> > Hi Marc,
> > 
> > Am Montag, 22. Juni 2020, 15:31:55 CEST schrieb Marc Zyngier:
> >> On Sat, 13 Jun 2020 11:24:35 +0100
> >> Marc Zyngier  wrote:
> >> 
> >> > Booting a recent kernel on a rk3399-based system (nanopc-t4),
> >> > equipped with a recent u-boot and ATF results in the following:
> >> >
> >> > [5.607431] Unable to handle kernel NULL pointer dereference at 
> >> > virtual address 01e4
> >> > [5.608219] Mem abort info:
> >> > [5.608469]   ESR = 0x9604
> >> > [5.608749]   EC = 0x25: DABT (current EL), IL = 32 bits
> >> > [5.609223]   SET = 0, FnV = 0
> >> > [5.609600]   EA = 0, S1PTW = 0
> >> > [5.609891] Data abort info:
> >> > [5.610149]   ISV = 0, ISS = 0x0004
> >> > [5.610489]   CM = 0, WnR = 0
> >> > [5.610757] user pgtable: 4k pages, 48-bit VAs, pgdp=e62fb000
> >> > [5.611326] [01e4] pgd=, 
> >> > p4d=
> >> > [5.611931] Internal error: Oops: 9604 [#1] SMP
> >> > [5.612363] Modules linked in: rockchip_thermal(E+) rk3399_dmc(E+) 
> >> > soundcore(E) dw_wdt(E) rockchip_dfi(E) nvmem_rockchip_efuse(E) 
> >> > pwm_rockchip(E) cfg80211(E+) rockchip_saradc(E) industrialio(E) 
> >> > rfkill(E) cpufreq_dt(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) 
> >> > crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) realtek(E) nvme(E) 
> >> > nvme_core(E) t10_pi(E) xhci_plat_hcd(E) xhci_hcd(E) rtc_rk808(E) 
> >> > rk808_regulator(E) clk_rk808(E) dwc3(E) udc_core(E) roles(E) ulpi(E) 
> >> > rk808(E)
> >> fan53555(E) rockchipdrm(E) analogix_dp(E) dw_hdmi(E) cec(E)
> >> dw_mipi_dsi(E) fixed(E) dwc3_of_simple(E) phy_rockchip_emmc(E)
> >> gpio_keys(E) drm_kms_helper(E) phy_rockchip_inno_usb2(E)
> >> ehci_platform(E) dwmac_rk(E) stmmac_platform(E) phy_rockchip_pcie(E)
> >> ohci_platform(E) ohci_hcd(E) rockchip_io_domain(E) stmmac(E)
> >> phy_rockchip_typec(E) ehci_hcd(E) sdhci_of_arasan(E) mdio_xpcs(E)
> >> sdhci_pltfm(E) cqhci(E) drm(E) sdhci(E) phylink(E) of_mdio(E)
> >> usbcore(E) i2c_rk3x(E) dw_mmc_rockchip(E) dw_mmc_pltfm(E) dw_mmc(E)
> >> fixed_phy(E) libphy(E)
> >> > [5.612454]  pl330(E)
> >> > [5.620255] CPU: 1 PID: 270 Comm: systemd-udevd Tainted: G
> >> > E 5.7.0-13692-g83ae758d8b22 #1157
> >> > [5.621110] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 
> >> > 2020.07-rc4-00023-g10d4cafe0f 06/10/2020
> >> > [5.621947] pstate: 4005 (nZcv daif -PAN -UAO BTYPE=--)
> >> > [5.622446] pc : regmap_read+0x1c/0x80
> >> > [5.622787] lr : rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
> >> > [5.623299] sp : 8000126cb8a0
> >> > [5.623594] x29: 8000126cb8a0 x28: 8000126cbdb0
> >> > [5.624063] x27: f22dac40 x26: f6779800
> >> > [5.624533] x25: f6779810 x24: ffea
> >> > [5.625002] x23: ffea x22: f65b74c8
> >> > [5.625471] x21: f783ca08 x20: f65b7480
> >> > [5.625941] x19:  x18: 0001
> >> > [5.626410] x17:  x16: 
> >> > [5.626878] x15: f22db138 x14: 
> >> > [5.627347] x13: 0018 x12: 80001106a8c7
> >> > [5.627817] x11: 0003 x10: 0101010101010101
> >> > [5.627861] systemd[1]: Found device SPCC M.2 PCIE SSD 3.
> >> > [5.628286] x9 : 88d7c89c x8 : 7f7f7f7f7f7f7f7f
> >> > [5.629238] x7 : fefefeff646c606d x6 : 1c0e0e0ee3e8e9f0
> >> > [5.629709] x5 : 706968630e0e0e1c x4 : 80808080
> >> > [5.630178] x3 : 937b1b5b1b434b80 x2 : 8000126cb944
> >> > [5.630648] x1 : 0308 x0 : 
> >> > [5.631119] Call trace:
> >> > [5.631346]  regmap_read+0x1c/0x80
> >> > [5.631654]  rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
> >> > [5.632142]  platform_drv_probe+0x5c/0xb0
> >> > [5.632500]  really_probe+0xe4/0x448
> >> > [5.632819]  driver_probe_device+0xfc/0x168
> >> > [5.633191]  device_driver_attach+0x7c/0x88
> >> > [5.633567]  __driver_attach+0xac/0x178
> >> > [5.633914]  bus_for_each_dev+0x78/0xc8
> >> > [5.634261]  driver_attach+0x2c/0x38
> >> > [5.634582]  bus_add_driver+0x14c/0x230
> >> > [5.634925]  driver_register+0x6c/0x128
> >> > [5.635269]  __platform_driver_register+0x50/0x60
> >> > [5.635692]  rk3399_dmcfreq_driver_init+0x2c/0x1000 [rk3399_dmc]
> >> > [5.636226]  do_one_initcall+0x50/0x230
> >> > [5.636569]  do_init_module+0x60/0x248
> >> > [5.636902]  load_module+0x21f8/0x28d8
> >> > [5.637237]  __do_sys_finit_module+0xb0/0x118
> >> > [5.637627]  __arm64_sys_finit_module+0x28/0x38
> >> > [5.638031]  el0_svc_common.constprop.0+0x7c/0x1f8
> >> > [5.638456]  do_el0_svc+0x2c/0x98
> >> > [5.638754]  el0_svc+0x18/0x48
> >> > [5.639029]  

[PATCH v2] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-22 Thread Marc Zyngier
Booting a recent kernel on a rk3399-based system (nanopc-t4),
equipped with a recent u-boot and ATF results in the following:

[5.607431] Unable to handle kernel NULL pointer dereference at virtual 
address 01e4
[5.608219] Mem abort info:
[5.608469]   ESR = 0x9604
[5.608749]   EC = 0x25: DABT (current EL), IL = 32 bits
[5.609223]   SET = 0, FnV = 0
[5.609600]   EA = 0, S1PTW = 0
[5.609891] Data abort info:
[5.610149]   ISV = 0, ISS = 0x0004
[5.610489]   CM = 0, WnR = 0
[5.610757] user pgtable: 4k pages, 48-bit VAs, pgdp=e62fb000
[5.611326] [01e4] pgd=, p4d=
[5.611931] Internal error: Oops: 9604 [#1] SMP
[5.612363] Modules linked in: rockchip_thermal(E+) rk3399_dmc(E+) 
soundcore(E) dw_wdt(E) rockchip_dfi(E) nvmem_rockchip_efuse(E) pwm_rockchip(E) 
cfg80211(E+) rockchip_saradc(E) industrialio(E) rfkill(E) cpufreq_dt(E) 
ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc32c_generic(E) crc16(E) 
mbcache(E) jbd2(E) realtek(E) nvme(E) nvme_core(E) t10_pi(E) xhci_plat_hcd(E) 
xhci_hcd(E) rtc_rk808(E) rk808_regulator(E) clk_rk808(E) dwc3(E) udc_core(E) 
roles(E) ulpi(E) rk808(E) fan53555(E) rockchipdrm(E) analogix_dp(E) dw_hdmi(E) 
cec(E) dw_mipi_dsi(E) fixed(E) dwc3_of_simple(E) phy_rockchip_emmc(E) 
gpio_keys(E) drm_kms_helper(E) phy_rockchip_inno_usb2(E) ehci_platform(E) 
dwmac_rk(E) stmmac_platform(E) phy_rockchip_pcie(E) ohci_platform(E) 
ohci_hcd(E) rockchip_io_domain(E) stmmac(E) phy_rockchip_typec(E) ehci_hcd(E) 
sdhci_of_arasan(E) mdio_xpcs(E) sdhci_pltfm(E) cqhci(E) drm(E) sdhci(E) 
phylink(E) of_mdio(E) usbcore(E) i2c_rk3x(E) dw_mmc_rockchip(E) dw_mmc_pltfm(E) 
dw_mmc(E) fixed_phy(E) libphy(E)
[5.612454]  pl330(E)
[5.620255] CPU: 1 PID: 270 Comm: systemd-udevd Tainted: GE 
5.7.0-13692-g83ae758d8b22 #1157
[5.621110] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 
2020.07-rc4-00023-g10d4cafe0f 06/10/2020
[5.621947] pstate: 4005 (nZcv daif -PAN -UAO BTYPE=--)
[5.622446] pc : regmap_read+0x1c/0x80
[5.622787] lr : rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
[5.623299] sp : 8000126cb8a0
[5.623594] x29: 8000126cb8a0 x28: 8000126cbdb0
[5.624063] x27: f22dac40 x26: f6779800
[5.624533] x25: f6779810 x24: ffea
[5.625002] x23: ffea x22: f65b74c8
[5.625471] x21: f783ca08 x20: f65b7480
[5.625941] x19:  x18: 0001
[5.626410] x17:  x16: 
[5.626878] x15: f22db138 x14: 
[5.627347] x13: 0018 x12: 80001106a8c7
[5.627817] x11: 0003 x10: 0101010101010101
[5.627861] systemd[1]: Found device SPCC M.2 PCIE SSD 3.
[5.628286] x9 : 88d7c89c x8 : 7f7f7f7f7f7f7f7f
[5.629238] x7 : fefefeff646c606d x6 : 1c0e0e0ee3e8e9f0
[5.629709] x5 : 706968630e0e0e1c x4 : 80808080
[5.630178] x3 : 937b1b5b1b434b80 x2 : 8000126cb944
[5.630648] x1 : 0308 x0 : 
[5.631119] Call trace:
[5.631346]  regmap_read+0x1c/0x80
[5.631654]  rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
[5.632142]  platform_drv_probe+0x5c/0xb0
[5.632500]  really_probe+0xe4/0x448
[5.632819]  driver_probe_device+0xfc/0x168
[5.633191]  device_driver_attach+0x7c/0x88
[5.633567]  __driver_attach+0xac/0x178
[5.633914]  bus_for_each_dev+0x78/0xc8
[5.634261]  driver_attach+0x2c/0x38
[5.634582]  bus_add_driver+0x14c/0x230
[5.634925]  driver_register+0x6c/0x128
[5.635269]  __platform_driver_register+0x50/0x60
[5.635692]  rk3399_dmcfreq_driver_init+0x2c/0x1000 [rk3399_dmc]
[5.636226]  do_one_initcall+0x50/0x230
[5.636569]  do_init_module+0x60/0x248
[5.636902]  load_module+0x21f8/0x28d8
[5.637237]  __do_sys_finit_module+0xb0/0x118
[5.637627]  __arm64_sys_finit_module+0x28/0x38
[5.638031]  el0_svc_common.constprop.0+0x7c/0x1f8
[5.638456]  do_el0_svc+0x2c/0x98
[5.638754]  el0_svc+0x18/0x48
[5.639029]  el0_sync_handler+0x8c/0x2d4
[5.639378]  el0_sync+0x158/0x180
[5.639680] Code: a9bd7bfd 910003fd a90153f3 aa0003f3 (b941e400)
[5.640221] ---[ end trace 63675fe5d0021970 ]---
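
The log above reports the fault at a small address ending in 01e4 with the pc inside regmap_read. That pattern is what one would expect when a struct member is read through a NULL pointer: the faulting address is the NULL base plus the member's offset. The sketch below only illustrates that arithmetic; struct fake_map and its layout are hypothetical and are not the real struct regmap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: padding stands in for whatever members come first. */
struct fake_map {
	unsigned char earlier_fields[0x1e4];
	unsigned int field;	/* assumed member at offset 0x1e4 */
};

/* Address that a load of a member would touch, given the struct's base. */
uintptr_t fault_address(uintptr_t base, size_t member_offset)
{
	return base + member_offset;
}

/* Address of 'field' for a given map pointer, computed without any load. */
uintptr_t fake_field_address(const struct fake_map *map)
{
	return fault_address((uintptr_t)map, offsetof(struct fake_map, field));
}
```

Calling fake_field_address(NULL) yields the member offset itself, which is why a NULL regmap shows up as a fault at a low, non-zero address rather than at address zero.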

This turns out to be due to the rk3399-dmc driver looking for
an *undocumented* property (rockchip,pmu), and happily using
a NULL pointer when the property isn't there.

Instead, make most of what was brought in with 9173c5ceb035
("PM / devfreq: rk3399_dmc: Pass ODT and auto power down parameters
to TF-A.") conditioned on finding this property in the device-tree,
preventing the driver from exploding.

Fixes: 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto power down 
parameters to TF-A.")
Signed-off-by: Marc Zyngier 
---
 drivers/devfreq/rk3399_dmc.c | 42 
 1 file changed, 23 insertions(+), 19 

Re: [PATCH] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-22 Thread Marc Zyngier

Hi Heiko,

On 2020-06-22 14:54, Heiko Stübner wrote:

Hi Marc,

Am Montag, 22. Juni 2020, 15:31:55 CEST schrieb Marc Zyngier:

On Sat, 13 Jun 2020 11:24:35 +0100
Marc Zyngier  wrote:

> Booting a recent kernel on a rk3399-based system (nanopc-t4),
> equipped with a recent u-boot and ATF results in the following:
>
> [5.607431] Unable to handle kernel NULL pointer dereference at virtual 
address 01e4
> [5.608219] Mem abort info:
> [5.608469]   ESR = 0x9604
> [5.608749]   EC = 0x25: DABT (current EL), IL = 32 bits
> [5.609223]   SET = 0, FnV = 0
> [5.609600]   EA = 0, S1PTW = 0
> [5.609891] Data abort info:
> [5.610149]   ISV = 0, ISS = 0x0004
> [5.610489]   CM = 0, WnR = 0
> [5.610757] user pgtable: 4k pages, 48-bit VAs, pgdp=e62fb000
> [5.611326] [01e4] pgd=, p4d=
> [5.611931] Internal error: Oops: 9604 [#1] SMP
> [5.612363] Modules linked in: rockchip_thermal(E+) rk3399_dmc(E+) 
soundcore(E) dw_wdt(E) rockchip_dfi(E) nvmem_rockchip_efuse(E) pwm_rockchip(E) 
cfg80211(E+) rockchip_saradc(E) industrialio(E) rfkill(E) cpufreq_dt(E) 
ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc32c_generic(E) crc16(E) mbcache(E) 
jbd2(E) realtek(E) nvme(E) nvme_core(E) t10_pi(E) xhci_plat_hcd(E) xhci_hcd(E) 
rtc_rk808(E) rk808_regulator(E) clk_rk808(E) dwc3(E) udc_core(E) roles(E) ulpi(E) 
rk808(E)
fan53555(E) rockchipdrm(E) analogix_dp(E) dw_hdmi(E) cec(E)
dw_mipi_dsi(E) fixed(E) dwc3_of_simple(E) phy_rockchip_emmc(E)
gpio_keys(E) drm_kms_helper(E) phy_rockchip_inno_usb2(E)
ehci_platform(E) dwmac_rk(E) stmmac_platform(E) phy_rockchip_pcie(E)
ohci_platform(E) ohci_hcd(E) rockchip_io_domain(E) stmmac(E)
phy_rockchip_typec(E) ehci_hcd(E) sdhci_of_arasan(E) mdio_xpcs(E)
sdhci_pltfm(E) cqhci(E) drm(E) sdhci(E) phylink(E) of_mdio(E)
usbcore(E) i2c_rk3x(E) dw_mmc_rockchip(E) dw_mmc_pltfm(E) dw_mmc(E)
fixed_phy(E) libphy(E)
> [5.612454]  pl330(E)
> [5.620255] CPU: 1 PID: 270 Comm: systemd-udevd Tainted: GE
 5.7.0-13692-g83ae758d8b22 #1157
> [5.621110] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 
2020.07-rc4-00023-g10d4cafe0f 06/10/2020
> [5.621947] pstate: 4005 (nZcv daif -PAN -UAO BTYPE=--)
> [5.622446] pc : regmap_read+0x1c/0x80
> [5.622787] lr : rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
> [5.623299] sp : 8000126cb8a0
> [5.623594] x29: 8000126cb8a0 x28: 8000126cbdb0
> [5.624063] x27: f22dac40 x26: f6779800
> [5.624533] x25: f6779810 x24: ffea
> [5.625002] x23: ffea x22: f65b74c8
> [5.625471] x21: f783ca08 x20: f65b7480
> [5.625941] x19:  x18: 0001
> [5.626410] x17:  x16: 
> [5.626878] x15: f22db138 x14: 
> [5.627347] x13: 0018 x12: 80001106a8c7
> [5.627817] x11: 0003 x10: 0101010101010101
> [5.627861] systemd[1]: Found device SPCC M.2 PCIE SSD 3.
> [5.628286] x9 : 88d7c89c x8 : 7f7f7f7f7f7f7f7f
> [5.629238] x7 : fefefeff646c606d x6 : 1c0e0e0ee3e8e9f0
> [5.629709] x5 : 706968630e0e0e1c x4 : 80808080
> [5.630178] x3 : 937b1b5b1b434b80 x2 : 8000126cb944
> [5.630648] x1 : 0308 x0 : 
> [5.631119] Call trace:
> [5.631346]  regmap_read+0x1c/0x80
> [5.631654]  rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
> [5.632142]  platform_drv_probe+0x5c/0xb0
> [5.632500]  really_probe+0xe4/0x448
> [5.632819]  driver_probe_device+0xfc/0x168
> [5.633191]  device_driver_attach+0x7c/0x88
> [5.633567]  __driver_attach+0xac/0x178
> [5.633914]  bus_for_each_dev+0x78/0xc8
> [5.634261]  driver_attach+0x2c/0x38
> [5.634582]  bus_add_driver+0x14c/0x230
> [5.634925]  driver_register+0x6c/0x128
> [5.635269]  __platform_driver_register+0x50/0x60
> [5.635692]  rk3399_dmcfreq_driver_init+0x2c/0x1000 [rk3399_dmc]
> [5.636226]  do_one_initcall+0x50/0x230
> [5.636569]  do_init_module+0x60/0x248
> [5.636902]  load_module+0x21f8/0x28d8
> [5.637237]  __do_sys_finit_module+0xb0/0x118
> [5.637627]  __arm64_sys_finit_module+0x28/0x38
> [5.638031]  el0_svc_common.constprop.0+0x7c/0x1f8
> [5.638456]  do_el0_svc+0x2c/0x98
> [5.638754]  el0_svc+0x18/0x48
> [5.639029]  el0_sync_handler+0x8c/0x2d4
> [5.639378]  el0_sync+0x158/0x180
> [5.639680] Code: a9bd7bfd 910003fd a90153f3 aa0003f3 (b941e400)
> [5.640221] ---[ end trace 63675fe5d0021970 ]---
>
> This turns out to be due to the rk3399-dmc driver looking for
> an *undocumented* property (rockchip,pmu), and happily using
> a NULL pointer when the property isn't there.
>
> The very existence of this driver in the kernel is highly doubtful
> (I'd expect firmware to deal with this directly), but in the meantime
> let's prevent it from 

Re: [PATCH] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-22 Thread Heiko Stübner
Hi Marc,

Am Montag, 22. Juni 2020, 15:31:55 CEST schrieb Marc Zyngier:
> On Sat, 13 Jun 2020 11:24:35 +0100
> Marc Zyngier  wrote:
> 
> > Booting a recent kernel on a rk3399-based system (nanopc-t4),
> > equipped with a recent u-boot and ATF results in the following:
> > 
> > [5.607431] Unable to handle kernel NULL pointer dereference at virtual 
> > address 01e4
> > [5.608219] Mem abort info:
> > [5.608469]   ESR = 0x9604
> > [5.608749]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [5.609223]   SET = 0, FnV = 0
> > [5.609600]   EA = 0, S1PTW = 0
> > [5.609891] Data abort info:
> > [5.610149]   ISV = 0, ISS = 0x0004
> > [5.610489]   CM = 0, WnR = 0
> > [5.610757] user pgtable: 4k pages, 48-bit VAs, pgdp=e62fb000
> > [5.611326] [01e4] pgd=, p4d=
> > [5.611931] Internal error: Oops: 9604 [#1] SMP
> > [5.612363] Modules linked in: rockchip_thermal(E+) rk3399_dmc(E+) 
> > soundcore(E) dw_wdt(E) rockchip_dfi(E) nvmem_rockchip_efuse(E) 
> > pwm_rockchip(E) cfg80211(E+) rockchip_saradc(E) industrialio(E) rfkill(E) 
> > cpufreq_dt(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc32c_generic(E) 
> > crc16(E) mbcache(E) jbd2(E) realtek(E) nvme(E) nvme_core(E) t10_pi(E) 
> > xhci_plat_hcd(E) xhci_hcd(E) rtc_rk808(E) rk808_regulator(E) clk_rk808(E) 
> > dwc3(E) udc_core(E) roles(E) ulpi(E) rk808(E)
> fan53555(E) rockchipdrm(E) analogix_dp(E) dw_hdmi(E) cec(E)
> dw_mipi_dsi(E) fixed(E) dwc3_of_simple(E) phy_rockchip_emmc(E)
> gpio_keys(E) drm_kms_helper(E) phy_rockchip_inno_usb2(E)
> ehci_platform(E) dwmac_rk(E) stmmac_platform(E) phy_rockchip_pcie(E)
> ohci_platform(E) ohci_hcd(E) rockchip_io_domain(E) stmmac(E)
> phy_rockchip_typec(E) ehci_hcd(E) sdhci_of_arasan(E) mdio_xpcs(E)
> sdhci_pltfm(E) cqhci(E) drm(E) sdhci(E) phylink(E) of_mdio(E)
> usbcore(E) i2c_rk3x(E) dw_mmc_rockchip(E) dw_mmc_pltfm(E) dw_mmc(E)
> fixed_phy(E) libphy(E)
> > [5.612454]  pl330(E)
> > [5.620255] CPU: 1 PID: 270 Comm: systemd-udevd Tainted: GE  
> >5.7.0-13692-g83ae758d8b22 #1157
> > [5.621110] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 
> > 2020.07-rc4-00023-g10d4cafe0f 06/10/2020
> > [5.621947] pstate: 4005 (nZcv daif -PAN -UAO BTYPE=--)
> > [5.622446] pc : regmap_read+0x1c/0x80
> > [5.622787] lr : rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
> > [5.623299] sp : 8000126cb8a0
> > [5.623594] x29: 8000126cb8a0 x28: 8000126cbdb0
> > [5.624063] x27: f22dac40 x26: f6779800
> > [5.624533] x25: f6779810 x24: ffea
> > [5.625002] x23: ffea x22: f65b74c8
> > [5.625471] x21: f783ca08 x20: f65b7480
> > [5.625941] x19:  x18: 0001
> > [5.626410] x17:  x16: 
> > [5.626878] x15: f22db138 x14: 
> > [5.627347] x13: 0018 x12: 80001106a8c7
> > [5.627817] x11: 0003 x10: 0101010101010101
> > [5.627861] systemd[1]: Found device SPCC M.2 PCIE SSD 3.
> > [5.628286] x9 : 88d7c89c x8 : 7f7f7f7f7f7f7f7f
> > [5.629238] x7 : fefefeff646c606d x6 : 1c0e0e0ee3e8e9f0
> > [5.629709] x5 : 706968630e0e0e1c x4 : 80808080
> > [5.630178] x3 : 937b1b5b1b434b80 x2 : 8000126cb944
> > [5.630648] x1 : 0308 x0 : 
> > [5.631119] Call trace:
> > [5.631346]  regmap_read+0x1c/0x80
> > [5.631654]  rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
> > [5.632142]  platform_drv_probe+0x5c/0xb0
> > [5.632500]  really_probe+0xe4/0x448
> > [5.632819]  driver_probe_device+0xfc/0x168
> > [5.633191]  device_driver_attach+0x7c/0x88
> > [5.633567]  __driver_attach+0xac/0x178
> > [5.633914]  bus_for_each_dev+0x78/0xc8
> > [5.634261]  driver_attach+0x2c/0x38
> > [5.634582]  bus_add_driver+0x14c/0x230
> > [5.634925]  driver_register+0x6c/0x128
> > [5.635269]  __platform_driver_register+0x50/0x60
> > [5.635692]  rk3399_dmcfreq_driver_init+0x2c/0x1000 [rk3399_dmc]
> > [5.636226]  do_one_initcall+0x50/0x230
> > [5.636569]  do_init_module+0x60/0x248
> > [5.636902]  load_module+0x21f8/0x28d8
> > [5.637237]  __do_sys_finit_module+0xb0/0x118
> > [5.637627]  __arm64_sys_finit_module+0x28/0x38
> > [5.638031]  el0_svc_common.constprop.0+0x7c/0x1f8
> > [5.638456]  do_el0_svc+0x2c/0x98
> > [5.638754]  el0_svc+0x18/0x48
> > [5.639029]  el0_sync_handler+0x8c/0x2d4
> > [5.639378]  el0_sync+0x158/0x180
> > [5.639680] Code: a9bd7bfd 910003fd a90153f3 aa0003f3 (b941e400)
> > [5.640221] ---[ end trace 63675fe5d0021970 ]---
> > 
> > This turns out to be due to the rk3399-dmc driver looking for
> > an *undocumented* property (rockchip,pmu), and happily using
> > a NULL pointer when the property isn't there.
> > 
> > The very existence 

Re: [PATCH] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-22 Thread Marc Zyngier
Hi Heiko,

On Sat, 13 Jun 2020 11:24:35 +0100
Marc Zyngier  wrote:

> Booting a recent kernel on a rk3399-based system (nanopc-t4),
> equipped with a recent u-boot and ATF results in the following:
> 
> [5.607431] Unable to handle kernel NULL pointer dereference at virtual 
> address 01e4
> [5.608219] Mem abort info:
> [5.608469]   ESR = 0x9604
> [5.608749]   EC = 0x25: DABT (current EL), IL = 32 bits
> [5.609223]   SET = 0, FnV = 0
> [5.609600]   EA = 0, S1PTW = 0
> [5.609891] Data abort info:
> [5.610149]   ISV = 0, ISS = 0x0004
> [5.610489]   CM = 0, WnR = 0
> [5.610757] user pgtable: 4k pages, 48-bit VAs, pgdp=e62fb000
> [5.611326] [01e4] pgd=, p4d=
> [5.611931] Internal error: Oops: 9604 [#1] SMP
> [5.612363] Modules linked in: rockchip_thermal(E+) rk3399_dmc(E+) 
> soundcore(E) dw_wdt(E) rockchip_dfi(E) nvmem_rockchip_efuse(E) 
> pwm_rockchip(E) cfg80211(E+) rockchip_saradc(E) industrialio(E) rfkill(E) 
> cpufreq_dt(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc32c_generic(E) 
> crc16(E) mbcache(E) jbd2(E) realtek(E) nvme(E) nvme_core(E) t10_pi(E) 
> xhci_plat_hcd(E) xhci_hcd(E) rtc_rk808(E) rk808_regulator(E) clk_rk808(E) 
> dwc3(E) udc_core(E) roles(E) ulpi(E) rk808(E)
fan53555(E) rockchipdrm(E) analogix_dp(E) dw_hdmi(E) cec(E)
dw_mipi_dsi(E) fixed(E) dwc3_of_simple(E) phy_rockchip_emmc(E)
gpio_keys(E) drm_kms_helper(E) phy_rockchip_inno_usb2(E)
ehci_platform(E) dwmac_rk(E) stmmac_platform(E) phy_rockchip_pcie(E)
ohci_platform(E) ohci_hcd(E) rockchip_io_domain(E) stmmac(E)
phy_rockchip_typec(E) ehci_hcd(E) sdhci_of_arasan(E) mdio_xpcs(E)
sdhci_pltfm(E) cqhci(E) drm(E) sdhci(E) phylink(E) of_mdio(E)
usbcore(E) i2c_rk3x(E) dw_mmc_rockchip(E) dw_mmc_pltfm(E) dw_mmc(E)
fixed_phy(E) libphy(E)
> [5.612454]  pl330(E)
> [5.620255] CPU: 1 PID: 270 Comm: systemd-udevd Tainted: GE
>  5.7.0-13692-g83ae758d8b22 #1157
> [5.621110] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 
> 2020.07-rc4-00023-g10d4cafe0f 06/10/2020
> [5.621947] pstate: 4005 (nZcv daif -PAN -UAO BTYPE=--)
> [5.622446] pc : regmap_read+0x1c/0x80
> [5.622787] lr : rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
> [5.623299] sp : 8000126cb8a0
> [5.623594] x29: 8000126cb8a0 x28: 8000126cbdb0
> [5.624063] x27: f22dac40 x26: f6779800
> [5.624533] x25: f6779810 x24: ffea
> [5.625002] x23: ffea x22: f65b74c8
> [5.625471] x21: f783ca08 x20: f65b7480
> [5.625941] x19:  x18: 0001
> [5.626410] x17:  x16: 
> [5.626878] x15: f22db138 x14: 
> [5.627347] x13: 0018 x12: 80001106a8c7
> [5.627817] x11: 0003 x10: 0101010101010101
> [5.627861] systemd[1]: Found device SPCC M.2 PCIE SSD 3.
> [5.628286] x9 : 88d7c89c x8 : 7f7f7f7f7f7f7f7f
> [5.629238] x7 : fefefeff646c606d x6 : 1c0e0e0ee3e8e9f0
> [5.629709] x5 : 706968630e0e0e1c x4 : 80808080
> [5.630178] x3 : 937b1b5b1b434b80 x2 : 8000126cb944
> [5.630648] x1 : 0308 x0 : 
> [5.631119] Call trace:
> [5.631346]  regmap_read+0x1c/0x80
> [5.631654]  rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
> [5.632142]  platform_drv_probe+0x5c/0xb0
> [5.632500]  really_probe+0xe4/0x448
> [5.632819]  driver_probe_device+0xfc/0x168
> [5.633191]  device_driver_attach+0x7c/0x88
> [5.633567]  __driver_attach+0xac/0x178
> [5.633914]  bus_for_each_dev+0x78/0xc8
> [5.634261]  driver_attach+0x2c/0x38
> [5.634582]  bus_add_driver+0x14c/0x230
> [5.634925]  driver_register+0x6c/0x128
> [5.635269]  __platform_driver_register+0x50/0x60
> [5.635692]  rk3399_dmcfreq_driver_init+0x2c/0x1000 [rk3399_dmc]
> [5.636226]  do_one_initcall+0x50/0x230
> [5.636569]  do_init_module+0x60/0x248
> [5.636902]  load_module+0x21f8/0x28d8
> [5.637237]  __do_sys_finit_module+0xb0/0x118
> [5.637627]  __arm64_sys_finit_module+0x28/0x38
> [5.638031]  el0_svc_common.constprop.0+0x7c/0x1f8
> [5.638456]  do_el0_svc+0x2c/0x98
> [5.638754]  el0_svc+0x18/0x48
> [5.639029]  el0_sync_handler+0x8c/0x2d4
> [5.639378]  el0_sync+0x158/0x180
> [5.639680] Code: a9bd7bfd 910003fd a90153f3 aa0003f3 (b941e400)
> [5.640221] ---[ end trace 63675fe5d0021970 ]---
> 
> This turns out to be due to the rk3399-dmc driver looking for
> an *undocumented* property (rockchip,pmu), and happily using
> a NULL pointer when the property isn't there.
> 
> The very existence of this driver in the kernel is highly doubtful
> (I'd expect firmware to deal with this directly), but in the meantime
> let's prevent it from oopsing the kernel at probe time if this
> property isn't present.
> 
> Signed-off-by: Marc Zyngier 

[PATCH] PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent

2020-06-13 Thread Marc Zyngier
Booting a recent kernel on a rk3399-based system (nanopc-t4),
equipped with a recent u-boot and ATF results in the following:

[5.607431] Unable to handle kernel NULL pointer dereference at virtual 
address 01e4
[5.608219] Mem abort info:
[5.608469]   ESR = 0x9604
[5.608749]   EC = 0x25: DABT (current EL), IL = 32 bits
[5.609223]   SET = 0, FnV = 0
[5.609600]   EA = 0, S1PTW = 0
[5.609891] Data abort info:
[5.610149]   ISV = 0, ISS = 0x0004
[5.610489]   CM = 0, WnR = 0
[5.610757] user pgtable: 4k pages, 48-bit VAs, pgdp=e62fb000
[5.611326] [01e4] pgd=, p4d=
[5.611931] Internal error: Oops: 9604 [#1] SMP
[5.612363] Modules linked in: rockchip_thermal(E+) rk3399_dmc(E+) 
soundcore(E) dw_wdt(E) rockchip_dfi(E) nvmem_rockchip_efuse(E) pwm_rockchip(E) 
cfg80211(E+) rockchip_saradc(E) industrialio(E) rfkill(E) cpufreq_dt(E) 
ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc32c_generic(E) crc16(E) 
mbcache(E) jbd2(E) realtek(E) nvme(E) nvme_core(E) t10_pi(E) xhci_plat_hcd(E) 
xhci_hcd(E) rtc_rk808(E) rk808_regulator(E) clk_rk808(E) dwc3(E) udc_core(E) 
roles(E) ulpi(E) rk808(E) fan53555(E) rockchipdrm(E) analogix_dp(E) dw_hdmi(E) 
cec(E) dw_mipi_dsi(E) fixed(E) dwc3_of_simple(E) phy_rockchip_emmc(E) 
gpio_keys(E) drm_kms_helper(E) phy_rockchip_inno_usb2(E) ehci_platform(E) 
dwmac_rk(E) stmmac_platform(E) phy_rockchip_pcie(E) ohci_platform(E) 
ohci_hcd(E) rockchip_io_domain(E) stmmac(E) phy_rockchip_typec(E) ehci_hcd(E) 
sdhci_of_arasan(E) mdio_xpcs(E) sdhci_pltfm(E) cqhci(E) drm(E) sdhci(E) 
phylink(E) of_mdio(E) usbcore(E) i2c_rk3x(E) dw_mmc_rockchip(E) dw_mmc_pltfm(E) 
dw_mmc(E) fixed_phy(E) libphy(E)
[5.612454]  pl330(E)
[5.620255] CPU: 1 PID: 270 Comm: systemd-udevd Tainted: GE 
5.7.0-13692-g83ae758d8b22 #1157
[5.621110] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 
2020.07-rc4-00023-g10d4cafe0f 06/10/2020
[5.621947] pstate: 4005 (nZcv daif -PAN -UAO BTYPE=--)
[5.622446] pc : regmap_read+0x1c/0x80
[5.622787] lr : rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
[5.623299] sp : 8000126cb8a0
[5.623594] x29: 8000126cb8a0 x28: 8000126cbdb0
[5.624063] x27: f22dac40 x26: f6779800
[5.624533] x25: f6779810 x24: ffea
[5.625002] x23: ffea x22: f65b74c8
[5.625471] x21: f783ca08 x20: f65b7480
[5.625941] x19:  x18: 0001
[5.626410] x17:  x16: 
[5.626878] x15: f22db138 x14: 
[5.627347] x13: 0018 x12: 80001106a8c7
[5.627817] x11: 0003 x10: 0101010101010101
[5.627861] systemd[1]: Found device SPCC M.2 PCIE SSD 3.
[5.628286] x9 : 88d7c89c x8 : 7f7f7f7f7f7f7f7f
[5.629238] x7 : fefefeff646c606d x6 : 1c0e0e0ee3e8e9f0
[5.629709] x5 : 706968630e0e0e1c x4 : 80808080
[5.630178] x3 : 937b1b5b1b434b80 x2 : 8000126cb944
[5.630648] x1 : 0308 x0 : 
[5.631119] Call trace:
[5.631346]  regmap_read+0x1c/0x80
[5.631654]  rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
[5.632142]  platform_drv_probe+0x5c/0xb0
[5.632500]  really_probe+0xe4/0x448
[5.632819]  driver_probe_device+0xfc/0x168
[5.633191]  device_driver_attach+0x7c/0x88
[5.633567]  __driver_attach+0xac/0x178
[5.633914]  bus_for_each_dev+0x78/0xc8
[5.634261]  driver_attach+0x2c/0x38
[5.634582]  bus_add_driver+0x14c/0x230
[5.634925]  driver_register+0x6c/0x128
[5.635269]  __platform_driver_register+0x50/0x60
[5.635692]  rk3399_dmcfreq_driver_init+0x2c/0x1000 [rk3399_dmc]
[5.636226]  do_one_initcall+0x50/0x230
[5.636569]  do_init_module+0x60/0x248
[5.636902]  load_module+0x21f8/0x28d8
[5.637237]  __do_sys_finit_module+0xb0/0x118
[5.637627]  __arm64_sys_finit_module+0x28/0x38
[5.638031]  el0_svc_common.constprop.0+0x7c/0x1f8
[5.638456]  do_el0_svc+0x2c/0x98
[5.638754]  el0_svc+0x18/0x48
[5.639029]  el0_sync_handler+0x8c/0x2d4
[5.639378]  el0_sync+0x158/0x180
[5.639680] Code: a9bd7bfd 910003fd a90153f3 aa0003f3 (b941e400)
[5.640221] ---[ end trace 63675fe5d0021970 ]---

This turns out to be due to the rk3399-dmc driver looking for
an *undocumented* property (rockchip,pmu), and happily using
a NULL pointer when the property isn't there.

The very existence of this driver in the kernel is highly doubtful
(I'd expect firmware to deal with this directly), but in the meantime
let's prevent it from oopsing the kernel at probe time if this
property isn't present.

Signed-off-by: Marc Zyngier 
---
 drivers/devfreq/rk3399_dmc.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
index 24f04f78285b..bee233a2e0ce 100644
--- 

[PATCH 5.6 102/194] drm/i915/gvt: Fix kernel oops for 3-level ppgtt guest

2020-05-18 Thread Greg Kroah-Hartman
From: Zhenyu Wang 

[ Upstream commit 72a7a9925e2beea09b109dffb3384c9bf920d9da ]

As i915 won't allocate extra PDPs for the current default PML4 table,
a 3-level ppgtt guest would hit a kernel pointer access failure on the
extra PDP pointers. This patch tries to bypass that for now. It won't
impact the real shadow PPGTT setup, so the guest context still works.

This was verified on a 4.15 guest kernel with i915.enable_ppgtt=1,
which forces the old aliasing ppgtt behavior.

Fixes: 4f15665ccbba ("drm/i915: Add ppgtt to GVT GEM context")
Reviewed-by: Xiong Zhang 
Signed-off-by: Zhenyu Wang 
Link: http://patchwork.freedesktop.org/patch/msgid/20200506095918.124913-1-zhen...@linux.intel.com
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/i915/gvt/scheduler.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 685d1e04a5ff6..709ad181bc94a 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -375,7 +375,11 @@ static void set_context_ppgtt_from_shadow(struct intel_vgpu_workload *workload,
 		for (i = 0; i < GVT_RING_CTX_NR_PDPS; i++) {
 			struct i915_page_directory * const pd =
 				i915_pd_entry(ppgtt->pd, i);
-
+			/* skip now as current i915 ppgtt alloc won't allocate
+			   top level pdp for non 4-level table, won't impact
+			   shadow ppgtt. */
+			if (!pd)
+				break;
 			px_dma(pd) = mm->ppgtt_mm.shadow_pdps[i];
 		}
 	}
-- 
2.20.1





[PATCH 5.4 078/147] drm/i915/gvt: Fix kernel oops for 3-level ppgtt guest

2020-05-18 Thread Greg Kroah-Hartman
From: Zhenyu Wang 

[ Upstream commit 72a7a9925e2beea09b109dffb3384c9bf920d9da ]

As i915 won't allocate extra PDPs for the current default PML4 table,
a 3-level ppgtt guest would hit a kernel pointer access failure on the
extra PDP pointers. This patch tries to bypass that for now. It won't
impact the real shadow PPGTT setup, so the guest context still works.

This was verified on a 4.15 guest kernel with i915.enable_ppgtt=1,
which forces the old aliasing ppgtt behavior.

Fixes: 4f15665ccbba ("drm/i915: Add ppgtt to GVT GEM context")
Reviewed-by: Xiong Zhang 
Signed-off-by: Zhenyu Wang 
Link: http://patchwork.freedesktop.org/patch/msgid/20200506095918.124913-1-zhen...@linux.intel.com
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/i915/gvt/scheduler.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 6c79d16b381ea..058dcd5416440 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -374,7 +374,11 @@ static void set_context_ppgtt_from_shadow(struct intel_vgpu_workload *workload,
 		for (i = 0; i < GVT_RING_CTX_NR_PDPS; i++) {
 			struct i915_page_directory * const pd =
 				i915_pd_entry(ppgtt->pd, i);
-
+			/* skip now as current i915 ppgtt alloc won't allocate
+			   top level pdp for non 4-level table, won't impact
+			   shadow ppgtt. */
+			if (!pd)
+				break;
 			px_dma(pd) = mm->ppgtt_mm.shadow_pdps[i];
 		}
 	}
-- 
2.20.1





[PATCH 4.19 059/114] drm/amdgpu: Fix KFD-related kernel oops on Hawaii

2019-10-10 Thread Greg Kroah-Hartman
From: Felix Kuehling 

[ Upstream commit dcafbd50f2e4d5cc964aae409fb5691b743fba23 ]

Hawaii needs to flush caches explicitly, submitting an IB in a user
VMID from kernel mode. There is no s_fence in this case.

Fixes: eb3961a57424 ("drm/amdgpu: remove fence context from the job")
Signed-off-by: Felix Kuehling 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 51b5e977ca885..f4e9d1b10e3ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -139,7 +139,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 	/* ring tests don't use a job */
 	if (job) {
 		vm = job->vm;
-		fence_ctx = job->base.s_fence->scheduled.context;
+		fence_ctx = job->base.s_fence ?
+			job->base.s_fence->scheduled.context : 0;
 	} else {
 		vm = NULL;
 		fence_ctx = 0;
-- 
2.20.1





[PATCH 5.3 104/148] drm/amdgpu: Fix KFD-related kernel oops on Hawaii

2019-10-10 Thread Greg Kroah-Hartman
From: Felix Kuehling 

[ Upstream commit dcafbd50f2e4d5cc964aae409fb5691b743fba23 ]

Hawaii needs to flush caches explicitly, submitting an IB in a user
VMID from kernel mode. There is no s_fence in this case.

Fixes: eb3961a57424 ("drm/amdgpu: remove fence context from the job")
Signed-off-by: Felix Kuehling 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 7850084a05e3a..60655834d6498 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -143,7 +143,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 	/* ring tests don't use a job */
 	if (job) {
 		vm = job->vm;
-		fence_ctx = job->base.s_fence->scheduled.context;
+		fence_ctx = job->base.s_fence ?
+			job->base.s_fence->scheduled.context : 0;
 	} else {
 		vm = NULL;
 		fence_ctx = 0;
-- 
2.20.1





Applied "spi: stm32-qspi: Fix kernel oops when unbinding driver" to the spi tree

2019-10-04 Thread Mark Brown
The patch

   spi: stm32-qspi: Fix kernel oops when unbinding driver

has been applied to the spi tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-5.4

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From 3c0af1dd2fe78adc02fe21f6cfe7d6cb8602573e Mon Sep 17 00:00:00 2001
From: Patrice Chotard 
Date: Fri, 4 Oct 2019 14:36:06 +0200
Subject: [PATCH] spi: stm32-qspi: Fix kernel oops when unbinding driver

spi_master_put() must only be called in .probe() in case of error.

As devm_spi_register_master() is used during probe, spi_master_put()
mustn't be called in .remove() callback.

This patch fixes the following kernel WARNING/Oops when executing
echo "58003000.spi" > /sys/bus/platform/drivers/stm32-qspi/unbind:

[ cut here ]
WARNING: CPU: 1 PID: 496 at fs/kernfs/dir.c:1504 
kernfs_remove_by_name_ns+0x9c/0xa4
kernfs: can not remove 'uevent', no directory
Modules linked in:
CPU: 1 PID: 496 Comm: sh Not tainted 5.3.0-rc1-00219-ga0e07bb51a37 #62
Hardware name: STM32 (Device Tree Support)
[] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[] (show_stack) from [] (dump_stack+0xb4/0xc8)
[] (dump_stack) from [] (__warn.part.3+0xbc/0xd8)
[] (__warn.part.3) from [] (warn_slowpath_fmt+0x68/0x8c)
[] (warn_slowpath_fmt) from [] 
(kernfs_remove_by_name_ns+0x9c/0xa4)
[] (kernfs_remove_by_name_ns) from [] 
(device_del+0x128/0x358)
[] (device_del) from [] (device_unregister+0x24/0x64)
[] (device_unregister) from [] 
(spi_unregister_controller+0x88/0xe8)
[] (spi_unregister_controller) from [] 
(release_nodes+0x1bc/0x200)
[] (release_nodes) from [] 
(device_release_driver_internal+0xec/0x1ac)
[] (device_release_driver_internal) from [] 
(unbind_store+0x60/0xd4)
[] (unbind_store) from [] (kernfs_fop_write+0xe8/0x1c4)
[] (kernfs_fop_write) from [] (__vfs_write+0x2c/0x1c0)
[] (__vfs_write) from [] (vfs_write+0xa4/0x184)
[] (vfs_write) from [] (ksys_write+0x58/0xd0)
[] (ksys_write) from [] (ret_fast_syscall+0x0/0x54)
Exception stack(0xdd289fa8 to 0xdd289ff0)
9fa0:   006c 000e20e8 0001 000e20e8 000d 
9fc0: 006c 000e20e8 b6f87da0 0004 000d 000d  
9fe0: 0004 bee639b0 b6f2286b b6eaf6c6
---[ end trace 1b15df8a02d76aef ]---
[ cut here ]
WARNING: CPU: 1 PID: 496 at fs/kernfs/dir.c:1504 
kernfs_remove_by_name_ns+0x9c/0xa4
kernfs: can not remove 'online', no directory
Modules linked in:
CPU: 1 PID: 496 Comm: sh Tainted: GW 
5.3.0-rc1-00219-ga0e07bb51a37 #62
Hardware name: STM32 (Device Tree Support)
[] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[] (show_stack) from [] (dump_stack+0xb4/0xc8)
[] (dump_stack) from [] (__warn.part.3+0xbc/0xd8)
[] (__warn.part.3) from [] (warn_slowpath_fmt+0x68/0x8c)
[] (warn_slowpath_fmt) from [] 
(kernfs_remove_by_name_ns+0x9c/0xa4)
[] (kernfs_remove_by_name_ns) from [] 
(device_remove_attrs+0x20/0x5c)
[] (device_remove_attrs) from [] (device_del+0x134/0x358)
[] (device_del) from [] (device_unregister+0x24/0x64)
[] (device_unregister) from [] 
(spi_unregister_controller+0x88/0xe8)
[] (spi_unregister_controller) from [] 
(release_nodes+0x1bc/0x200)
[] (release_nodes) from [] 
(device_release_driver_internal+0xec/0x1ac)
[] (device_release_driver_internal) from [] 
(unbind_store+0x60/0xd4)
[] (unbind_store) from [] (kernfs_fop_write+0xe8/0x1c4)
[] (kernfs_fop_write) from [] (__vfs_write+0x2c/0x1c0)
[] (__vfs_write) from [] (vfs_write+0xa4/0x184)
[] (vfs_write) from [] (ksys_write+0x58/0xd0)
[] (ksys_write) from [] (ret_fast_syscall+0x0/0x54)
Exception stack(0xdd289fa8 to 0xdd289ff0)
9fa0:   006c 000e20e8 0001 000e20e8 000d 
9fc0: 006c 000e20e8 b6f87da0 0004 000d 000d  
9fe0: 0004 bee639b0 b6f2286b b6eaf6c6
---[ end trace 1b15df8a02d76af0 ]---
8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 0050
pgd = e612f14d
[0050] *pgd=ff1f5835
Internal error: Oops: 17 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 496 Comm: sh Tainted: GW 
5.3.0-rc1-00219-ga0e07bb51a37 #62
Hardware name: STM32 (Device Tree Support)
PC is at kernfs_find_ns+0x8/0xfc
LR is at kernfs_find_and_get_ns+0x30/0x48
pc : []lr : [] 

spi: stm32-qspi: Fix kernel oops when unbinding driver

2019-10-04 Thread patrice.chotard
From: Patrice Chotard 

spi_master_put() must only be called in .probe() in case of error.

As devm_spi_register_master() is used during probe, spi_master_put()
mustn't be called in .remove() callback.
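
The reference imbalance described above can be modelled in userspace. This is only a sketch under stated assumptions: `fake_controller`, `fake_devres_unregister()` and `fake_unbind()` are hypothetical names, the managed cleanup is assumed to run after .remove() and to drop the last reference, and a `released` flag stands in for the actual free so the model never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted SPI controller. */
struct fake_controller {
	int refcount;
	bool released;	/* set where real code would free the object */
};

static void fake_controller_init(struct fake_controller *c)
{
	c->refcount = 1;	/* reference taken at allocation time */
	c->released = false;
}

static void fake_controller_put(struct fake_controller *c)
{
	assert(c->refcount > 0);
	if (--c->refcount == 0)
		c->released = true;
}

/*
 * Models the managed cleanup that runs after .remove() returns.
 * Returns false if it would touch an already-released controller.
 */
static bool fake_devres_unregister(struct fake_controller *c)
{
	if (c->released)
		return false;
	fake_controller_put(c);	/* managed teardown drops the last reference */
	return true;
}

/* Models a driver unbind: .remove() first, then the managed cleanup. */
static bool fake_unbind(struct fake_controller *c, bool put_in_remove)
{
	if (put_in_remove)
		fake_controller_put(c);	/* the extra put the patch removes */
	return fake_devres_unregister(c);
}
```

With `put_in_remove` set, the managed cleanup finds the controller already released, which corresponds to the use of a stale object seen in the traces; without it, the single reference is dropped exactly once.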

This patch fixes the following kernel WARNING/Oops when executing
echo "58003000.spi" > /sys/bus/platform/drivers/stm32-qspi/unbind:

[ cut here ]
WARNING: CPU: 1 PID: 496 at fs/kernfs/dir.c:1504 
kernfs_remove_by_name_ns+0x9c/0xa4
kernfs: can not remove 'uevent', no directory
Modules linked in:
CPU: 1 PID: 496 Comm: sh Not tainted 5.3.0-rc1-00219-ga0e07bb51a37 #62
Hardware name: STM32 (Device Tree Support)
[] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[] (show_stack) from [] (dump_stack+0xb4/0xc8)
[] (dump_stack) from [] (__warn.part.3+0xbc/0xd8)
[] (__warn.part.3) from [] (warn_slowpath_fmt+0x68/0x8c)
[] (warn_slowpath_fmt) from [] 
(kernfs_remove_by_name_ns+0x9c/0xa4)
[] (kernfs_remove_by_name_ns) from [] 
(device_del+0x128/0x358)
[] (device_del) from [] (device_unregister+0x24/0x64)
[] (device_unregister) from [] 
(spi_unregister_controller+0x88/0xe8)
[] (spi_unregister_controller) from [] 
(release_nodes+0x1bc/0x200)
[] (release_nodes) from [] 
(device_release_driver_internal+0xec/0x1ac)
[] (device_release_driver_internal) from [] 
(unbind_store+0x60/0xd4)
[] (unbind_store) from [] (kernfs_fop_write+0xe8/0x1c4)
[] (kernfs_fop_write) from [] (__vfs_write+0x2c/0x1c0)
[] (__vfs_write) from [] (vfs_write+0xa4/0x184)
[] (vfs_write) from [] (ksys_write+0x58/0xd0)
[] (ksys_write) from [] (ret_fast_syscall+0x0/0x54)
Exception stack(0xdd289fa8 to 0xdd289ff0)
9fa0:   006c 000e20e8 0001 000e20e8 000d 
9fc0: 006c 000e20e8 b6f87da0 0004 000d 000d  
9fe0: 0004 bee639b0 b6f2286b b6eaf6c6
---[ end trace 1b15df8a02d76aef ]---
[ cut here ]
WARNING: CPU: 1 PID: 496 at fs/kernfs/dir.c:1504 
kernfs_remove_by_name_ns+0x9c/0xa4
kernfs: can not remove 'online', no directory
Modules linked in:
CPU: 1 PID: 496 Comm: sh Tainted: GW 
5.3.0-rc1-00219-ga0e07bb51a37 #62
Hardware name: STM32 (Device Tree Support)
[] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[] (show_stack) from [] (dump_stack+0xb4/0xc8)
[] (dump_stack) from [] (__warn.part.3+0xbc/0xd8)
[] (__warn.part.3) from [] (warn_slowpath_fmt+0x68/0x8c)
[] (warn_slowpath_fmt) from [] 
(kernfs_remove_by_name_ns+0x9c/0xa4)
[] (kernfs_remove_by_name_ns) from [] 
(device_remove_attrs+0x20/0x5c)
[] (device_remove_attrs) from [] (device_del+0x134/0x358)
[] (device_del) from [] (device_unregister+0x24/0x64)
[] (device_unregister) from [] 
(spi_unregister_controller+0x88/0xe8)
[] (spi_unregister_controller) from [] 
(release_nodes+0x1bc/0x200)
[] (release_nodes) from [] 
(device_release_driver_internal+0xec/0x1ac)
[] (device_release_driver_internal) from [] 
(unbind_store+0x60/0xd4)
[] (unbind_store) from [] (kernfs_fop_write+0xe8/0x1c4)
[] (kernfs_fop_write) from [] (__vfs_write+0x2c/0x1c0)
[] (__vfs_write) from [] (vfs_write+0xa4/0x184)
[] (vfs_write) from [] (ksys_write+0x58/0xd0)
[] (ksys_write) from [] (ret_fast_syscall+0x0/0x54)
Exception stack(0xdd289fa8 to 0xdd289ff0)
9fa0:   006c 000e20e8 0001 000e20e8 000d 
9fc0: 006c 000e20e8 b6f87da0 0004 000d 000d  
9fe0: 0004 bee639b0 b6f2286b b6eaf6c6
---[ end trace 1b15df8a02d76af0 ]---
8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 0050
pgd = e612f14d
[0050] *pgd=ff1f5835
Internal error: Oops: 17 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 496 Comm: sh Tainted: GW 
5.3.0-rc1-00219-ga0e07bb51a37 #62
Hardware name: STM32 (Device Tree Support)
PC is at kernfs_find_ns+0x8/0xfc
LR is at kernfs_find_and_get_ns+0x30/0x48
pc : []lr : []psr: 40010013
sp : dd289dac  ip :   fp : 
r10:   r9 : def6ec58  r8 : dd289e54
r7 :   r6 : c0abb234  r5 :   r4 : c0d26a30
r3 : ddab5080  r2 :   r1 : c0abb234  r0 : 
Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: dd11c06a  DAC: 0051
Process sh (pid: 496, stack limit = 0xe13a592d)
Stack: (0xdd289dac to 0xdd28a000)
9da0:c0d26a30  c0abb234  c02e4ac8
9dc0:  c0976b44 def6ec00 dea53810 dd289e54 c02e864c c0a61a48 c0a4a5ec
9de0: c0d630a8 def6ec00 c0d04c48 c02e86e0 def6ec00 de909338 c0d04c48 c05833b0
9e00:  c0638144 dd289e54 def59900  475b3ee5 def6ec00 
9e20: def6ec00 def59b80 dd289e54 def59900  c05835f8 def6ec00 c0638dac
9e40: 000a dea53810 c0d04c48 c058c580 dea53810 def59500 def59b80 475b3ee5
9e60: ddc63e00 dea53810 dea3fe10 c0d63a0c dea53810 ddc63e00 dd289f78 dd240d10
9e80:  c0588a44 c0d59a20 000d c0d63a0c c0586840 000d dd240d00
9ea0:   ddc63e00 c02e64e8   c0d04c48 dd9bbcc0
9ec0: c02e6400 dd289f78 

[PATCH 5.2 137/313] PM / devfreq: Fix kernel oops on governor module load

2019-10-03 Thread Greg Kroah-Hartman
From: Ezequiel Garcia 

[ Upstream commit 7544fd7f384591038646d3cd9efb311ab4509e24 ]

A bit unexpectedly (but still documented), request_module may
return a positive value, in case of a modprobe error.
This is currently causing issues in the devfreq framework.

When a request_module exits with a positive value, we currently
return that via ERR_PTR. However, because the value is positive,
it's not a ERR_VALUE proper, and is therefore treated as a
valid struct devfreq_governor pointer, leading to a kernel oops.

Fix this by returning -EINVAL if request_module returns a positive
value.

Fixes: b53b0128052ff ("PM / devfreq: Fix static checker warning in 
try_then_request_governor")
Signed-off-by: Ezequiel Garcia 
Reviewed-by: Chanwoo Choi 
Signed-off-by: MyungJoo Ham 
Signed-off-by: Sasha Levin 
---
 drivers/devfreq/devfreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index ab22bf8a12d69..a0e19802149fc 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -254,7 +254,7 @@ static struct devfreq_governor 
*try_then_request_governor(const char *name)
/* Restore previous state before return */
mutex_lock(&devfreq_list_lock);
if (err)
-   return ERR_PTR(err);
+   return (err < 0) ? ERR_PTR(err) : ERR_PTR(-EINVAL);
 
governor = find_devfreq_governor(name);
}
-- 
2.20.1
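
The failure mode described in the commit message above can be reproduced outside the kernel. What follows is a minimal user-space sketch, assuming simplified stand-ins (toy_err_ptr, toy_is_err, toy_ptr_err and TOY_MAX_ERRNO are hypothetical names, not the kernel's err.h helpers). It shows why a positive return value survives an IS_ERR()-style check and how the one-line change in the patch avoids that.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in: only values -4095..-1 are treated as errors. */
#define TOY_MAX_ERRNO 4095

/* Encode an error code in a pointer value. */
void *toy_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only if the pointer value falls in the top TOY_MAX_ERRNO addresses. */
int toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-TOY_MAX_ERRNO);
}

/* Decode the error code again. */
long toy_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Pre-fix shape: any non-zero err is passed through unchanged. */
void *toy_lookup_unfixed(int err)
{
	return toy_err_ptr(err);
}

/* Post-fix shape, mirroring the changed line in the patch (err is non-zero here). */
void *toy_lookup_fixed(int err)
{
	return (err < 0) ? toy_err_ptr(err) : toy_err_ptr(-EINVAL);
}

/* Small self-check showing both behaviours side by side. */
void toy_errptr_selfcheck(void)
{
	/* A positive value looks like a valid pointer to the caller. */
	assert(!toy_is_err(toy_lookup_unfixed(256)));
	/* With the fix, the caller sees -EINVAL instead. */
	assert(toy_is_err(toy_lookup_fixed(256)));
	assert(toy_ptr_err(toy_lookup_fixed(256)) == -EINVAL);
	/* Negative error codes are passed through untouched. */
	assert(toy_ptr_err(toy_lookup_fixed(-ENOENT)) == -ENOENT);
}
```

The value 256 is only an illustrative positive return; the point is that any positive value is outside the error range and would later be dereferenced as a governor pointer.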





[PATCH 5.3 152/344] PM / devfreq: Fix kernel oops on governor module load

2019-10-03 Thread Greg Kroah-Hartman
From: Ezequiel Garcia 

[ Upstream commit 7544fd7f384591038646d3cd9efb311ab4509e24 ]

A bit unexpectedly (but still documented), request_module may
return a positive value, in case of a modprobe error.
This is currently causing issues in the devfreq framework.

When a request_module exits with a positive value, we currently
return that via ERR_PTR. However, because the value is positive,
it's not a ERR_VALUE proper, and is therefore treated as a
valid struct devfreq_governor pointer, leading to a kernel oops.

Fix this by returning -EINVAL if request_module returns a positive
value.

Fixes: b53b0128052ff ("PM / devfreq: Fix static checker warning in 
try_then_request_governor")
Signed-off-by: Ezequiel Garcia 
Reviewed-by: Chanwoo Choi 
Signed-off-by: MyungJoo Ham 
Signed-off-by: Sasha Levin 
---
 drivers/devfreq/devfreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index ab22bf8a12d69..a0e19802149fc 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -254,7 +254,7 @@ static struct devfreq_governor 
*try_then_request_governor(const char *name)
/* Restore previous state before return */
mutex_lock(&devfreq_list_lock);
if (err)
-   return ERR_PTR(err);
+   return (err < 0) ? ERR_PTR(err) : ERR_PTR(-EINVAL);
 
governor = find_devfreq_governor(name);
}
-- 
2.20.1





[PATCH AUTOSEL 5.2 18/63] drm/amdgpu: Fix KFD-related kernel oops on Hawaii

2019-10-01 Thread Sasha Levin
From: Felix Kuehling 

[ Upstream commit dcafbd50f2e4d5cc964aae409fb5691b743fba23 ]

Hawaii needs to flush caches explicitly, submitting an IB in a user
VMID from kernel mode. There is no s_fence in this case.

Fixes: eb3961a57424 ("drm/amdgpu: remove fence context from the job")
Signed-off-by: Felix Kuehling 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index fe393a46f8811..5eed2423dbb5e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -141,7 +141,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
/* ring tests don't use a job */
if (job) {
vm = job->vm;
-   fence_ctx = job->base.s_fence->scheduled.context;
+   fence_ctx = job->base.s_fence ?
+   job->base.s_fence->scheduled.context : 0;
} else {
vm = NULL;
fence_ctx = 0;
-- 
2.20.1
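
The guard added by the patch above is a plain NULL check before a nested dereference. As a minimal user-space sketch, assuming toy structures (toy_fence, toy_sched_fence and toy_job are hypothetical and carry only the fields needed here, not the real amdgpu or scheduler layouts):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; the real structures have many more fields. */
struct toy_fence { uint64_t context; };
struct toy_sched_fence { struct toy_fence scheduled; };
struct toy_job { struct toy_sched_fence *s_fence; };

/*
 * Return the fence context for a job, or 0 when there is no job or the
 * job has no scheduler fence (the case the commit message describes).
 */
uint64_t toy_fence_ctx(const struct toy_job *job)
{
	if (job && job->s_fence)
		return job->s_fence->scheduled.context;
	return 0;
}

/* Small self-check covering the three cases. */
void toy_fence_ctx_selfcheck(void)
{
	struct toy_sched_fence sf = { .scheduled = { .context = 42 } };
	struct toy_job with_fence = { .s_fence = &sf };
	struct toy_job without_fence = { .s_fence = NULL };

	assert(toy_fence_ctx(&with_fence) == 42);
	assert(toy_fence_ctx(&without_fence) == 0); /* would have oopsed unguarded */
	assert(toy_fence_ctx(NULL) == 0);
}
```

Falling back to 0 matches what the patched code does for the no-job branch, so both "no job" and "job without s_fence" end up with the same fence context.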



[PATCH AUTOSEL 4.19 13/43] drm/amdgpu: Fix KFD-related kernel oops on Hawaii

2019-10-01 Thread Sasha Levin
From: Felix Kuehling 

[ Upstream commit dcafbd50f2e4d5cc964aae409fb5691b743fba23 ]

Hawaii needs to flush caches explicitly, submitting an IB in a user
VMID from kernel mode. There is no s_fence in this case.

Fixes: eb3961a57424 ("drm/amdgpu: remove fence context from the job")
Signed-off-by: Felix Kuehling 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 51b5e977ca885..f4e9d1b10e3ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -139,7 +139,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
/* ring tests don't use a job */
if (job) {
vm = job->vm;
-   fence_ctx = job->base.s_fence->scheduled.context;
+   fence_ctx = job->base.s_fence ?
+   job->base.s_fence->scheduled.context : 0;
} else {
vm = NULL;
fence_ctx = 0;
-- 
2.20.1



[PATCH AUTOSEL 5.3 119/203] PM / devfreq: Fix kernel oops on governor module load

2019-09-22 Thread Sasha Levin
From: Ezequiel Garcia 

[ Upstream commit 7544fd7f384591038646d3cd9efb311ab4509e24 ]

A bit unexpectedly (but still documented), request_module may
return a positive value, in case of a modprobe error.
This is currently causing issues in the devfreq framework.

When a request_module exits with a positive value, we currently
return that via ERR_PTR. However, because the value is positive,
it's not a ERR_VALUE proper, and is therefore treated as a
valid struct devfreq_governor pointer, leading to a kernel oops.

Fix this by returning -EINVAL if request_module returns a positive
value.

Fixes: b53b0128052ff ("PM / devfreq: Fix static checker warning in 
try_then_request_governor")
Signed-off-by: Ezequiel Garcia 
Reviewed-by: Chanwoo Choi 
Signed-off-by: MyungJoo Ham 
Signed-off-by: Sasha Levin 
---
 drivers/devfreq/devfreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index ab22bf8a12d69..a0e19802149fc 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -254,7 +254,7 @@ static struct devfreq_governor 
*try_then_request_governor(const char *name)
/* Restore previous state before return */
mutex_lock(&devfreq_list_lock);
if (err)
-   return ERR_PTR(err);
+   return (err < 0) ? ERR_PTR(err) : ERR_PTR(-EINVAL);
 
governor = find_devfreq_governor(name);
}
-- 
2.20.1



[PATCH AUTOSEL 5.2 108/185] PM / devfreq: Fix kernel oops on governor module load

2019-09-22 Thread Sasha Levin
From: Ezequiel Garcia 

[ Upstream commit 7544fd7f384591038646d3cd9efb311ab4509e24 ]

A bit unexpectedly (but still documented), request_module may
return a positive value, in case of a modprobe error.
This is currently causing issues in the devfreq framework.

When a request_module exits with a positive value, we currently
return that via ERR_PTR. However, because the value is positive,
it's not a ERR_VALUE proper, and is therefore treated as a
valid struct devfreq_governor pointer, leading to a kernel oops.

Fix this by returning -EINVAL if request_module returns a positive
value.

Fixes: b53b0128052ff ("PM / devfreq: Fix static checker warning in 
try_then_request_governor")
Signed-off-by: Ezequiel Garcia 
Reviewed-by: Chanwoo Choi 
Signed-off-by: MyungJoo Ham 
Signed-off-by: Sasha Levin 
---
 drivers/devfreq/devfreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index ab22bf8a12d69..a0e19802149fc 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -254,7 +254,7 @@ static struct devfreq_governor 
*try_then_request_governor(const char *name)
/* Restore previous state before return */
mutex_lock(&devfreq_list_lock);
if (err)
-   return ERR_PTR(err);
+   return (err < 0) ? ERR_PTR(err) : ERR_PTR(-EINVAL);
 
governor = find_devfreq_governor(name);
}
-- 
2.20.1



[PATCH 5.2 081/162] SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL

2019-08-27 Thread Greg Kroah-Hartman
[ Upstream commit ee9d66182392695535cc9fccfcb40c16f72de2a9 ]

Fix kernel oops when mounting a encryptData CIFS share with
CONFIG_DEBUG_VIRTUAL

Signed-off-by: Sebastien Tisserant 
Reviewed-by: Pavel Shilovsky 
Signed-off-by: Steve French 
Signed-off-by: Sasha Levin 
---
 fs/cifs/smb2ops.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index ae10d6e297c3a..42de31d206169 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -3439,7 +3439,15 @@ fill_transform_hdr(struct smb2_transform_hdr *tr_hdr, 
unsigned int orig_len,
 static inline void smb2_sg_set_buf(struct scatterlist *sg, const void *buf,
   unsigned int buflen)
 {
-   sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
+   void *addr;
+   /*
+* VMAP_STACK (at least) puts stack into the vmalloc address space
+*/
+   if (is_vmalloc_addr(buf))
+   addr = vmalloc_to_page(buf);
+   else
+   addr = virt_to_page(buf);
+   sg_set_page(sg, addr, buflen, offset_in_page(buf));
 }
 
 /* Assumes the first rqst has a transform header as the first iov.
-- 
2.20.1
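
The branch added by the patch above picks a different address translation depending on where the buffer lives. The sketch below is a toy user-space model of that idea only: the address ranges, the page size and the lookup table are all hypothetical, and none of the toy_ functions are kernel APIs. It illustrates why arithmetic that is valid for the linear map gives a wrong page for a vmalloc-range address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: a linear map plus a separate vmalloc window. */
#define TOY_PAGE_SHIFT    12
#define TOY_LINEAR_BASE   0x80000000u
#define TOY_VMALLOC_BASE  0xf0000000u
#define TOY_VMALLOC_END   0xff000000u
#define TOY_VMALLOC_PAGES 4

/* Page frames backing the first vmalloc pages; deliberately not contiguous. */
static const uint32_t toy_vmalloc_pfn[TOY_VMALLOC_PAGES] = {
	0x123, 0x9a, 0x456, 0x77
};

int toy_is_vmalloc_addr(uint32_t addr)
{
	return addr >= TOY_VMALLOC_BASE && addr < TOY_VMALLOC_END;
}

/* Valid only for linear-map addresses: pure arithmetic. */
uint32_t toy_linear_to_pfn(uint32_t addr)
{
	return (addr - TOY_LINEAR_BASE) >> TOY_PAGE_SHIFT;
}

/* Valid only for vmalloc addresses: needs a (toy) table lookup. */
uint32_t toy_vmalloc_to_pfn(uint32_t addr)
{
	uint32_t idx = (addr - TOY_VMALLOC_BASE) >> TOY_PAGE_SHIFT;

	assert(idx < TOY_VMALLOC_PAGES);
	return toy_vmalloc_pfn[idx];
}

/* Same branch structure as the patched smb2_sg_set_buf(). */
uint32_t toy_buf_to_pfn(uint32_t addr)
{
	if (toy_is_vmalloc_addr(addr))
		return toy_vmalloc_to_pfn(addr);
	return toy_linear_to_pfn(addr);
}
```

In this model, toy_buf_to_pfn(0xf0001234u) yields the table entry for the second vmalloc page, while applying toy_linear_to_pfn() to the same address yields an unrelated frame number, which is the kind of mismatch the unpatched code ran into.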





[PATCH 4.19 40/98] SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL

2019-08-27 Thread Greg Kroah-Hartman
[ Upstream commit ee9d66182392695535cc9fccfcb40c16f72de2a9 ]

Fix kernel oops when mounting a encryptData CIFS share with
CONFIG_DEBUG_VIRTUAL

Signed-off-by: Sebastien Tisserant 
Reviewed-by: Pavel Shilovsky 
Signed-off-by: Steve French 
Signed-off-by: Sasha Levin 
---
 fs/cifs/smb2ops.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 97fdbec54db97..cc9e846a38658 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -2545,7 +2545,15 @@ fill_transform_hdr(struct smb2_transform_hdr *tr_hdr, 
unsigned int orig_len,
 static inline void smb2_sg_set_buf(struct scatterlist *sg, const void *buf,
   unsigned int buflen)
 {
-   sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
+   void *addr;
+   /*
+* VMAP_STACK (at least) puts stack into the vmalloc address space
+*/
+   if (is_vmalloc_addr(buf))
+   addr = vmalloc_to_page(buf);
+   else
+   addr = virt_to_page(buf);
+   sg_set_page(sg, addr, buflen, offset_in_page(buf));
 }
 
 /* Assumes the first rqst has a transform header as the first iov.
-- 
2.20.1





[PATCH 4.14 23/62] SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL

2019-08-27 Thread Greg Kroah-Hartman
[ Upstream commit ee9d66182392695535cc9fccfcb40c16f72de2a9 ]

Fix kernel oops when mounting a encryptData CIFS share with
CONFIG_DEBUG_VIRTUAL

Signed-off-by: Sebastien Tisserant 
Reviewed-by: Pavel Shilovsky 
Signed-off-by: Steve French 
Signed-off-by: Sasha Levin 
---
 fs/cifs/smb2ops.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 23326b0cd5628..58a502e622aa4 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -2168,7 +2168,15 @@ fill_transform_hdr(struct smb2_transform_hdr *tr_hdr, 
struct smb_rqst *old_rq)
 static inline void smb2_sg_set_buf(struct scatterlist *sg, const void *buf,
   unsigned int buflen)
 {
-   sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
+   void *addr;
+   /*
+* VMAP_STACK (at least) puts stack into the vmalloc address space
+*/
+   if (is_vmalloc_addr(buf))
+   addr = vmalloc_to_page(buf);
+   else
+   addr = virt_to_page(buf);
+   sg_set_page(sg, addr, buflen, offset_in_page(buf));
 }
 
 static struct scatterlist *
-- 
2.20.1





[PATCH AUTOSEL 5.2 091/123] SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL

2019-08-13 Thread Sasha Levin
From: Sebastien Tisserant 

[ Upstream commit ee9d66182392695535cc9fccfcb40c16f72de2a9 ]

Fix kernel oops when mounting a encryptData CIFS share with
CONFIG_DEBUG_VIRTUAL

Signed-off-by: Sebastien Tisserant 
Reviewed-by: Pavel Shilovsky 
Signed-off-by: Steve French 
Signed-off-by: Sasha Levin 
---
 fs/cifs/smb2ops.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index ae10d6e297c3a..42de31d206169 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -3439,7 +3439,15 @@ fill_transform_hdr(struct smb2_transform_hdr *tr_hdr, 
unsigned int orig_len,
 static inline void smb2_sg_set_buf(struct scatterlist *sg, const void *buf,
   unsigned int buflen)
 {
-   sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
+   void *addr;
+   /*
+* VMAP_STACK (at least) puts stack into the vmalloc address space
+*/
+   if (is_vmalloc_addr(buf))
+   addr = vmalloc_to_page(buf);
+   else
+   addr = virt_to_page(buf);
+   sg_set_page(sg, addr, buflen, offset_in_page(buf));
 }
 
 /* Assumes the first rqst has a transform header as the first iov.
-- 
2.20.1



[PATCH AUTOSEL 4.19 47/68] SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL

2019-08-13 Thread Sasha Levin
From: Sebastien Tisserant 

[ Upstream commit ee9d66182392695535cc9fccfcb40c16f72de2a9 ]

Fix kernel oops when mounting a encryptData CIFS share with
CONFIG_DEBUG_VIRTUAL

Signed-off-by: Sebastien Tisserant 
Reviewed-by: Pavel Shilovsky 
Signed-off-by: Steve French 
Signed-off-by: Sasha Levin 
---
 fs/cifs/smb2ops.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 97fdbec54db97..cc9e846a38658 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -2545,7 +2545,15 @@ fill_transform_hdr(struct smb2_transform_hdr *tr_hdr, 
unsigned int orig_len,
 static inline void smb2_sg_set_buf(struct scatterlist *sg, const void *buf,
   unsigned int buflen)
 {
-   sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
+   void *addr;
+   /*
+* VMAP_STACK (at least) puts stack into the vmalloc address space
+*/
+   if (is_vmalloc_addr(buf))
+   addr = vmalloc_to_page(buf);
+   else
+   addr = virt_to_page(buf);
+   sg_set_page(sg, addr, buflen, offset_in_page(buf));
 }
 
 /* Assumes the first rqst has a transform header as the first iov.
-- 
2.20.1



[PATCH AUTOSEL 4.14 29/44] SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL

2019-08-13 Thread Sasha Levin
From: Sebastien Tisserant 

[ Upstream commit ee9d66182392695535cc9fccfcb40c16f72de2a9 ]

Fix kernel oops when mounting a encryptData CIFS share with
CONFIG_DEBUG_VIRTUAL

Signed-off-by: Sebastien Tisserant 
Reviewed-by: Pavel Shilovsky 
Signed-off-by: Steve French 
Signed-off-by: Sasha Levin 
---
 fs/cifs/smb2ops.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 23326b0cd5628..58a502e622aa4 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -2168,7 +2168,15 @@ fill_transform_hdr(struct smb2_transform_hdr *tr_hdr, 
struct smb_rqst *old_rq)
 static inline void smb2_sg_set_buf(struct scatterlist *sg, const void *buf,
   unsigned int buflen)
 {
-   sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
+   void *addr;
+   /*
+* VMAP_STACK (at least) puts stack into the vmalloc address space
+*/
+   if (is_vmalloc_addr(buf))
+   addr = vmalloc_to_page(buf);
+   else
+   addr = virt_to_page(buf);
+   sg_set_page(sg, addr, buflen, offset_in_page(buf));
 }
 
 static struct scatterlist *
-- 
2.20.1



[PATCH 5.1 33/96] ASoC: Intel: cht_bsw_nau8824: fix kernel oops with platform_name override

2019-07-08 Thread Greg Kroah-Hartman
[ Upstream commit 096701e8131425044d2054a0c210d6ea24ee7386 ]

The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.

Fixes: 4506db8043341 ("ASoC: Intel: cht_bsw_nau8824: platform name fixup 
support")
Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/intel/boards/cht_bsw_nau8824.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/boards/cht_bsw_nau8824.c 
b/sound/soc/intel/boards/cht_bsw_nau8824.c
index 02c2fa239331..20fae391c75a 100644
--- a/sound/soc/intel/boards/cht_bsw_nau8824.c
+++ b/sound/soc/intel/boards/cht_bsw_nau8824.c
@@ -257,6 +257,7 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
snd_soc_card_set_drvdata(&snd_soc_card_cht, drv);
 
/* override plaform name, if required */
+   snd_soc_card_cht.dev = &pdev->dev;
mach = (&pdev->dev)->platform_data;
platform_name = mach->mach_params.platform;
 
@@ -266,7 +267,6 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
return ret_val;
 
/* register the soc card */
-   snd_soc_card_cht.dev = &pdev->dev;
ret_val = devm_snd_soc_register_card(&pdev->dev, &snd_soc_card_cht);
if (ret_val) {
dev_err(&pdev->dev,
-- 
2.20.1
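
The ordering problem fixed by the patch above (and by its sibling patches for the other Intel machine drivers) can be shown with a small user-space model. This is a sketch under stated assumptions: toy_device, toy_card, toy_devm_strdup and toy_fixup_platform_name are hypothetical names, the platform name string is made up, and the toy returns an error where the commit message reports an oops.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Toy device that owns a singly linked list of managed allocations. */
struct toy_res { struct toy_res *next; char data[]; };
struct toy_device { struct toy_res *resources; };
struct toy_card { struct toy_device *dev; const char *platform_name; };

/* Managed strdup: the copy is tied to dev and freed with it. */
char *toy_devm_strdup(struct toy_device *dev, const char *s)
{
	size_t len = strlen(s) + 1;
	struct toy_res *r;

	if (!dev)
		return NULL; /* the toy refuses; it does not crash */
	r = malloc(sizeof(*r) + len);
	if (!r)
		return NULL;
	memcpy(r->data, s, len);
	r->next = dev->resources;
	dev->resources = r;
	return r->data;
}

/* Override the platform name; needs card->dev to be set beforehand. */
int toy_fixup_platform_name(struct toy_card *card, const char *name)
{
	char *copy;

	if (!name)
		return 0; /* nothing to override */
	if (!card->dev)
		return -EINVAL; /* card device not initialized yet */
	copy = toy_devm_strdup(card->dev, name);
	if (!copy)
		return -ENOMEM;
	card->platform_name = copy;
	return 0;
}

/* Free everything that was allocated on behalf of dev. */
void toy_device_release(struct toy_device *dev)
{
	while (dev->resources) {
		struct toy_res *r = dev->resources;

		dev->resources = r->next;
		free(r);
	}
}
```

Calling toy_fixup_platform_name() before assigning card.dev fails, and calling it afterwards succeeds, which is the same reordering the patch applies by moving the .dev assignment ahead of the platform name override.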





[PATCH 5.1 31/96] ASoC: Intel: cht_bsw_max98090: fix kernel oops with platform_name override

2019-07-08 Thread Greg Kroah-Hartman
[ Upstream commit fb54555134b9b17835545e4d096b5550c27eed64 ]

The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.

Fixes: 7e7e24d7c7ff0 ("ASoC: Intel: cht_bsw_max98090_ti: platform name fixup 
support")
Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/intel/boards/cht_bsw_max98090_ti.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/boards/cht_bsw_max98090_ti.c 
b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
index c0e0844f75b9..572e336ae0f9 100644
--- a/sound/soc/intel/boards/cht_bsw_max98090_ti.c
+++ b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
@@ -454,6 +454,7 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
}
 
/* override plaform name, if required */
+   snd_soc_card_cht.dev = &pdev->dev;
mach = (&pdev->dev)->platform_data;
platform_name = mach->mach_params.platform;
 
@@ -463,7 +464,6 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
return ret_val;
 
/* register the soc card */
-   snd_soc_card_cht.dev = &pdev->dev;
snd_soc_card_set_drvdata(&snd_soc_card_cht, drv);
 
if (drv->quirks & QUIRK_PMC_PLT_CLK_0)
-- 
2.20.1





[PATCH 5.1 34/96] ASoC: Intel: cht_bsw_rt5672: fix kernel oops with platform_name override

2019-07-08 Thread Greg Kroah-Hartman
[ Upstream commit 9bbc799318a34061703f2a980e2b6df7fc6760f0 ]

The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.

Fixes: f403906da05cd ("ASoC: Intel: cht_bsw_rt5672: platform name fixup 
support")
Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/intel/boards/cht_bsw_rt5672.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/boards/cht_bsw_rt5672.c 
b/sound/soc/intel/boards/cht_bsw_rt5672.c
index 3d5a2b3a06f0..87ce3857376d 100644
--- a/sound/soc/intel/boards/cht_bsw_rt5672.c
+++ b/sound/soc/intel/boards/cht_bsw_rt5672.c
@@ -425,6 +425,7 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
}
 
/* override plaform name, if required */
+   snd_soc_card_cht.dev = &pdev->dev;
platform_name = mach->mach_params.platform;
 
ret_val = snd_soc_fixup_dai_links_platform_name(&snd_soc_card_cht,
@@ -442,7 +443,6 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
snd_soc_card_set_drvdata(&snd_soc_card_cht, drv);
 
/* register the soc card */
-   snd_soc_card_cht.dev = &pdev->dev;
ret_val = devm_snd_soc_register_card(&pdev->dev, &snd_soc_card_cht);
if (ret_val) {
dev_err(&pdev->dev,
-- 
2.20.1





[PATCH 5.1 32/96] ASoC: Intel: bytcht_es8316: fix kernel oops with platform_name override

2019-07-08 Thread Greg Kroah-Hartman
[ Upstream commit 79136a016add1acb690fe8d96be50dd22a143d26 ]

The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.

Fixes: e4bc6b1195f64 ("ASoC: Intel: bytcht_es8316: platform name fixup support")
Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/intel/boards/bytcht_es8316.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/boards/bytcht_es8316.c 
b/sound/soc/intel/boards/bytcht_es8316.c
index d2a7e6ba11ae..1c686f83220a 100644
--- a/sound/soc/intel/boards/bytcht_es8316.c
+++ b/sound/soc/intel/boards/bytcht_es8316.c
@@ -471,6 +471,7 @@ static int snd_byt_cht_es8316_mc_probe(struct 
platform_device *pdev)
}
 
/* override plaform name, if required */
+   byt_cht_es8316_card.dev = dev;
platform_name = mach->mach_params.platform;
 
ret = snd_soc_fixup_dai_links_platform_name(&byt_cht_es8316_card,
@@ -538,7 +539,6 @@ static int snd_byt_cht_es8316_mc_probe(struct 
platform_device *pdev)
 (quirk & BYT_CHT_ES8316_MONO_SPEAKER) ? "mono" : "stereo",
 mic_name[BYT_CHT_ES8316_MAP(quirk)]);
byt_cht_es8316_card.long_name = long_name;
-   byt_cht_es8316_card.dev = dev;
snd_soc_card_set_drvdata(&byt_cht_es8316_card, priv);
 
ret = devm_snd_soc_register_card(dev, &byt_cht_es8316_card);
-- 
2.20.1





[PATCH AUTOSEL 5.1 30/51] ASoC: Intel: cht_bsw_rt5672: fix kernel oops with platform_name override

2019-06-25 Thread Sasha Levin
From: Pierre-Louis Bossart 

[ Upstream commit 9bbc799318a34061703f2a980e2b6df7fc6760f0 ]

The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.

Fixes: f403906da05cd ("ASoC: Intel: cht_bsw_rt5672: platform name fixup 
support")
Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/intel/boards/cht_bsw_rt5672.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/boards/cht_bsw_rt5672.c 
b/sound/soc/intel/boards/cht_bsw_rt5672.c
index 3d5a2b3a06f0..87ce3857376d 100644
--- a/sound/soc/intel/boards/cht_bsw_rt5672.c
+++ b/sound/soc/intel/boards/cht_bsw_rt5672.c
@@ -425,6 +425,7 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
}
 
/* override plaform name, if required */
+   snd_soc_card_cht.dev = &pdev->dev;
platform_name = mach->mach_params.platform;
 
ret_val = snd_soc_fixup_dai_links_platform_name(&snd_soc_card_cht,
@@ -442,7 +443,6 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
snd_soc_card_set_drvdata(&snd_soc_card_cht, drv);
 
/* register the soc card */
-   snd_soc_card_cht.dev = &pdev->dev;
ret_val = devm_snd_soc_register_card(&pdev->dev, &snd_soc_card_cht);
if (ret_val) {
dev_err(&pdev->dev,
-- 
2.20.1



[PATCH AUTOSEL 5.1 27/51] ASoC: Intel: cht_bsw_max98090: fix kernel oops with platform_name override

2019-06-25 Thread Sasha Levin
From: Pierre-Louis Bossart 

[ Upstream commit fb54555134b9b17835545e4d096b5550c27eed64 ]

The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.

Fixes: 7e7e24d7c7ff0 ("ASoC: Intel: cht_bsw_max98090_ti: platform name fixup 
support")
Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/intel/boards/cht_bsw_max98090_ti.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/boards/cht_bsw_max98090_ti.c 
b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
index c0e0844f75b9..572e336ae0f9 100644
--- a/sound/soc/intel/boards/cht_bsw_max98090_ti.c
+++ b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
@@ -454,6 +454,7 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
}
 
/* override plaform name, if required */
+   snd_soc_card_cht.dev = &pdev->dev;
mach = (&pdev->dev)->platform_data;
platform_name = mach->mach_params.platform;
 
@@ -463,7 +464,6 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
return ret_val;
 
/* register the soc card */
-   snd_soc_card_cht.dev = &pdev->dev;
snd_soc_card_set_drvdata(&snd_soc_card_cht, drv);
 
if (drv->quirks & QUIRK_PMC_PLT_CLK_0)
-- 
2.20.1



[PATCH AUTOSEL 5.1 28/51] ASoC: Intel: bytcht_es8316: fix kernel oops with platform_name override

2019-06-25 Thread Sasha Levin
From: Pierre-Louis Bossart 

[ Upstream commit 79136a016add1acb690fe8d96be50dd22a143d26 ]

The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.

Fixes: e4bc6b1195f64 ("ASoC: Intel: bytcht_es8316: platform name fixup support")
Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/intel/boards/bytcht_es8316.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/boards/bytcht_es8316.c 
b/sound/soc/intel/boards/bytcht_es8316.c
index d2a7e6ba11ae..1c686f83220a 100644
--- a/sound/soc/intel/boards/bytcht_es8316.c
+++ b/sound/soc/intel/boards/bytcht_es8316.c
@@ -471,6 +471,7 @@ static int snd_byt_cht_es8316_mc_probe(struct 
platform_device *pdev)
}
 
/* override plaform name, if required */
+   byt_cht_es8316_card.dev = dev;
platform_name = mach->mach_params.platform;
 
ret = snd_soc_fixup_dai_links_platform_name(&byt_cht_es8316_card,
@@ -538,7 +539,6 @@ static int snd_byt_cht_es8316_mc_probe(struct 
platform_device *pdev)
 (quirk & BYT_CHT_ES8316_MONO_SPEAKER) ? "mono" : "stereo",
 mic_name[BYT_CHT_ES8316_MAP(quirk)]);
byt_cht_es8316_card.long_name = long_name;
-   byt_cht_es8316_card.dev = dev;
snd_soc_card_set_drvdata(&byt_cht_es8316_card, priv);
 
ret = devm_snd_soc_register_card(dev, &byt_cht_es8316_card);
-- 
2.20.1



[PATCH AUTOSEL 5.1 29/51] ASoC: Intel: cht_bsw_nau8824: fix kernel oops with platform_name override

2019-06-25 Thread Sasha Levin
From: Pierre-Louis Bossart 

[ Upstream commit 096701e8131425044d2054a0c210d6ea24ee7386 ]

The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.

Fixes: 4506db8043341 ("ASoC: Intel: cht_bsw_nau8824: platform name fixup 
support")
Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/intel/boards/cht_bsw_nau8824.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/boards/cht_bsw_nau8824.c 
b/sound/soc/intel/boards/cht_bsw_nau8824.c
index 02c2fa239331..20fae391c75a 100644
--- a/sound/soc/intel/boards/cht_bsw_nau8824.c
+++ b/sound/soc/intel/boards/cht_bsw_nau8824.c
@@ -257,6 +257,7 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
snd_soc_card_set_drvdata(&snd_soc_card_cht, drv);
 
/* override plaform name, if required */
+   snd_soc_card_cht.dev = &pdev->dev;
mach = (&pdev->dev)->platform_data;
platform_name = mach->mach_params.platform;
 
@@ -266,7 +267,6 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
return ret_val;
 
/* register the soc card */
-   snd_soc_card_cht.dev = &pdev->dev;
ret_val = devm_snd_soc_register_card(&pdev->dev, &snd_soc_card_cht);
if (ret_val) {
dev_err(&pdev->dev,
-- 
2.20.1



bpf: test_btf : kernel Oops: 207 : PC is at memcpy+0xc0/0x330

2019-06-04 Thread Naresh Kamboju
While running the kernel selftest bpf: test_btf, the following kernel oops
was detected on a beaglebone x15 board.
Linux version 5.2.0-rc3-next-20190604

Full test log link can be found below [1]

bpf: test_btf_ #

# BTF GET_INFO test[3] (Large bpf_btf_info) OK
GET_INFO: test[3]_(Large #
# BTF GET_INFO test[4] (BTF ID) OK
GET_INFO: test[4]_(BTF #
[  341.144885] 8<--- cut here ---
[  341.148164] Unable to handle kernel NULL pointer dereference at
virtual address 
[  341.156443] pgd = b0902156
[  341.159294] [] *pgd=9655e003, *pmd=ff918003
[  341.164229] Internal error: Oops: 207 [#1] SMP ARM
[  341.169052] Modules linked in: tun sha1_generic sha1_arm_neon
sha1_arm algif_hash af_alg snd_soc_simple_card
snd_soc_simple_card_utils snd_soc_core ac97_bus snd_pcm_dmaengine
snd_pcm snd_timer snd soundcore fuse
[  341.187962] CPU: 0 PID: 6773 Comm: test_sockmap Not tainted
5.2.0-rc3-next-20190604 #1
[  341.195923] Hardware name: Generic DRA74X (Flattened Device Tree)
[  341.202058] PC is at memcpy+0xc0/0x330
[  341.205836] LR is at bpf_msg_push_data+0x70c/0x728
[  341.210654] pc : []lr : []psr: 800b0013
[  341.216957] sp : e99ad6cc  ip : 0002  fp : e99ad83c
[  341.12] r10: d1bdc000  r9 : 0001  r8 : 
[  341.227467] r7 : cd1de000  r6 :   r5 : d1bdc000  r4 : 
[  341.234032] r3 :   r2 : 8000  r1 :   r0 : cd1de000
[  341.240597] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  341.247771] Control: 30c5387d  Table: 91b19880  DAC: fffd
[  341.253553] Process test_sockmap (pid: 6773, stack limit = 0x3ad4028c)
[  341.260118] Stack: (0xe99ad6cc to 0xe99ae000)
[  341.264502] d6c0:cd1de000 
c10ea4a4  
[  341.272725] d6e0:     
 ea2759a0 0001
[  341.280948] d700:     
  
[  341.289171] d720:     
  
[  341.297394] d740:     
 e99ad78c d03f0580
[  341.305615] d760: 0004 d03f 0007 03a9 
  
[  341.313836] d780:     d03f
c0581f6c 0060 c1e09fd0
[  341.322060] d7a0: e99ad85c e99ad7b0 c04c1d1c  c06c3638
c1dc19b8 e99ad7ec d03f0540
[  341.330283] d7c0: 0002 d03f 0007 03a9 0001
d03f0560 d03f e3444ce4
[  341.338506] d7e0: 0060 c1e09fd0 e99ad8a4 e99ad7f8 cbc66e00
e99ad8a8 c1419868 c059b69c
[  341.346730] d800:   e99ad824 e99ad818 c04e3b7c
f006b240 e99ad8b8 c1419868
[  341.354954] d820: c10e9d98   c0581ddc e99ad894
e99ad840 c0581f6c c10e9da4
[  341.363175] d840:     
 f5388145 290412b8
[  341.371399] d860: c2432908  f08d7937 c1e08488 f006b028
0011 c11cd828 f006b000
[  341.379620] d880: e99ad9e4 c1fc9e37 e99ad934 e99ad898 c0584910
c0581e48  
[  341.387841] d8a0: 0005 0004 0003 0002 0001
 cbc66ee8 
[  341.396063] d8c0: d1bdc000    
  
[  341.404286] d8e0:   d1bdc000  cbc66ee0
 0007 0006
[  341.412509] d900: 0010 c1fc9e37 e99ad8b8  cf380840
c10fdef4 c11cd828 9fdbe7c7
[  341.420732] d920: d1bdc000 d1bdc000 e99ad984 e99ad938 c10fdf18
c05848d0  
[  341.428954] d940: c10fde14 c0459978 e99ad97c e99ad958 c056165c
e7bdd400 01ff d1bdc000
[  341.437178] d960: e7bdd400 0011 cf380840  e99ad9e4
c1fc9e37 e99ad9cc e99ad988
[  341.445407] d980: c11cd828 c10fde20 c11cda3c  
e99ad9a0  
[  341.453631] d9a0: 0001 e7bdd400 cf380840 eb6d1030 c1e08488
 0003 c1fc9e37
[  341.461855] d9c0: e99adcac e99ad9d0 c11cdb60 c11cd518 
 c11cd90c e8577024
[  341.470078] d9e0: 0020 0001   0001
0001  0001
[  341.478300] da00:     eb6d1030
 0001 
[  341.486522] da20:     
  
[  341.494742] da40:     
  
[  341.502965] da60:     
  
[  341.511184] da80:     
  
[  341.519406] daa0:     
0087 0001 d03f0500
[  341.527628] dac0: d03f c1e47f04  c1e09fd0 e99adb8c
e99adae0 c04c1d1c c04c10d0
[  341.535850] dae0:     
  0078
[  341.544074] db00: d03f04f0 0087   c2432908
c2638640 0087 d03f0500
[  341.552296] db20: c1e08488 c2418b30 406293ec 295bca2f 
 0

Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-20 Thread Aneesh Kumar K.V

On 5/20/19 8:25 PM, Nicholas Piggin wrote:
> Bharata B Rao's on May 21, 2019 12:29 am:
> > On Mon, May 20, 2019 at 01:50:35PM +0530, Bharata B Rao wrote:
> >> On Mon, May 20, 2019 at 05:00:21PM +1000, Nicholas Piggin wrote:
> >> > Bharata B Rao's on May 20, 2019 3:56 pm:
> >> > > On Mon, May 20, 2019 at 02:48:35PM +1000, Nicholas Piggin wrote:
> >> > >> >> > git bisect points to
> >> > >> >> >
> >> > >> >> > commit 4231aba000f5a4583dd9f67057aadb68c3eca99d
> >> > >> >> > Author: Nicholas Piggin
> >> > >> >> > Date:   Fri Jul 27 21:48:17 2018 +1000
> >> > >> >> >
> >> > >> >> > powerpc/64s: Fix page table fragment refcount race vs
> >> > >> >> > speculative references
> >> > >> >> >
> >> > >> >> > The page table fragment allocator uses the main page
> >> > >> >> > refcount racily
> >> > >> >> > with respect to speculative references. A customer observed
> >> > >> >> > a BUG due
> >> > >> >> > to page table page refcount underflow in the fragment
> >> > >> >> > allocator. This
> >> > >> >> > can be caused by the fragment allocator set_page_count
> >> > >> >> > stomping on a
> >> > >> >> > speculative reference, and then the speculative failure
> >> > >> >> > handler
> >> > >> >> > decrements the new reference, and the underflow eventually
> >> > >> >> > pops when
> >> > >> >> > the page tables are freed.
> >> > >> >> >
> >> > >> >> > Fix this by using a dedicated field in the struct page for
> >> > >> >> > the page
> >> > >> >> > table fragment allocator.
> >> > >> >> >
> >> > >> >> > Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory
> >> > >> >> > wastage")
> >> > >> >> > Cc: sta...@vger.kernel.org # v3.10+
> >> > >> >>
> >> > >> >> That's the commit that added the BUG_ON(), so prior to that you
> >> > >> >> won't
> >> > >> >> see the crash.
> >> > >> >
> >> > >> > Right, but the commit says it fixes page table page refcount
> >> > >> > underflow by
> >> > >> > introducing a new field page->pt_frag_refcount. Now we are hitting
> >> > >> > the underflow
> >> > >> > for this pt_frag_refcount.
> >> > >>
> >> > >> The fixed underflow is caused by a bug (race on page count) that got
> >> > >> fixed by that patch. You are hitting a different underflow here. It's
> >> > >> not certain my patch caused it, I'm just trying to reproduce now.
> >> > >
> >> > > Ok.
> >> >
> >> > Can't reproduce I'm afraid, tried adding and removing 8GB memory from a
> >> > 4GB guest (via host adding / removing memory device), and it just works.
> >>
> >> Boot, add 8G, reboot, remove 8G is the sequence to reproduce.
> >>
> >> >
> >> > It's likely to be an edge case like an off by one or rounding error
> >> > that just happens to trigger in your config. Might be easiest if you
> >> > could test with a debug patch.
> >>
> >> Sure, I will continue debugging.
> >
> > When the guest is rebooted after hotplug, the entire memory (which includes
> > the hotplugged memory) gets remapped again freshly. However at this time
> > since no slab is available yet, pt_frag_refcount never gets initialized as
> > we
> > never do pte_fragment_alloc() for these mappings. So we right away hit the
> > underflow during the first unplug itself, it looks like.
>
> Nice catch, good debugging work.
>
> > I will check how this can be fixed.
>
> Tricky problem. What do you think? You might be able to make the early
> page table allocations in the same pattern as the frag allocations, and
> then fill in the struct page metadata when you have those.



I guess we need to do something similar to what x86 does. We need to 
walk the init_mm page table again and re-init struct page and other data 
structures backing the tables?
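
To make the idea concrete, here is a small userspace sketch of such a fix-up walk. It is only a model under stated assumptions: model_pmd, model_table, the initialised flag and model_reinit_tables() are hypothetical names invented for illustration, and none of this is kernel code or the way x86 actually does it.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ENTRIES 4   /* illustrative table size, not a real value */

/* Stands in for the struct page metadata backing one PTE table. */
struct model_table {
    int frag_refcount;    /* models pt_frag_refcount */
    int initialised;      /* hypothetical flag: set by the normal allocator */
};

/* Stands in for one upper-level table; NULL means "nothing mapped". */
struct model_pmd {
    struct model_table *pte_table[MODEL_ENTRIES];
};

/*
 * Walk every slot once and fill in metadata for tables that were
 * created before the normal allocator was available.
 * Returns the number of tables that needed fixing.
 */
int model_reinit_tables(struct model_pmd *pmd, int frags_per_table)
{
    int fixed = 0;

    assert(pmd != NULL);
    for (size_t i = 0; i < MODEL_ENTRIES; i++) {
        struct model_table *t = pmd->pte_table[i];

        if (t == NULL || t->initialised)
            continue;     /* unmapped, or already set up normally */
        t->frag_refcount = frags_per_table;
        t->initialised = 1;
        fixed++;
    }
    return fixed;
}
```

In this model the walk is idempotent: running it a second time finds nothing left to fix, which is the property one would want from a one-off pass after boot.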


-aneesh



Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-20 Thread Bharata B Rao
On Tue, May 21, 2019 at 12:55:49AM +1000, Nicholas Piggin wrote:
> Bharata B Rao's on May 21, 2019 12:29 am:
> > On Mon, May 20, 2019 at 01:50:35PM +0530, Bharata B Rao wrote:
> >> On Mon, May 20, 2019 at 05:00:21PM +1000, Nicholas Piggin wrote:
> >> > Bharata B Rao's on May 20, 2019 3:56 pm:
> >> > > On Mon, May 20, 2019 at 02:48:35PM +1000, Nicholas Piggin wrote:
> >> > >> >> > git bisect points to
> >> > >> >> >
> >> > >> >> > commit 4231aba000f5a4583dd9f67057aadb68c3eca99d
> >> > >> >> > Author: Nicholas Piggin 
> >> > >> >> > Date:   Fri Jul 27 21:48:17 2018 +1000
> >> > >> >> >
> >> > >> >> > powerpc/64s: Fix page table fragment refcount race vs 
> >> > >> >> > speculative references
> >> > >> >> >
> >> > >> >> > The page table fragment allocator uses the main page 
> >> > >> >> > refcount racily
> >> > >> >> > with respect to speculative references. A customer observed 
> >> > >> >> > a BUG due
> >> > >> >> > to page table page refcount underflow in the fragment 
> >> > >> >> > allocator. This
> >> > >> >> > can be caused by the fragment allocator set_page_count 
> >> > >> >> > stomping on a
> >> > >> >> > speculative reference, and then the speculative failure 
> >> > >> >> > handler
> >> > >> >> > decrements the new reference, and the underflow eventually 
> >> > >> >> > pops when
> >> > >> >> > the page tables are freed.
> >> > >> >> >
> >> > >> >> > Fix this by using a dedicated field in the struct page for 
> >> > >> >> > the page
> >> > >> >> > table fragment allocator.
> >> > >> >> >
> >> > >> >> > Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory 
> >> > >> >> > wastage")
> >> > >> >> > Cc: sta...@vger.kernel.org # v3.10+
> >> > >> >> 
> >> > >> >> That's the commit that added the BUG_ON(), so prior to that you 
> >> > >> >> won't
> >> > >> >> see the crash.
> >> > >> > 
> >> > >> > Right, but the commit says it fixes page table page refcount 
> >> > >> > underflow by
> >> > >> > introducing a new field page->pt_frag_refcount. Now we are hitting 
> >> > >> > the underflow
> >> > >> > for this pt_frag_refcount.
> >> > >> 
> >> > >> The fixed underflow is caused by a bug (race on page count) that got 
> >> > >> fixed by that patch. You are hitting a different underflow here. It's
> >> > >> not certain my patch caused it, I'm just trying to reproduce now.
> >> > > 
> >> > > Ok.
> >> > 
> >> > Can't reproduce I'm afraid, tried adding and removing 8GB memory from a
> >> > 4GB guest (via host adding / removing memory device), and it just works.
> >> 
> >> Boot, add 8G, reboot, remove 8G is the sequence to reproduce.
> >> 
> >> > 
> >> > It's likely to be an edge case like an off by one or rounding error
> >> > that just happens to trigger in your config. Might be easiest if you
> >> > could test with a debug patch.
> >> 
> >> Sure, I will continue debugging.
> > 
> > When the guest is rebooted after hotplug, the entire memory (which includes
> > the hotplugged memory) gets remapped again freshly. However at this time
> > since no slab is available yet, pt_frag_refcount never gets initialized as 
> > we
> > never do pte_fragment_alloc() for these mappings. So we right away hit the
> > underflow during the first unplug itself, it looks like.
> 
> Nice catch, good debugging work.

Thanks, with help from Aneesh.

> 
> > I will check how this can be fixed.
> 
> Tricky problem. What do you think? You might be able to make the early 
> page table allocations in the same pattern as the frag allocations, and 
> then fill in the struct page metadata when you have those.

Will explore.

> 
> Other option may be create a new set of page tables after mm comes up
> to replace the early page tables with. That's a bigger hammer though.

Will also check if similar scenario exists on x86 and if so, how and when
pte frag data is fixed there.

Regards,
Bharata.



Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-20 Thread Nicholas Piggin
Bharata B Rao's on May 21, 2019 12:29 am:
> On Mon, May 20, 2019 at 01:50:35PM +0530, Bharata B Rao wrote:
>> On Mon, May 20, 2019 at 05:00:21PM +1000, Nicholas Piggin wrote:
>> > Bharata B Rao's on May 20, 2019 3:56 pm:
>> > > On Mon, May 20, 2019 at 02:48:35PM +1000, Nicholas Piggin wrote:
>> > >> >> > git bisect points to
>> > >> >> >
>> > >> >> > commit 4231aba000f5a4583dd9f67057aadb68c3eca99d
>> > >> >> > Author: Nicholas Piggin 
>> > >> >> > Date:   Fri Jul 27 21:48:17 2018 +1000
>> > >> >> >
>> > >> >> > powerpc/64s: Fix page table fragment refcount race vs 
>> > >> >> > speculative references
>> > >> >> >
>> > >> >> > The page table fragment allocator uses the main page refcount 
>> > >> >> > racily
>> > >> >> > with respect to speculative references. A customer observed a 
>> > >> >> > BUG due
>> > >> >> > to page table page refcount underflow in the fragment 
>> > >> >> > allocator. This
>> > >> >> > can be caused by the fragment allocator set_page_count 
>> > >> >> > stomping on a
>> > >> >> > speculative reference, and then the speculative failure handler
>> > >> >> > decrements the new reference, and the underflow eventually 
>> > >> >> > pops when
>> > >> >> > the page tables are freed.
>> > >> >> >
>> > >> >> > Fix this by using a dedicated field in the struct page for the 
>> > >> >> > page
>> > >> >> > table fragment allocator.
>> > >> >> >
>> > >> >> > Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory 
>> > >> >> > wastage")
>> > >> >> > Cc: sta...@vger.kernel.org # v3.10+
>> > >> >> 
>> > >> >> That's the commit that added the BUG_ON(), so prior to that you won't
>> > >> >> see the crash.
>> > >> > 
>> > >> > Right, but the commit says it fixes page table page refcount 
>> > >> > underflow by
>> > >> > introducing a new field page->pt_frag_refcount. Now we are hitting 
>> > >> > the underflow
>> > >> > for this pt_frag_refcount.
>> > >> 
>> > >> The fixed underflow is caused by a bug (race on page count) that got 
>> > >> fixed by that patch. You are hitting a different underflow here. It's
>> > >> not certain my patch caused it, I'm just trying to reproduce now.
>> > > 
>> > > Ok.
>> > 
>> > Can't reproduce I'm afraid, tried adding and removing 8GB memory from a
>> > 4GB guest (via host adding / removing memory device), and it just works.
>> 
>> Boot, add 8G, reboot, remove 8G is the sequence to reproduce.
>> 
>> > 
>> > It's likely to be an edge case like an off by one or rounding error
>> > that just happens to trigger in your config. Might be easiest if you
>> > could test with a debug patch.
>> 
>> Sure, I will continue debugging.
> 
> When the guest is rebooted after hotplug, the entire memory (which includes
> the hotplugged memory) gets remapped again freshly. However at this time
> since no slab is available yet, pt_frag_refcount never gets initialized as we
> never do pte_fragment_alloc() for these mappings. So we right away hit the
> underflow during the first unplug itself, it looks like.

Nice catch, good debugging work.

> I will check how this can be fixed.

Tricky problem. What do you think? You might be able to make the early 
page table allocations in the same pattern as the frag allocations, and 
then fill in the struct page metadata when you have those.

Other option may be create a new set of page tables after mm comes up
to replace the early page tables with. That's a bigger hammer though.
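
As a rough illustration of this second option, the following userspace sketch copies the entries of an "early" table into one obtained through the normal allocation path and then swaps the pointer. All names here (model_pte_table, model_table_alloc, model_replace_early_table) are hypothetical and exist only for this example; real kernel code would also have to deal with TLB flushing, locking and freeing the old table, none of which is modelled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PTES 8      /* illustrative number of entries per table */

struct model_pte_table {
    unsigned long pte[MODEL_PTES];
    int frag_refcount;    /* stays 0 for early tables in this model */
};

/* "Normal" allocation path: the refcount is set up at allocation time. */
struct model_pte_table *model_table_alloc(int frags)
{
    struct model_pte_table *t = calloc(1, sizeof(*t));

    if (t != NULL)
        t->frag_refcount = frags;
    return t;
}

/*
 * Replace the table that *slot points to with a properly allocated copy.
 * Returns 0 on success, -1 if the allocation failed (slot is unchanged).
 * The early table is deliberately not freed: in this model it did not
 * come from the allocator.
 */
int model_replace_early_table(struct model_pte_table **slot, int frags)
{
    struct model_pte_table *fresh;

    assert(slot != NULL && *slot != NULL);
    fresh = model_table_alloc(frags);
    if (fresh == NULL)
        return -1;
    memcpy(fresh->pte, (*slot)->pte, sizeof(fresh->pte));
    *slot = fresh;
    return 0;
}
```

The copy keeps the mappings identical while the new table carries valid metadata, which is why this is the "bigger hammer": every early table has to be duplicated rather than patched in place.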

Thanks,
Nick



Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-20 Thread Bharata B Rao
On Mon, May 20, 2019 at 01:50:35PM +0530, Bharata B Rao wrote:
> On Mon, May 20, 2019 at 05:00:21PM +1000, Nicholas Piggin wrote:
> > Bharata B Rao's on May 20, 2019 3:56 pm:
> > > On Mon, May 20, 2019 at 02:48:35PM +1000, Nicholas Piggin wrote:
> > >> >> > git bisect points to
> > >> >> >
> > >> >> > commit 4231aba000f5a4583dd9f67057aadb68c3eca99d
> > >> >> > Author: Nicholas Piggin 
> > >> >> > Date:   Fri Jul 27 21:48:17 2018 +1000
> > >> >> >
> > >> >> > powerpc/64s: Fix page table fragment refcount race vs 
> > >> >> > speculative references
> > >> >> >
> > >> >> > The page table fragment allocator uses the main page refcount 
> > >> >> > racily
> > >> >> > with respect to speculative references. A customer observed a 
> > >> >> > BUG due
> > >> >> > to page table page refcount underflow in the fragment 
> > >> >> > allocator. This
> > >> >> > can be caused by the fragment allocator set_page_count stomping 
> > >> >> > on a
> > >> >> > speculative reference, and then the speculative failure handler
> > >> >> > decrements the new reference, and the underflow eventually pops 
> > >> >> > when
> > >> >> > the page tables are freed.
> > >> >> >
> > >> >> > Fix this by using a dedicated field in the struct page for the 
> > >> >> > page
> > >> >> > table fragment allocator.
> > >> >> >
> > >> >> > Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory wastage")
> > >> >> > Cc: sta...@vger.kernel.org # v3.10+
> > >> >> 
> > >> >> That's the commit that added the BUG_ON(), so prior to that you won't
> > >> >> see the crash.
> > >> > 
> > >> > Right, but the commit says it fixes page table page refcount underflow 
> > >> > by
> > >> > introducing a new field page->pt_frag_refcount. Now we are hitting 
> > >> > the underflow
> > >> > for this pt_frag_refcount.
> > >> 
> > >> The fixed underflow is caused by a bug (race on page count) that got 
> > >> fixed by that patch. You are hitting a different underflow here. It's
> > >> not certain my patch caused it, I'm just trying to reproduce now.
> > > 
> > > Ok.
> > 
> > Can't reproduce I'm afraid, tried adding and removing 8GB memory from a
> > 4GB guest (via host adding / removing memory device), and it just works.
> 
> Boot, add 8G, reboot, remove 8G is the sequence to reproduce.
> 
> > 
> > It's likely to be an edge case like an off by one or rounding error
> > that just happens to trigger in your config. Might be easiest if you
> > could test with a debug patch.
> 
> Sure, I will continue debugging.

When the guest is rebooted after hotplug, the entire memory (which includes
the hotplugged memory) gets remapped again freshly. However at this time
since no slab is available yet, pt_frag_refcount never gets initialized as we
never do pte_fragment_alloc() for these mappings. So we right away hit the
underflow during the first unplug itself, it looks like.
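
The failure mode described above can be modelled in a few lines of userspace C. This is a simplified sketch, not the kernel's code: model_page, model_frag_alloc(), model_frag_free() and the value of FRAGS_PER_PAGE are assumptions made for illustration, and the function returns false at the point where the kernel would hit its BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FRAGS_PER_PAGE 16   /* illustrative value, not the kernel's */

/* Stands in for the struct page that backs a page-table page. */
struct model_page {
    int pt_frag_refcount;
};

/* Normal path: the fragment allocator initialises the count. */
void model_frag_alloc(struct model_page *page)
{
    assert(page != NULL);
    page->pt_frag_refcount = FRAGS_PER_PAGE;
}

/*
 * Free one fragment. Returns false where a count of zero or less
 * would mean an underflow, i.e. where the real code would BUG.
 */
bool model_frag_free(struct model_page *page)
{
    assert(page != NULL);
    if (page->pt_frag_refcount <= 0)
        return false;
    page->pt_frag_refcount--;
    return true;
}
```

A page that went through model_frag_alloc() can be freed fragment by fragment, while a zero-initialised page, standing in for a table mapped early in boot, fails on the very first model_frag_free(), matching the "first unplug" observation.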

I will check how this can be fixed.

> 
> Regards,
> Bharata.



Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-20 Thread Bharata B Rao
On Mon, May 20, 2019 at 05:00:21PM +1000, Nicholas Piggin wrote:
> Bharata B Rao's on May 20, 2019 3:56 pm:
> > On Mon, May 20, 2019 at 02:48:35PM +1000, Nicholas Piggin wrote:
> >> >> > git bisect points to
> >> >> >
> >> >> > commit 4231aba000f5a4583dd9f67057aadb68c3eca99d
> >> >> > Author: Nicholas Piggin 
> >> >> > Date:   Fri Jul 27 21:48:17 2018 +1000
> >> >> >
> >> >> > powerpc/64s: Fix page table fragment refcount race vs speculative 
> >> >> > references
> >> >> >
> >> >> > The page table fragment allocator uses the main page refcount 
> >> >> > racily
> >> >> > with respect to speculative references. A customer observed a BUG 
> >> >> > due
> >> >> > to page table page refcount underflow in the fragment allocator. 
> >> >> > This
> >> >> > can be caused by the fragment allocator set_page_count stomping 
> >> >> > on a
> >> >> > speculative reference, and then the speculative failure handler
> >> >> > decrements the new reference, and the underflow eventually pops 
> >> >> > when
> >> >> > the page tables are freed.
> >> >> >
> >> >> > Fix this by using a dedicated field in the struct page for the 
> >> >> > page
> >> >> > table fragment allocator.
> >> >> >
> >> >> > Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory wastage")
> >> >> > Cc: sta...@vger.kernel.org # v3.10+
> >> >> 
> >> >> That's the commit that added the BUG_ON(), so prior to that you won't
> >> >> see the crash.
> >> > 
> >> > Right, but the commit says it fixes page table page refcount underflow by
> >> > introducing a new field page->pt_frag_refcount. Now we are hitting the 
> >> > underflow
> >> > for this pt_frag_refcount.
> >> 
> >> The fixed underflow is caused by a bug (race on page count) that got 
> >> fixed by that patch. You are hitting a different underflow here. It's
> >> not certain my patch caused it, I'm just trying to reproduce now.
> > 
> > Ok.
> 
> Can't reproduce I'm afraid, tried adding and removing 8GB memory from a
> 4GB guest (via host adding / removing memory device), and it just works.

Boot, add 8G, reboot, remove 8G is the sequence to reproduce.

> 
> It's likely to be an edge case like an off by one or rounding error
> that just happens to trigger in your config. Might be easiest if you
> could test with a debug patch.

Sure, I will continue debugging.

Regards,
Bharata.



Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-20 Thread Nicholas Piggin
Bharata B Rao's on May 20, 2019 3:56 pm:
> On Mon, May 20, 2019 at 02:48:35PM +1000, Nicholas Piggin wrote:
>> >> > git bisect points to
>> >> >
>> >> > commit 4231aba000f5a4583dd9f67057aadb68c3eca99d
>> >> > Author: Nicholas Piggin 
>> >> > Date:   Fri Jul 27 21:48:17 2018 +1000
>> >> >
>> >> > powerpc/64s: Fix page table fragment refcount race vs speculative 
>> >> > references
>> >> >
>> >> > The page table fragment allocator uses the main page refcount racily
>> >> > with respect to speculative references. A customer observed a BUG 
>> >> > due
>> >> > to page table page refcount underflow in the fragment allocator. 
>> >> > This
>> >> > can be caused by the fragment allocator set_page_count stomping on a
>> >> > speculative reference, and then the speculative failure handler
>> >> > decrements the new reference, and the underflow eventually pops when
>> >> > the page tables are freed.
>> >> >
>> >> > Fix this by using a dedicated field in the struct page for the page
>> >> > table fragment allocator.
>> >> >
>> >> > Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory wastage")
>> >> > Cc: sta...@vger.kernel.org # v3.10+
>> >> 
>> >> That's the commit that added the BUG_ON(), so prior to that you won't
>> >> see the crash.
>> > 
>> > Right, but the commit says it fixes page table page refcount underflow by
>> > introducing a new field page->pt_frag_refcount. Now we are hitting the 
>> > underflow
>> > for this pt_frag_refcount.
>> 
>> The fixed underflow is caused by a bug (race on page count) that got 
>> fixed by that patch. You are hitting a different underflow here. It's
>> not certain my patch caused it, I'm just trying to reproduce now.
> 
> Ok.

Can't reproduce I'm afraid, tried adding and removing 8GB memory from a
4GB guest (via host adding / removing memory device), and it just works.

It's likely to be an edge case like an off by one or rounding error
that just happens to trigger in your config. Might be easiest if you
could test with a debug patch.

Thanks,
Nick



Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-19 Thread Bharata B Rao
On Mon, May 20, 2019 at 02:48:35PM +1000, Nicholas Piggin wrote:
> >> > git bisect points to
> >> >
> >> > commit 4231aba000f5a4583dd9f67057aadb68c3eca99d
> >> > Author: Nicholas Piggin 
> >> > Date:   Fri Jul 27 21:48:17 2018 +1000
> >> >
> >> > powerpc/64s: Fix page table fragment refcount race vs speculative 
> >> > references
> >> >
> >> > The page table fragment allocator uses the main page refcount racily
> >> > with respect to speculative references. A customer observed a BUG due
> >> > to page table page refcount underflow in the fragment allocator. This
> >> > can be caused by the fragment allocator set_page_count stomping on a
> >> > speculative reference, and then the speculative failure handler
> >> > decrements the new reference, and the underflow eventually pops when
> >> > the page tables are freed.
> >> >
> >> > Fix this by using a dedicated field in the struct page for the page
> >> > table fragment allocator.
> >> >
> >> > Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory wastage")
> >> > Cc: sta...@vger.kernel.org # v3.10+
> >> 
> >> That's the commit that added the BUG_ON(), so prior to that you won't
> >> see the crash.
> > 
> > Right, but the commit says it fixes page table page refcount underflow by
> > introducing a new field page->pt_frag_refcount. Now we are hitting the 
> > underflow
> > for this pt_frag_refcount.
> 
> The fixed underflow is caused by a bug (race on page count) that got 
> fixed by that patch. You are hitting a different underflow here. It's
> not certain my patch caused it, I'm just trying to reproduce now.

Ok.

> 
> > 
> > BTW, if I go below this commit, I don't hit the pagecount
> > 
> > VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
> > 
> > which is in pte_fragment_free() path.
> 
> Do you have CONFIG_DEBUG_VM=y?

Yes.

Regards,
Bharata.



Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-19 Thread Nicholas Piggin
Bharata B Rao's on May 20, 2019 2:25 pm:
> On Mon, May 20, 2019 at 12:02:23PM +1000, Michael Ellerman wrote:
>> Bharata B Rao  writes:
>> > On Thu, May 16, 2019 at 07:44:20PM +0530, srikanth wrote:
>> >> Hello,
>> >> 
>> >> On power9 host, performing memory hotunplug from ppc64le guest results in
>> >> kernel oops.
>> >> 
>> >> Kernel used : https://github.com/torvalds/linux/tree/v5.1 built using
>> >> ppc64le_defconfig for host and ppc64le_guest_defconfig for guest.
>> >> 
>> >> Recreation steps:
>> >> 
>> >> 1. Boot a guest with below mem configuration:
>> >>   33554432
>> >>   8388608
>> >>   4194304
>> >>   
>> >>     
>> >>   
>> >>     
>> >>   
>> >> 
>> >> 2. From host hotplug 8G memory -> verify memory hotadded successfully -> 
>> >> now
>> >> reboot guest -> once guest comes back try to unplug 8G memory
>> >> 
>> >> mem.xml used:
>> >> 
>> >> 
>> >> 8
>> >> 0
>> >> 
>> >> 
>> >> 
>> >> Memory attach and detach commands used:
>> >>     virsh attach-device vm1 ./mem.xml --live
>> >>     virsh detach-device vm1 ./mem.xml --live
>> >> 
>> >> Trace seen inside guest after unplug, guest just hangs there forever:
>> >> 
>> >> [   21.962986] kernel BUG at arch/powerpc/mm/pgtable-frag.c:113!
>> >> [   21.963064] Oops: Exception in kernel mode, sig: 5 [#1]
>> >> [   21.963090] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA
>> >> pSeries
>> >> [   21.963131] Modules linked in: xt_tcpudp iptable_filter squashfs fuse
>> >> vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core libiscsi 
>> >> scsi_transport_iscsi
>> >> ip_tables x_tables autofs4 btrfs zstd_decompress zstd_compress 
>> >> lzo_compress
>> >> raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
>> >> xor raid6_pq multipath crc32c_vpmsum
>> >> [   21.963281] CPU: 11 PID: 316 Comm: kworker/u64:5 Kdump: loaded Not
>> >> tainted 5.1.0-dirty #2
>> >> [   21.963323] Workqueue: pseries hotplug workque pseries_hp_work_fn
>> >> [   21.963355] NIP:  c0079e18 LR: c0c79308 CTR:
>> >> 8000
>> >> [   21.963392] REGS: c003f88034f0 TRAP: 0700   Not tainted 
>> >> (5.1.0-dirty)
>> >> [   21.963422] MSR:  8282b033   
>> >> CR:
>> >> 28002884  XER: 2004
>> >> [   21.963470] CFAR: c0c79304 IRQMASK: 0
>> >> [   21.963470] GPR00: c0c79308 c003f8803780 c1521000
>> >> 00fff8c0
>> >> [   21.963470] GPR04: 0001 ffe30005 0005
>> >> 0020
>> >> [   21.963470] GPR08:  0001 c00a00fff8e0
>> >> c16d21a0
>> >> [   21.963470] GPR12: c16e7b90 c7ff2700 c00a00a0
>> >> c003ffe30100
>> >> [   21.963470] GPR16: c003ffe3 c14aa4de c00a009f
>> >> c16d21b0
>> >> [   21.963470] GPR20: c14de588 0001 c16d21b8
>> >> c00a00a0
>> >> [   21.963470] GPR24:   c00a00a0
>> >> c003ffe96000
>> >> [   21.963470] GPR28: c00a00a0 c00a00a0 c003fffec000
>> >> c00a00fff8c0
>> >> [   21.963802] NIP [c0079e18] pte_fragment_free+0x48/0xd0
>> >> [   21.963838] LR [c0c79308] remove_pagetable+0x49c/0x5b4
>> >> [   21.963873] Call Trace:
>> >> [   21.963890] [c003f8803780] [c003ffe997f0] 0xc003ffe997f0
>> >> (unreliable)
>> >> [   21.963933] [c003f88037b0] [] (null)
>> >> [   21.963969] [c003f88038c0] [c006f038]
>> >> vmemmap_free+0x218/0x2e0
>> >> [   21.964006] [c003f8803940] [c036f100]
>> >> sparse_remove_one_section+0xd0/0x138
>> >> [   21.964050] [c003f8803980] [c0383a50]
>> >> __remove_pages+0x410/0x560
>> >> [   21.964093] [c003f8803a90] [c0c784d8]
>> >> arch_remove_memory+0x68/0xdc
>> >> [   21.964136] [c003f8803ad0] [c0385d74]
>> >&

Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-19 Thread Bharata B Rao
On Mon, May 20, 2019 at 12:02:23PM +1000, Michael Ellerman wrote:
> Bharata B Rao  writes:
> > On Thu, May 16, 2019 at 07:44:20PM +0530, srikanth wrote:
> >> Hello,
> >> 
> >> On power9 host, performing memory hotunplug from ppc64le guest results in
> >> kernel oops.
> >> 
> >> Kernel used : https://github.com/torvalds/linux/tree/v5.1 built using
> >> ppc64le_defconfig for host and ppc64le_guest_defconfig for guest.
> >> 
> >> Recreation steps:
> >> 
> >> 1. Boot a guest with below mem configuration:
> >>   33554432
> >>   8388608
> >>   4194304
> >>   
> >>     
> >>   
> >>     
> >>   
> >> 
> >> 2. From host hotplug 8G memory -> verify memory hotadded successfully -> now
> >> reboot guest -> once guest comes back try to unplug 8G memory
> >> 
> >> mem.xml used:
> >> 
> >> 
> >> 8
> >> 0
> >> 
> >> 
> >> 
> >> Memory attach and detach commands used:
> >>     virsh attach-device vm1 ./mem.xml --live
> >>     virsh detach-device vm1 ./mem.xml --live
> >> 
> >> Trace seen inside guest after unplug, guest just hangs there forever:
> >> 
> >> [   21.962986] kernel BUG at arch/powerpc/mm/pgtable-frag.c:113!
> >> [   21.963064] Oops: Exception in kernel mode, sig: 5 [#1]
> >> [   21.963090] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA
> >> pSeries
> >> [   21.963131] Modules linked in: xt_tcpudp iptable_filter squashfs fuse
> >> vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core libiscsi 
> >> scsi_transport_iscsi
> >> ip_tables x_tables autofs4 btrfs zstd_decompress zstd_compress lzo_compress
> >> raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
> >> xor raid6_pq multipath crc32c_vpmsum
> >> [   21.963281] CPU: 11 PID: 316 Comm: kworker/u64:5 Kdump: loaded Not
> >> tainted 5.1.0-dirty #2
> >> [   21.963323] Workqueue: pseries hotplug workque pseries_hp_work_fn
> >> [   21.963355] NIP:  c0079e18 LR: c0c79308 CTR:
> >> 8000
> >> [   21.963392] REGS: c003f88034f0 TRAP: 0700   Not tainted 
> >> (5.1.0-dirty)
> >> [   21.963422] MSR:  8282b033   
> >> CR:
> >> 28002884  XER: 2004
> >> [   21.963470] CFAR: c0c79304 IRQMASK: 0
> >> [   21.963470] GPR00: c0c79308 c003f8803780 c1521000
> >> 00fff8c0
> >> [   21.963470] GPR04: 0001 ffe30005 0005
> >> 0020
> >> [   21.963470] GPR08:  0001 c00a00fff8e0
> >> c16d21a0
> >> [   21.963470] GPR12: c16e7b90 c7ff2700 c00a00a0
> >> c003ffe30100
> >> [   21.963470] GPR16: c003ffe3 c14aa4de c00a009f
> >> c16d21b0
> >> [   21.963470] GPR20: c14de588 0001 c16d21b8
> >> c00a00a0
> >> [   21.963470] GPR24:   c00a00a0
> >> c003ffe96000
> >> [   21.963470] GPR28: c00a00a0 c00a00a0 c003fffec000
> >> c00a00fff8c0
> >> [   21.963802] NIP [c0079e18] pte_fragment_free+0x48/0xd0
> >> [   21.963838] LR [c0c79308] remove_pagetable+0x49c/0x5b4
> >> [   21.963873] Call Trace:
> >> [   21.963890] [c003f8803780] [c003ffe997f0] 0xc003ffe997f0
> >> (unreliable)
> >> [   21.963933] [c003f88037b0] [] (null)
> >> [   21.963969] [c003f88038c0] [c006f038]
> >> vmemmap_free+0x218/0x2e0
> >> [   21.964006] [c003f8803940] [c036f100]
> >> sparse_remove_one_section+0xd0/0x138
> >> [   21.964050] [c003f8803980] [c0383a50]
> >> __remove_pages+0x410/0x560
> >> [   21.964093] [c003f8803a90] [c0c784d8]
> >> arch_remove_memory+0x68/0xdc
> >> [   21.964136] [c003f8803ad0] [c0385d74]
> >> __remove_memory+0xc4/0x110
> >> [   21.964180] [c003f8803b10] [c00d44e4]
> >> dlpar_remove_lmb+0x94/0x140
> >> [   21.964223] [c003f8803b50] [c00d52b4]
> >> dlpar_memory+0x464/0xd00
> >> [   21.964259] [c003f8803be0] [c00cd5c0]
> >> handle_dlpar_errorlog+0xc0/0x190
> >> [   21.964303] [c003f8803c50] [c00cd6bc]
> >

Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-19 Thread Michael Ellerman
Bharata B Rao  writes:
> On Thu, May 16, 2019 at 07:44:20PM +0530, srikanth wrote:
>> Hello,
>> 
>> On power9 host, performing memory hotunplug from ppc64le guest results in
>> kernel oops.
>> 
>> Kernel used : https://github.com/torvalds/linux/tree/v5.1 built using
>> ppc64le_defconfig for host and ppc64le_guest_defconfig for guest.
>> 
>> Recreation steps:
>> 
>> 1. Boot a guest with below mem configuration:
>>   33554432
>>   8388608
>>   4194304
>>   
>>     
>>   
>>     
>>   
>> 
>> 2. From host hotplug 8G memory -> verify memory hotadded successfully -> now
>> reboot guest -> once guest comes back try to unplug 8G memory
>> 
>> mem.xml used:
>> 
>> 
>> 8
>> 0
>> 
>> 
>> 
>> Memory attach and detach commands used:
>>     virsh attach-device vm1 ./mem.xml --live
>>     virsh detach-device vm1 ./mem.xml --live
>> 
>> Trace seen inside guest after unplug, guest just hangs there forever:
>> 
>> [   21.962986] kernel BUG at arch/powerpc/mm/pgtable-frag.c:113!
>> [   21.963064] Oops: Exception in kernel mode, sig: 5 [#1]
>> [   21.963090] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA
>> pSeries
>> [   21.963131] Modules linked in: xt_tcpudp iptable_filter squashfs fuse
>> vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core libiscsi scsi_transport_iscsi
>> ip_tables x_tables autofs4 btrfs zstd_decompress zstd_compress lzo_compress
>> raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
>> xor raid6_pq multipath crc32c_vpmsum
>> [   21.963281] CPU: 11 PID: 316 Comm: kworker/u64:5 Kdump: loaded Not
>> tainted 5.1.0-dirty #2
>> [   21.963323] Workqueue: pseries hotplug workque pseries_hp_work_fn
>> [   21.963355] NIP:  c0079e18 LR: c0c79308 CTR:
>> 8000
>> [   21.963392] REGS: c003f88034f0 TRAP: 0700   Not tainted (5.1.0-dirty)
>> [   21.963422] MSR:  8282b033   CR:
>> 28002884  XER: 2004
>> [   21.963470] CFAR: c0c79304 IRQMASK: 0
>> [   21.963470] GPR00: c0c79308 c003f8803780 c1521000
>> 00fff8c0
>> [   21.963470] GPR04: 0001 ffe30005 0005
>> 0020
>> [   21.963470] GPR08:  0001 c00a00fff8e0
>> c16d21a0
>> [   21.963470] GPR12: c16e7b90 c7ff2700 c00a00a0
>> c003ffe30100
>> [   21.963470] GPR16: c003ffe3 c14aa4de c00a009f
>> c16d21b0
>> [   21.963470] GPR20: c14de588 0001 c16d21b8
>> c00a00a0
>> [   21.963470] GPR24:   c00a00a0
>> c003ffe96000
>> [   21.963470] GPR28: c00a00a0 c00a00a0 c003fffec000
>> c00a00fff8c0
>> [   21.963802] NIP [c0079e18] pte_fragment_free+0x48/0xd0
>> [   21.963838] LR [c0c79308] remove_pagetable+0x49c/0x5b4
>> [   21.963873] Call Trace:
>> [   21.963890] [c003f8803780] [c003ffe997f0] 0xc003ffe997f0
>> (unreliable)
>> [   21.963933] [c003f88037b0] [] (null)
>> [   21.963969] [c003f88038c0] [c006f038]
>> vmemmap_free+0x218/0x2e0
>> [   21.964006] [c003f8803940] [c036f100]
>> sparse_remove_one_section+0xd0/0x138
>> [   21.964050] [c003f8803980] [c0383a50]
>> __remove_pages+0x410/0x560
>> [   21.964093] [c003f8803a90] [c0c784d8]
>> arch_remove_memory+0x68/0xdc
>> [   21.964136] [c003f8803ad0] [c0385d74]
>> __remove_memory+0xc4/0x110
>> [   21.964180] [c003f8803b10] [c00d44e4]
>> dlpar_remove_lmb+0x94/0x140
>> [   21.964223] [c003f8803b50] [c00d52b4]
>> dlpar_memory+0x464/0xd00
>> [   21.964259] [c003f8803be0] [c00cd5c0]
>> handle_dlpar_errorlog+0xc0/0x190
>> [   21.964303] [c003f8803c50] [c00cd6bc]
>> pseries_hp_work_fn+0x2c/0x60
>> [   21.964346] [c003f8803c80] [c013a4a0]
>> process_one_work+0x2b0/0x5a0
>> [   21.964388] [c003f8803d10] [c013a818]
>> worker_thread+0x88/0x610
>> [   21.964434] [c003f8803db0] [c0143884] kthread+0x1a4/0x1b0
>> [   21.964468] [c003f8803e20] [c000bdc4]
>> ret_from_kernel_thread+0x5c/0x78
>> [   21.964506] Instruction dump:
>> [   21.964527] fbe1fff8 f821ffd1 78638502 78633664 ebe9 7fff1a14
>> 395f0020 813f0020
>> [   21.964569] 7d2907b4 7d2900d0 79290fe0 69290001 &

Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-18 Thread Bharata B Rao
On Thu, May 16, 2019 at 07:44:20PM +0530, srikanth wrote:
> Hello,
> 
> On power9 host, performing memory hotunplug from ppc64le guest results in
> kernel oops.
> 
> Kernel used : https://github.com/torvalds/linux/tree/v5.1 built using
> ppc64le_defconfig for host and ppc64le_guest_defconfig for guest.
> 
> Recreation steps:
> 
> 1. Boot a guest with below mem configuration:
>   33554432
>   8388608
>   4194304
>   
>     
>   
>     
>   
> 
> 2. From host hotplug 8G memory -> verify memory hotadded successfully -> now
> reboot guest -> once guest comes back try to unplug 8G memory
> 
> mem.xml used:
> 
> 
> 8
> 0
> 
> 
> 
> Memory attach and detach commands used:
>     virsh attach-device vm1 ./mem.xml --live
>     virsh detach-device vm1 ./mem.xml --live
> 
> Trace seen inside guest after unplug, guest just hangs there forever:
> 
> [   21.962986] kernel BUG at arch/powerpc/mm/pgtable-frag.c:113!
> [   21.963064] Oops: Exception in kernel mode, sig: 5 [#1]
> [   21.963090] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA
> pSeries
> [   21.963131] Modules linked in: xt_tcpudp iptable_filter squashfs fuse
> vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core libiscsi scsi_transport_iscsi
> ip_tables x_tables autofs4 btrfs zstd_decompress zstd_compress lzo_compress
> raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
> xor raid6_pq multipath crc32c_vpmsum
> [   21.963281] CPU: 11 PID: 316 Comm: kworker/u64:5 Kdump: loaded Not
> tainted 5.1.0-dirty #2
> [   21.963323] Workqueue: pseries hotplug workque pseries_hp_work_fn
> [   21.963355] NIP:  c0079e18 LR: c0c79308 CTR:
> 8000
> [   21.963392] REGS: c003f88034f0 TRAP: 0700   Not tainted (5.1.0-dirty)
> [   21.963422] MSR:  8282b033   CR:
> 28002884  XER: 2004
> [   21.963470] CFAR: c0c79304 IRQMASK: 0
> [   21.963470] GPR00: c0c79308 c003f8803780 c1521000
> 00fff8c0
> [   21.963470] GPR04: 0001 ffe30005 0005
> 0020
> [   21.963470] GPR08:  0001 c00a00fff8e0
> c16d21a0
> [   21.963470] GPR12: c16e7b90 c7ff2700 c00a00a0
> c003ffe30100
> [   21.963470] GPR16: c003ffe3 c14aa4de c00a009f
> c16d21b0
> [   21.963470] GPR20: c14de588 0001 c16d21b8
> c00a00a0
> [   21.963470] GPR24:   c00a00a0
> c003ffe96000
> [   21.963470] GPR28: c00a00a0 c00a00a0 c003fffec000
> c00a00fff8c0
> [   21.963802] NIP [c0079e18] pte_fragment_free+0x48/0xd0
> [   21.963838] LR [c0c79308] remove_pagetable+0x49c/0x5b4
> [   21.963873] Call Trace:
> [   21.963890] [c003f8803780] [c003ffe997f0] 0xc003ffe997f0
> (unreliable)
> [   21.963933] [c003f88037b0] [] (null)
> [   21.963969] [c003f88038c0] [c006f038]
> vmemmap_free+0x218/0x2e0
> [   21.964006] [c003f8803940] [c036f100]
> sparse_remove_one_section+0xd0/0x138
> [   21.964050] [c003f8803980] [c0383a50]
> __remove_pages+0x410/0x560
> [   21.964093] [c003f8803a90] [c0c784d8]
> arch_remove_memory+0x68/0xdc
> [   21.964136] [c003f8803ad0] [c0385d74]
> __remove_memory+0xc4/0x110
> [   21.964180] [c003f8803b10] [c00d44e4]
> dlpar_remove_lmb+0x94/0x140
> [   21.964223] [c003f8803b50] [c00d52b4]
> dlpar_memory+0x464/0xd00
> [   21.964259] [c003f8803be0] [c00cd5c0]
> handle_dlpar_errorlog+0xc0/0x190
> [   21.964303] [c003f8803c50] [c00cd6bc]
> pseries_hp_work_fn+0x2c/0x60
> [   21.964346] [c003f8803c80] [c013a4a0]
> process_one_work+0x2b0/0x5a0
> [   21.964388] [c003f8803d10] [c013a818]
> worker_thread+0x88/0x610
> [   21.964434] [c003f8803db0] [c0143884] kthread+0x1a4/0x1b0
> [   21.964468] [c003f8803e20] [c000bdc4]
> ret_from_kernel_thread+0x5c/0x78
> [   21.964506] Instruction dump:
> [   21.964527] fbe1fff8 f821ffd1 78638502 78633664 ebe9 7fff1a14
> 395f0020 813f0020
> [   21.964569] 7d2907b4 7d2900d0 79290fe0 69290001 <0b09> 7c0004ac
> 7d205028 3129
> [   21.964613] ---[ end trace aaa571aa1636fee6 ]---
> [   21.966349]
> [   21.966383] Sending IPI to other CPUs
> [   21.978335] IPI complete
> [   21.981354] kexec: Starting switchover sequence.
> I'm in purgatory

git bisect points to

commit 4231aba000f5a4583dd9f67057aadb68c3eca99d
Author: Nicholas Piggin 
Date:   Fri Jul 27 21:48:17 2018 +1000

powerpc/64s:

Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-17 Thread Michael Ellerman
srikanth  writes:
> Hello,
>
> On power9 host, performing memory hotunplug from ppc64le guest results 
> in kernel oops.

Thanks for the report.

Did this used to work in the past? If so what is the last version that
worked?

> Kernel used : https://github.com/torvalds/linux/tree/v5.1 built using 
> ppc64le_defconfig for host and ppc64le_guest_defconfig for guest.
>
> Recreation steps:
>
> 1. Boot a guest with below mem configuration:
>    33554432
>    8388608
>    4194304
>    
>      
>    
>      
>    
>
> 2. From host hotplug 8G memory -> verify memory hotadded successfully -> 
> now reboot guest -> once guest comes back try to unplug 8G memory

I assume the reboot is required to trigger the bug? ie. if you unplug
without rebooting it doesn't crash?

> mem.xml used:
> 
> 
> 8
> 0
> 
> 
>
> Memory attach and detach commands used:
>      virsh attach-device vm1 ./mem.xml --live
>      virsh detach-device vm1 ./mem.xml --live
>
> Trace seen inside guest after unplug, guest just hangs there forever:
>
> [   21.962986] kernel BUG at arch/powerpc/mm/pgtable-frag.c:113!
> [   21.963064] Oops: Exception in kernel mode, sig: 5 [#1]
> [   21.963090] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA 
> pSeries
> [   21.963131] Modules linked in: xt_tcpudp iptable_filter squashfs fuse 
> vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core libiscsi 
> scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_decompress 
> zstd_compress lzo_compress raid10 raid456 async_raid6_recov async_memcpy 
> async_pq async_xor async_tx xor raid6_pq multipath crc32c_vpmsum
> [   21.963281] CPU: 11 PID: 316 Comm: kworker/u64:5 Kdump: loaded Not 
> tainted 5.1.0-dirty #2
> [   21.963323] Workqueue: pseries hotplug workque pseries_hp_work_fn
> [   21.963355] NIP:  c0079e18 LR: c0c79308 CTR: 
> 8000
> [   21.963392] REGS: c003f88034f0 TRAP: 0700   Not tainted (5.1.0-dirty)
> [   21.963422] MSR:  8282b033   
> CR: 28002884  XER: 2004
> [   21.963470] CFAR: c0c79304 IRQMASK: 0
> [   21.963470] GPR00: c0c79308 c003f8803780 c1521000 
> 00fff8c0

Can you try not to word wrap these? It makes them much harder to read.

There are some instructions here on configuring Thunderbird:
  https://www.kernel.org/doc/html/latest/process/email-clients.html#thunderbird-gui

> [   21.963470] GPR04: 0001 ffe30005 0005 
> 0020
> [   21.963470] GPR08:  0001 c00a00fff8e0 
> c16d21a0
> [   21.963470] GPR12: c16e7b90 c7ff2700 c00a00a0 
> c003ffe30100
> [   21.963470] GPR16: c003ffe3 c14aa4de c00a009f 
> c16d21b0
> [   21.963470] GPR20: c14de588 0001 c16d21b8 
> c00a00a0
> [   21.963470] GPR24:   c00a00a0 
> c003ffe96000
> [   21.963470] GPR28: c00a00a0 c00a00a0 c003fffec000 
> c00a00fff8c0
> [   21.963802] NIP [c0079e18] pte_fragment_free+0x48/0xd0
> [   21.963838] LR [c0c79308] remove_pagetable+0x49c/0x5b4
> [   21.963873] Call Trace:
> [   21.963890] [c003f8803780] [c003ffe997f0] 0xc003ffe997f0 
> (unreliable)
> [   21.963933] [c003f88037b0] [] (null)
> [   21.963969] [c003f88038c0] [c006f038] 
> vmemmap_free+0x218/0x2e0
> [   21.964006] [c003f8803940] [c036f100] 
> sparse_remove_one_section+0xd0/0x138
> [   21.964050] [c003f8803980] [c0383a50] 
> __remove_pages+0x410/0x560
> [   21.964093] [c003f8803a90] [c0c784d8] 
> arch_remove_memory+0x68/0xdc
> [   21.964136] [c003f8803ad0] [c0385d74] 
> __remove_memory+0xc4/0x110
> [   21.964180] [c003f8803b10] [c00d44e4] 
> dlpar_remove_lmb+0x94/0x140
> [   21.964223] [c003f8803b50] [c00d52b4] 
> dlpar_memory+0x464/0xd00
> [   21.964259] [c003f8803be0] [c00cd5c0] 
> handle_dlpar_errorlog+0xc0/0x190
> [   21.964303] [c003f8803c50] [c00cd6bc] 
> pseries_hp_work_fn+0x2c/0x60
> [   21.964346] [c003f8803c80] [c013a4a0] 
> process_one_work+0x2b0/0x5a0
> [   21.964388] [c003f8803d10] [c013a818] 
> worker_thread+0x88/0x610
> [   21.964434] [c003f8803db0] [c0143884] kthread+0x1a4/0x1b0
> [   21.964468] [c003f8803e20] [c000bdc4] 
> ret_from_kernel_thread+0x5c/0x78
> [   21.964506] Instruction dump:
> [   21.964527] fbe1fff8 f821ffd1 78638502 78633664 ebe9 7fff1a14 
> 395f0020 813f0020
> [   21.964569] 7d2907b4 7d2900d0 79290fe0 69290001 <0b09> 7c0004ac 
> 7d20502

PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest

2019-05-16 Thread srikanth

Hello,

On power9 host, performing memory hotunplug from ppc64le guest results 
in kernel oops.


Kernel used : https://github.com/torvalds/linux/tree/v5.1 built using 
ppc64le_defconfig for host and ppc64le_guest_defconfig for guest.


Recreation steps:

1. Boot a guest with below mem configuration:
  33554432
  8388608
  4194304
  
    
  
    
  

2. From host hotplug 8G memory -> verify memory hotadded successfully -> 
now reboot guest -> once guest comes back try to unplug 8G memory


mem.xml used:


8
0



Memory attach and detach commands used:
    virsh attach-device vm1 ./mem.xml --live
    virsh detach-device vm1 ./mem.xml --live

Trace seen inside guest after unplug, guest just hangs there forever:

[   21.962986] kernel BUG at arch/powerpc/mm/pgtable-frag.c:113!
[   21.963064] Oops: Exception in kernel mode, sig: 5 [#1]
[   21.963090] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA 
pSeries
[   21.963131] Modules linked in: xt_tcpudp iptable_filter squashfs fuse 
vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core libiscsi 
scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_decompress 
zstd_compress lzo_compress raid10 raid456 async_raid6_recov async_memcpy 
async_pq async_xor async_tx xor raid6_pq multipath crc32c_vpmsum
[   21.963281] CPU: 11 PID: 316 Comm: kworker/u64:5 Kdump: loaded Not 
tainted 5.1.0-dirty #2

[   21.963323] Workqueue: pseries hotplug workque pseries_hp_work_fn
[   21.963355] NIP:  c0079e18 LR: c0c79308 CTR: 
8000

[   21.963392] REGS: c003f88034f0 TRAP: 0700   Not tainted (5.1.0-dirty)
[   21.963422] MSR:  8282b033   
CR: 28002884  XER: 2004

[   21.963470] CFAR: c0c79304 IRQMASK: 0
[   21.963470] GPR00: c0c79308 c003f8803780 c1521000 
00fff8c0
[   21.963470] GPR04: 0001 ffe30005 0005 
0020
[   21.963470] GPR08:  0001 c00a00fff8e0 
c16d21a0
[   21.963470] GPR12: c16e7b90 c7ff2700 c00a00a0 
c003ffe30100
[   21.963470] GPR16: c003ffe3 c14aa4de c00a009f 
c16d21b0
[   21.963470] GPR20: c14de588 0001 c16d21b8 
c00a00a0
[   21.963470] GPR24:   c00a00a0 
c003ffe96000
[   21.963470] GPR28: c00a00a0 c00a00a0 c003fffec000 
c00a00fff8c0

[   21.963802] NIP [c0079e18] pte_fragment_free+0x48/0xd0
[   21.963838] LR [c0c79308] remove_pagetable+0x49c/0x5b4
[   21.963873] Call Trace:
[   21.963890] [c003f8803780] [c003ffe997f0] 0xc003ffe997f0 
(unreliable)

[   21.963933] [c003f88037b0] [] (null)
[   21.963969] [c003f88038c0] [c006f038] 
vmemmap_free+0x218/0x2e0
[   21.964006] [c003f8803940] [c036f100] 
sparse_remove_one_section+0xd0/0x138
[   21.964050] [c003f8803980] [c0383a50] 
__remove_pages+0x410/0x560
[   21.964093] [c003f8803a90] [c0c784d8] 
arch_remove_memory+0x68/0xdc
[   21.964136] [c003f8803ad0] [c0385d74] 
__remove_memory+0xc4/0x110
[   21.964180] [c003f8803b10] [c00d44e4] 
dlpar_remove_lmb+0x94/0x140
[   21.964223] [c003f8803b50] [c00d52b4] 
dlpar_memory+0x464/0xd00
[   21.964259] [c003f8803be0] [c00cd5c0] 
handle_dlpar_errorlog+0xc0/0x190
[   21.964303] [c003f8803c50] [c00cd6bc] 
pseries_hp_work_fn+0x2c/0x60
[   21.964346] [c003f8803c80] [c013a4a0] 
process_one_work+0x2b0/0x5a0
[   21.964388] [c003f8803d10] [c013a818] 
worker_thread+0x88/0x610

[   21.964434] [c003f8803db0] [c0143884] kthread+0x1a4/0x1b0
[   21.964468] [c003f8803e20] [c000bdc4] 
ret_from_kernel_thread+0x5c/0x78

[   21.964506] Instruction dump:
[   21.964527] fbe1fff8 f821ffd1 78638502 78633664 ebe9 7fff1a14 
395f0020 813f0020
[   21.964569] 7d2907b4 7d2900d0 79290fe0 69290001 <0b09> 7c0004ac 
7d205028 3129

[   21.964613] ---[ end trace aaa571aa1636fee6 ]---
[   21.966349]
[   21.966383] Sending IPI to other CPUs
[   21.978335] IPI complete
[   21.981354] kexec: Starting switchover sequence.
I'm in purgatory



[PATCH 3.16 014/202] ASoC: tlv320aic32x4: Kernel OOPS while entering DAPM standby mode

2019-04-27 Thread Ben Hutchings
3.16.66-rc1 review patch.  If anyone has any objections, please let me know.

--

From: b-ak 

commit 667e9334fa64da2273e36ce131b05ac9e47c5769 upstream.

During the bootup of the kernel, the DAPM bias level is in the OFF
state. As soon as the DAPM framework kicks in it pushes the codec
into STANDBY state.

The probe function doesn't prepare the clock, and STANDBY state
does a clk_disable_unprepare() without checking the previous state.
This leads to an OOPS.

Not transitioning from an OFF state to the STANDBY state fixes the
problem.

Signed-off-by: b-ak 
Signed-off-by: Mark Brown 
[bwh: Backported to 3.16:
 - Open-code snd_soc_component_get_bias_level()
 - Adjust context]
Signed-off-by: Ben Hutchings 
---
 sound/soc/codecs/tlv320aic32x4.c | 4 
 1 file changed, 4 insertions(+)

--- a/sound/soc/codecs/tlv320aic32x4.c
+++ b/sound/soc/codecs/tlv320aic32x4.c
@@ -534,6 +534,10 @@ static int aic32x4_set_bias_level(struct
case SND_SOC_BIAS_PREPARE:
break;
case SND_SOC_BIAS_STANDBY:
+   /* Initial cold start */
+   if (codec->dapm.bias_level == SND_SOC_BIAS_OFF)
+   break;
+
/* Switch off BCLK_N Divider */
snd_soc_update_bits(codec, AIC32X4_BCLKN,
AIC32X4_BCLKEN, 0);
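
As a rough illustration of the guard this patch adds, here is a minimal userspace sketch. It is not the driver code: every name in it (`model_codec`, `model_clk`, `model_set_bias_level` and so on) is hypothetical, and the prepare counter is only an assumed stand-in for the clock framework's own bookkeeping. It shows why skipping the clock teardown on an OFF to STANDBY transition keeps the enable and disable calls balanced.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the bias-level handling; not kernel code. */
enum bias_level { BIAS_OFF, BIAS_STANDBY, BIAS_PREPARE, BIAS_ON };

struct model_clk {
	int prepare_count;	/* assumed stand-in for the clk prepare refcount */
};

struct model_codec {
	enum bias_level level;	/* level the codec is currently in */
	struct model_clk mclk;
};

static void model_clk_prepare_enable(struct model_clk *clk)
{
	clk->prepare_count++;
}

/* Returns false on an unbalanced call, the case the commit message describes. */
static bool model_clk_disable_unprepare(struct model_clk *clk)
{
	if (clk->prepare_count == 0)
		return false;
	clk->prepare_count--;
	return true;
}

static bool model_set_bias_level(struct model_codec *codec,
				 enum bias_level level)
{
	bool ok = true;

	switch (level) {
	case BIAS_ON:
		model_clk_prepare_enable(&codec->mclk);
		break;
	case BIAS_PREPARE:
		break;
	case BIAS_STANDBY:
		/* Initial cold start: the clock was never prepared, nothing to undo */
		if (codec->level == BIAS_OFF)
			break;
		ok = model_clk_disable_unprepare(&codec->mclk);
		break;
	case BIAS_OFF:
		break;
	}

	codec->level = level;
	return ok;
}
```

In this model, walking OFF, STANDBY, PREPARE, ON, PREPARE, STANDBY leaves the counter at zero and never reports an unbalanced disable. Removing the `BIAS_OFF` check makes the very first transition fail.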



[PATCH 4.19 053/110] drm/cirrus: Use drm_framebuffer_put to avoid kernel oops in clean-up

2019-04-18 Thread Greg Kroah-Hartman
[ Upstream commit abf7b30d7f61d981bfcca65d1e8331b27021b475 ]

In the Cirrus driver, the regular clean-up code also performs the clean-up
of a failed initialization. If the fbdev's framebuffer was not initialized,
the clean-up will fail within drm_framebuffer_unregister_private. Booting
with cirrus.bpp=16 triggers this bug.

The framebuffer is currently stored directly within struct cirrus_fbdev. To
fix the bug, we turn it into a pointer that is only set for initialized
framebuffers. The fbdev's clean-up code skips uninitialized framebuffers.

The memory for struct drm_framebuffer is allocated dynamically. This requires
additional error handling within cirrusfb_create. The framebuffer clean-up is
now performed by drm_framebuffer_put, which also frees the data structure's
memory.

Link: https://bugzilla.suse.com/show_bug.cgi?id=1101822
Signed-off-by: Thomas Zimmermann 
Link: http://patchwork.freedesktop.org/patch/msgid/20180720112743.27159-1-tzimmerm...@suse.de
Signed-off-by: Gerd Hoffmann 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/cirrus/cirrus_drv.h   |  2 +-
 drivers/gpu/drm/cirrus/cirrus_fbdev.c | 48 +++
 drivers/gpu/drm/cirrus/cirrus_mode.c  |  2 +-
 3 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/cirrus/cirrus_drv.h b/drivers/gpu/drm/cirrus/cirrus_drv.h
index ce9db7aab225..a29f87e98d9d 100644
--- a/drivers/gpu/drm/cirrus/cirrus_drv.h
+++ b/drivers/gpu/drm/cirrus/cirrus_drv.h
@@ -146,7 +146,7 @@ struct cirrus_device {
 
 struct cirrus_fbdev {
struct drm_fb_helper helper;
-   struct drm_framebuffer gfb;
+   struct drm_framebuffer *gfb;
void *sysram;
int size;
int x1, y1, x2, y2; /* dirty rect */
diff --git a/drivers/gpu/drm/cirrus/cirrus_fbdev.c b/drivers/gpu/drm/cirrus/cirrus_fbdev.c
index b643ac92801c..82cc82e0bd80 100644
--- a/drivers/gpu/drm/cirrus/cirrus_fbdev.c
+++ b/drivers/gpu/drm/cirrus/cirrus_fbdev.c
@@ -22,14 +22,14 @@ static void cirrus_dirty_update(struct cirrus_fbdev *afbdev,
struct drm_gem_object *obj;
struct cirrus_bo *bo;
int src_offset, dst_offset;
-   int bpp = afbdev->gfb.format->cpp[0];
+   int bpp = afbdev->gfb->format->cpp[0];
int ret = -EBUSY;
bool unmap = false;
bool store_for_later = false;
int x2, y2;
unsigned long flags;
 
-   obj = afbdev->gfb.obj[0];
+   obj = afbdev->gfb->obj[0];
bo = gem_to_cirrus_bo(obj);
 
/*
@@ -82,7 +82,7 @@ static void cirrus_dirty_update(struct cirrus_fbdev *afbdev,
}
for (i = y; i < y + height; i++) {
/* assume equal stride for now */
-   src_offset = dst_offset = i * afbdev->gfb.pitches[0] + (x * bpp);
+   src_offset = dst_offset = i * afbdev->gfb->pitches[0] + (x * bpp);
memcpy_toio(bo->kmap.virtual + src_offset, afbdev->sysram + src_offset, width * bpp);
 
}
@@ -192,23 +192,26 @@ static int cirrusfb_create(struct drm_fb_helper *helper,
return -ENOMEM;
 
info = drm_fb_helper_alloc_fbi(helper);
-   if (IS_ERR(info))
-   return PTR_ERR(info);
+   if (IS_ERR(info)) {
+   ret = PTR_ERR(info);
+   goto err_vfree;
+   }
 
info->par = gfbdev;
 
-   ret = cirrus_framebuffer_init(cdev->dev, &gfbdev->gfb, &mode_cmd, gobj);
+   fb = kzalloc(sizeof(*fb), GFP_KERNEL);
+   if (!fb) {
+   ret = -ENOMEM;
+   goto err_drm_gem_object_put_unlocked;
+   }
+
+   ret = cirrus_framebuffer_init(cdev->dev, fb, &mode_cmd, gobj);
if (ret)
-   return ret;
+   goto err_kfree;
 
gfbdev->sysram = sysram;
gfbdev->size = size;
-
-   fb = &gfbdev->gfb;
-   if (!fb) {
-   DRM_INFO("fb is NULL\n");
-   return -EINVAL;
-   }
+   gfbdev->gfb = fb;
 
/* setup helper */
gfbdev->helper.fb = fb;
@@ -241,24 +244,27 @@ static int cirrusfb_create(struct drm_fb_helper *helper,
DRM_INFO("   pitch is %d\n", fb->pitches[0]);
 
return 0;
+
+err_kfree:
+   kfree(fb);
+err_drm_gem_object_put_unlocked:
+   drm_gem_object_put_unlocked(gobj);
+err_vfree:
+   vfree(sysram);
+   return ret;
 }
 
 static int cirrus_fbdev_destroy(struct drm_device *dev,
struct cirrus_fbdev *gfbdev)
 {
-   struct drm_framebuffer *gfb = &gfbdev->gfb;
+   struct drm_framebuffer *gfb = gfbdev->gfb;
 
drm_fb_helper_unregister_fbi(&gfbdev->helper);
 
-   if (gfb->obj[0]) {
-   drm_gem_object_put_unlocked(gfb->obj[0]);
-   gfb->obj[0] = NULL;
-   }
-
vfree(gfbdev->sysram);
drm_fb_helper_fini(&gfbdev->helper);
-   drm_framebuffer_unregister_private(gfb);
-   drm_framebuffer_cleanup(gfb);
+   if (gfb)
+   drm_framebuffer_put(gfb);
 
return 0;
 }
diff --git a/drivers/gpu/drm/cirrus/cirrus_mode.c 
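
The commit message above describes two ideas: the framebuffer pointer is only set once initialization has succeeded, and the create path unwinds in reverse order through goto labels. The following is a minimal userspace sketch of that pattern under stated assumptions. The names (`model_fbdev`, `model_create`, `model_destroy`, `model_fb_init`) are hypothetical, and `malloc`/`free` merely stand in for the kernel allocators and `drm_framebuffer_put`.

```c
#include <stdlib.h>

/* Hypothetical userspace model of the create/destroy pairing; not DRM code. */
struct model_fb {
	int initialized;
};

struct model_fbdev {
	struct model_fb *gfb;	/* stays NULL unless creation fully succeeded */
	void *sysram;
};

/* Stand-in for the framebuffer init step; fails on request. */
static int model_fb_init(struct model_fb *fb, int fail)
{
	if (fail)
		return -1;
	fb->initialized = 1;
	return 0;
}

static int model_create(struct model_fbdev *dev, int fail_init)
{
	struct model_fb *fb;
	void *sysram;
	int ret;

	sysram = malloc(64);
	if (!sysram)
		return -1;

	fb = calloc(1, sizeof(*fb));
	if (!fb) {
		ret = -1;
		goto err_free_sysram;
	}

	ret = model_fb_init(fb, fail_init);
	if (ret)
		goto err_free_fb;

	/* Publish the pointers only after every step has succeeded. */
	dev->sysram = sysram;
	dev->gfb = fb;
	return 0;

	/* Unwind in the reverse order of acquisition. */
err_free_fb:
	free(fb);
err_free_sysram:
	free(sysram);
	return ret;
}

static void model_destroy(struct model_fbdev *dev)
{
	free(dev->sysram);
	dev->sysram = NULL;

	/* Skip framebuffers that were never initialized. */
	if (dev->gfb) {
		free(dev->gfb);
		dev->gfb = NULL;
	}
}
```

Because `gfb` is a pointer here, `model_destroy` can tell a failed initialization from a successful one, which is the distinction an embedded struct cannot express.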

[PATCH 4.19 030/103] ASoC: tlv320aic32x4: Kernel OOPS while entering DAPM standby mode

2019-01-29 Thread Greg Kroah-Hartman
4.19-stable review patch.  If anyone has any objections, please let me know.

--

From: b-ak 

commit 667e9334fa64da2273e36ce131b05ac9e47c5769 upstream.

During the bootup of the kernel, the DAPM bias level is in the OFF
state. As soon as the DAPM framework kicks in it pushes the codec
into STANDBY state.

The probe function doesn't prepare the clock, and STANDBY state
does a clk_disable_unprepare() without checking the previous state.
This leads to an OOPS.

Not transitioning from an OFF state to the STANDBY state fixes the
problem.

Signed-off-by: b-ak 
Signed-off-by: Mark Brown 
Cc: sta...@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman 

---
 sound/soc/codecs/tlv320aic32x4.c |4 
 1 file changed, 4 insertions(+)

--- a/sound/soc/codecs/tlv320aic32x4.c
+++ b/sound/soc/codecs/tlv320aic32x4.c
@@ -822,6 +822,10 @@ static int aic32x4_set_bias_level(struct
case SND_SOC_BIAS_PREPARE:
break;
case SND_SOC_BIAS_STANDBY:
+   /* Initial cold start */
+   if (snd_soc_component_get_bias_level(component) == SND_SOC_BIAS_OFF)
+   break;
+
/* Switch off BCLK_N Divider */
snd_soc_component_update_bits(component, AIC32X4_BCLKN,
AIC32X4_BCLKEN, 0);




[PATCH 4.20 034/117] ASoC: tlv320aic32x4: Kernel OOPS while entering DAPM standby mode

2019-01-29 Thread Greg Kroah-Hartman
4.20-stable review patch.  If anyone has any objections, please let me know.

--

From: b-ak 

commit 667e9334fa64da2273e36ce131b05ac9e47c5769 upstream.

During the bootup of the kernel, the DAPM bias level is in the OFF
state. As soon as the DAPM framework kicks in it pushes the codec
into STANDBY state.

The probe function doesn't prepare the clock, and STANDBY state
does a clk_disable_unprepare() without checking the previous state.
This leads to an OOPS.

Not transitioning from an OFF state to the STANDBY state fixes the
problem.

Signed-off-by: b-ak 
Signed-off-by: Mark Brown 
Cc: sta...@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman 

---
 sound/soc/codecs/tlv320aic32x4.c |4 
 1 file changed, 4 insertions(+)

--- a/sound/soc/codecs/tlv320aic32x4.c
+++ b/sound/soc/codecs/tlv320aic32x4.c
@@ -822,6 +822,10 @@ static int aic32x4_set_bias_level(struct
case SND_SOC_BIAS_PREPARE:
break;
case SND_SOC_BIAS_STANDBY:
+   /* Initial cold start */
+   if (snd_soc_component_get_bias_level(component) == SND_SOC_BIAS_OFF)
+   break;
+
/* Switch off BCLK_N Divider */
snd_soc_component_update_bits(component, AIC32X4_BCLKN,
AIC32X4_BCLKEN, 0);




[PATCH v2] ASoC: tlv320aic32x4: Kernel OOPS while entering DAPM standby mode

2019-01-07 Thread b-ak
During the bootup of the kernel, the DAPM bias level is in the OFF
state. As soon as the DAPM framework kicks in it pushes the codec
into STANDBY state.

The probe function doesn't prepare the clock, and STANDBY state
does a clk_disable_unprepare() without checking the previous state.
This leads to an OOPS.

Not transitioning from an OFF state to the STANDBY state fixes the
problem.

Signed-off-by: b-ak 
---
 sound/soc/codecs/tlv320aic32x4.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/sound/soc/codecs/tlv320aic32x4.c b/sound/soc/codecs/tlv320aic32x4.c
index e2b5a11b16d1..f03195d2ab2e 100644
--- a/sound/soc/codecs/tlv320aic32x4.c
+++ b/sound/soc/codecs/tlv320aic32x4.c
@@ -822,6 +822,10 @@ static int aic32x4_set_bias_level(struct snd_soc_component *component,
case SND_SOC_BIAS_PREPARE:
break;
case SND_SOC_BIAS_STANDBY:
+   /* Initial cold start */
+   if (snd_soc_component_get_bias_level(component) == SND_SOC_BIAS_OFF)
+   break;
+
/* Switch off BCLK_N Divider */
snd_soc_component_update_bits(component, AIC32X4_BCLKN,
AIC32X4_BCLKEN, 0);
-- 
2.19.1



Re: [PATCH] ASoC: tlv320aic32x4: Kernel OOPS while entering DAPM standby mode

2019-01-07 Thread b-ak
On Mon, Jan 07, 2019 at 12:59:07PM +, Mark Brown wrote:
> On Sat, Jan 05, 2019 at 10:16:22AM +0530, b-ak wrote:
> 
> > 
> > Hi Mark,
> > 
> > Fixed the build error.
> > 
> > Thanks,
> > Bhargav
> > 
> 
> Please submit patches following the process covered in
> submitting-patches.rst, don't send them as attachments to replies in the
> middle of threads.  Doing that confuses all the tooling for handling
> patches.

Ok. I made a mistake while sending it with Mutt.
Will be sending it inline now.




Re: [PATCH] ASoC: tlv320aic32x4: Kernel OOPS while entering DAPM standby mode

2019-01-07 Thread Mark Brown
On Sat, Jan 05, 2019 at 10:16:22AM +0530, b-ak wrote:

> 
> Hi Mark,
> 
> Fixed the build error.
> 
> Thanks,
> Bhargav
> 

Please submit patches following the process covered in
submitting-patches.rst, don't send them as attachments to replies in the
middle of threads.  Doing that confuses all the tooling for handling
patches.



