Re: [PATCH] net/mlx5: Fix a potential use after free in mlx5e_ktls_del_rx

2021-03-23 Thread Maxim Mikityanskiy

On 2021-03-22 16:21, Lv Yunlong wrote:

My static analyzer tool reported a potential uaf in
mlx5e_ktls_del_rx. In this function, if the condition
cancel_work_sync(>work) is true, and then
priv_rx could be freed. But priv_rx is used later.

I'm unfamiliar with how this function works. Maybe the
maintainer forgot to add return after freeing priv_rx?


Thanks for running a static analyzer over our code! Sadly, the fix is 
not correct and breaks stuff, and there is no problem with this code.


First of all, mlx5e_ktls_priv_rx_put doesn't necessarily free priv_rx. 
It decrements the refcount and frees the object only when the refcount 
goes to zero. Unless there are other bugs, the refcount in this branch 
is not expected to go to zero, so there is no use-after-free in the code 
below. The corresponding elevation of the refcount happens before 
queue_work of resync->work. So, no, we haven't forgot to add a return, 
we just expect priv_rx to stay alive after this call, and we want to run 
the cleanup code below this `if`, while your fix skips the cleanup and 
skips the second mlx5e_ktls_priv_rx_put in the end of this function, 
leading to a memory leak.


If you'd like to calm down the static analyzer, you could try to add a 
WARN_ON assertion to check that mlx5e_ktls_priv_rx_put returns false in 
that `if` (meaning that the object hasn't been freed). If would be nice 
to have this WARN_ON regardless of static analyzers.



Fixes: b850bbff96512 ("net/mlx5e: kTLS, Use refcounts to free kTLS RX priv 
context")
Signed-off-by: Lv Yunlong 
---
  drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
index d06532d0baa4..54a77df42316 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
@@ -663,8 +663,10 @@ void mlx5e_ktls_del_rx(struct net_device *netdev, struct 
tls_context *tls_ctx)
 */
wait_for_completion(_rx->add_ctx);
resync = _rx->resync;
-   if (cancel_work_sync(>work))
+   if (cancel_work_sync(>work)) {
mlx5e_ktls_priv_rx_put(priv_rx);
+   return;
+   }
  
  	priv_rx->stats->tls_del++;

if (priv_rx->rule.rule)





Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue

2021-03-10 Thread Maxim Mikityanskiy

On 2021-03-10 19:03, Eric Dumazet wrote:



On 3/10/21 3:54 PM, Maxim Mikityanskiy wrote:

On 2021-03-09 17:20, Eric Dumazet wrote:



On 3/9/21 4:13 PM, syzbot wrote:

Hello,

syzbot found the following issue on:

HEAD commit:    38b5133a octeontx2-pf: Fix otx2_get_fecparam()
git tree:   net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=166288a8d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbc1ca9e55dc1f9f
dashboard link: https://syzkaller.appspot.com/bug?extid=b53a709f04722ca12a3c
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=119454ccd0

The issue was bisected to:

commit d03b195b5aa015f6c11988b86a3625f8d5dbac52
Author: Maxim Mikityanskiy 
Date:   Tue Jan 19 12:08:13 2021 +

  sch_htb: Hierarchical QoS hardware offload

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13ab12ecd0
final oops: https://syzkaller.appspot.com/x/report.txt?x=106b12ecd0
console output: https://syzkaller.appspot.com/x/log.txt?x=17ab12ecd0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+b53a709f04722ca12...@syzkaller.appspotmail.com
Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload")

BUG: kernel NULL pointer dereference, address: 
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 183fe067 P4D 183fe067 PUD 21aef067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 10125 Comm: syz-executor.0 Not tainted 5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:0x0
Code: Unable to access opcode bytes at RIP 0xffd6.
RSP: 0018:c9000a9c74e8 EFLAGS: 00010246
RAX: dc00 RBX: 192001538e9e RCX: 
RDX: c9000a9c7520 RSI: 0012 RDI: 88802d158000
RBP: 88802d158000 R08: fff1 R09: 0400
R10: 871631c4 R11:  R12: 89ea6b40
R13: dc00 R14: 888012b79c00 R15: 
FS:  7f73f9698700() GS:8880b9c0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: ffd6 CR3: 173b5000 CR4: 001506f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
   htb_offload net/sched/sch_htb.c:1011 [inline]
   htb_select_queue+0x17f/0x2c0 net/sched/sch_htb.c:1349
   tc_modify_qdisc+0x44a/0x1a50 net/sched/sch_api.c:1657
   rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553
   netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
   netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
   netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
   netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
   sock_sendmsg_nosec net/socket.c:652 [inline]
   sock_sendmsg+0xcf/0x120 net/socket.c:672
   sys_sendmsg+0x6e8/0x810 net/socket.c:2348
   ___sys_sendmsg+0xf3/0x170 net/socket.c:2402
   __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435
   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x466019
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 
48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 
c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:7f73f9698188 EFLAGS: 0246 ORIG_RAX: 002e
RAX: ffda RBX: 0056bf60 RCX: 00466019
RDX:  RSI: 27c0 RDI: 0004
RBP: 004bd067 R08:  R09: 
R10:  R11: 0246 R12: 0056bf60
R13: 7fffefccc11f R14: 7f73f9698300 R15: 00022000
Modules linked in:
CR2: 
---[ end trace e1544e8206616773 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at RIP 0xffd6.
RSP: 0018:c9000a9c74e8 EFLAGS: 00010246
RAX: dc00 RBX: 192001538e9e RCX: 
RDX: c9000a9c7520 RSI: 0012 RDI: 88802d158000
RBP: 88802d158000 R08: fff1 R09: 0400
R10: 871631c4 R11:  R12: 89ea6b40
R13: dc00 R14: 888012b79c00 R15: 
FS:  7f73f9698700() GS:8880b9d0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: ffd6 CR3: 173b5000 CR4: 001506e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for ho

Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue

2021-03-10 Thread Maxim Mikityanskiy

On 2021-03-09 17:20, Eric Dumazet wrote:



On 3/9/21 4:13 PM, syzbot wrote:

Hello,

syzbot found the following issue on:

HEAD commit:38b5133a octeontx2-pf: Fix otx2_get_fecparam()
git tree:   net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=166288a8d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbc1ca9e55dc1f9f
dashboard link: https://syzkaller.appspot.com/bug?extid=b53a709f04722ca12a3c
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=119454ccd0

The issue was bisected to:

commit d03b195b5aa015f6c11988b86a3625f8d5dbac52
Author: Maxim Mikityanskiy 
Date:   Tue Jan 19 12:08:13 2021 +

 sch_htb: Hierarchical QoS hardware offload

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13ab12ecd0
final oops: https://syzkaller.appspot.com/x/report.txt?x=106b12ecd0
console output: https://syzkaller.appspot.com/x/log.txt?x=17ab12ecd0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+b53a709f04722ca12...@syzkaller.appspotmail.com
Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload")

BUG: kernel NULL pointer dereference, address: 
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 183fe067 P4D 183fe067 PUD 21aef067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 10125 Comm: syz-executor.0 Not tainted 5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:0x0
Code: Unable to access opcode bytes at RIP 0xffd6.
RSP: 0018:c9000a9c74e8 EFLAGS: 00010246
RAX: dc00 RBX: 192001538e9e RCX: 
RDX: c9000a9c7520 RSI: 0012 RDI: 88802d158000
RBP: 88802d158000 R08: fff1 R09: 0400
R10: 871631c4 R11:  R12: 89ea6b40
R13: dc00 R14: 888012b79c00 R15: 
FS:  7f73f9698700() GS:8880b9c0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: ffd6 CR3: 173b5000 CR4: 001506f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
  htb_offload net/sched/sch_htb.c:1011 [inline]
  htb_select_queue+0x17f/0x2c0 net/sched/sch_htb.c:1349
  tc_modify_qdisc+0x44a/0x1a50 net/sched/sch_api.c:1657
  rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553
  netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
  netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
  netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
  netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
  sock_sendmsg_nosec net/socket.c:652 [inline]
  sock_sendmsg+0xcf/0x120 net/socket.c:672
  sys_sendmsg+0x6e8/0x810 net/socket.c:2348
  ___sys_sendmsg+0xf3/0x170 net/socket.c:2402
  __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435
  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x466019
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 
48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 
c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:7f73f9698188 EFLAGS: 0246 ORIG_RAX: 002e
RAX: ffda RBX: 0056bf60 RCX: 00466019
RDX:  RSI: 27c0 RDI: 0004
RBP: 004bd067 R08:  R09: 
R10:  R11: 0246 R12: 0056bf60
R13: 7fffefccc11f R14: 7f73f9698300 R15: 00022000
Modules linked in:
CR2: 
---[ end trace e1544e8206616773 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at RIP 0xffd6.
RSP: 0018:c9000a9c74e8 EFLAGS: 00010246
RAX: dc00 RBX: 192001538e9e RCX: 
RDX: c9000a9c7520 RSI: 0012 RDI: 88802d158000
RBP: 88802d158000 R08: fff1 R09: 0400
R10: 871631c4 R11:  R12: 89ea6b40
R13: dc00 R14: 888012b79c00 R15: 
FS:  7f73f9698700() GS:8880b9d0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: ffd6 CR3: 173b5000 CR4: 001506e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot 

[PATCH] HID: plantronics: Workaround for double volume key presses

2021-02-07 Thread Maxim Mikityanskiy
Plantronics Blackwire 3220 Series (047f:c056) sends HID reports twice
for each volume key press. This patch adds a quirk to hid-plantronics
for this product ID, which will ignore the second volume key press if
it happens within 5 ms from the last one that was handled.

The patch was tested on the mentioned model only, it shouldn't affect
other models, however, this quirk might be needed for them too.
Auto-repeat (when a key is held pressed) is not affected, because the
rate is about 3 times per second, which is far less frequent than once
in 5 ms.

Fixes: 81bb773faed7 ("HID: plantronics: Update to map volume up/down controls")
Signed-off-by: Maxim Mikityanskiy 
---
People from Plantronics, maybe you could advise on a better fix than
filtering duplicate events on driver level? Do you happen to know why
they occur in the first place? Are any other headsets affected?

 drivers/hid/hid-ids.h |  1 +
 drivers/hid/hid-plantronics.c | 60 +--
 include/linux/hid.h   |  2 ++
 3 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
index 5ba0aa1d2335..5ac52b4d900d 100644
--- a/drivers/hid/hid-ids.h
+++ b/drivers/hid/hid-ids.h
@@ -937,6 +937,7 @@
 #define USB_DEVICE_ID_ORTEK_IHOME_IMAC_A210S   0x8003
 
 #define USB_VENDOR_ID_PLANTRONICS  0x047f
+#define USB_DEVICE_ID_PLANTRONICS_BLACKWIRE_3220_SERIES0xc056
 
 #define USB_VENDOR_ID_PANASONIC0x04da
 #define USB_DEVICE_ID_PANABOARD_UBT780 0x1044
diff --git a/drivers/hid/hid-plantronics.c b/drivers/hid/hid-plantronics.c
index 85b685efc12f..e81b7cec2d12 100644
--- a/drivers/hid/hid-plantronics.c
+++ b/drivers/hid/hid-plantronics.c
@@ -13,6 +13,7 @@
 
 #include 
 #include 
+#include 
 
 #define PLT_HID_1_0_PAGE   0xffa0
 #define PLT_HID_2_0_PAGE   0xffa2
@@ -36,6 +37,16 @@
 #define PLT_ALLOW_CONSUMER (field->application == HID_CP_CONSUMERCONTROL && \
(usage->hid & HID_USAGE_PAGE) == HID_UP_CONSUMER)
 
+#define PLT_QUIRK_DOUBLE_VOLUME_KEYS BIT(0)
+
+#define PLT_DOUBLE_KEY_TIMEOUT 5 /* ms */
+
+struct plt_drv_data {
+   unsigned long device_type;
+   unsigned long last_volume_key_ts;
+   u32 quirks;
+};
+
 static int plantronics_input_mapping(struct hid_device *hdev,
 struct hid_input *hi,
 struct hid_field *field,
@@ -43,7 +54,8 @@ static int plantronics_input_mapping(struct hid_device *hdev,
 unsigned long **bit, int *max)
 {
unsigned short mapped_key;
-   unsigned long plt_type = (unsigned long)hid_get_drvdata(hdev);
+   struct plt_drv_data *drv_data = hid_get_drvdata(hdev);
+   unsigned long plt_type = drv_data->device_type;
 
/* special case for PTT products */
if (field->application == HID_GD_JOYSTICK)
@@ -105,6 +117,30 @@ static int plantronics_input_mapping(struct hid_device 
*hdev,
return 1;
 }
 
+static int plantronics_event(struct hid_device *hdev, struct hid_field *field,
+struct hid_usage *usage, __s32 value)
+{
+   struct plt_drv_data *drv_data = hid_get_drvdata(hdev);
+
+   if (drv_data->quirks & PLT_QUIRK_DOUBLE_VOLUME_KEYS) {
+   unsigned long prev_ts, cur_ts;
+
+   /* Usages are filtered in plantronics_usages. */
+
+   if (!value) /* Handle key presses only. */
+   return 0;
+
+   prev_ts = drv_data->last_volume_key_ts;
+   cur_ts = jiffies;
+   if (jiffies_to_msecs(cur_ts - prev_ts) <= 
PLT_DOUBLE_KEY_TIMEOUT)
+   return 1; /* Ignore the repeated key. */
+
+   drv_data->last_volume_key_ts = cur_ts;
+   }
+
+   return 0;
+}
+
 static unsigned long plantronics_device_type(struct hid_device *hdev)
 {
unsigned i, col_page;
@@ -133,15 +169,24 @@ static unsigned long plantronics_device_type(struct 
hid_device *hdev)
 static int plantronics_probe(struct hid_device *hdev,
 const struct hid_device_id *id)
 {
+   struct plt_drv_data *drv_data;
int ret;
 
+   drv_data = devm_kzalloc(>dev, sizeof(*drv_data), GFP_KERNEL);
+   if (!drv_data)
+   return -ENOMEM;
+
ret = hid_parse(hdev);
if (ret) {
hid_err(hdev, "parse failed\n");
goto err;
}
 
-   hid_set_drvdata(hdev, (void *)plantronics_device_type(hdev));
+   drv_data->device_type = plantronics_device_type(hdev);
+   drv_data->quirks = id->driver_data;
+   drv_data->last_volume_key_ts = jiffies - 
msecs_to_jiffies(PLT_DOUBLE_KEY_TIMEOUT);
+
+   hid_set_drvdata(hdev, drv_data);
 
ret = hid_hw_start(hdev, HID_CONNECT_DEFAULT |
HID_CONNECT_HIDINPUT_FORCE | HID_CONNECT_HIDDEV_FORCE);

[PATCH] media: usbtv: Fix deadlock on suspend

2021-02-05 Thread Maxim Mikityanskiy
usbtv doesn't support power management, so on system suspend the
.disconnect callback of the driver is called. The teardown sequence
includes a call to snd_card_free. Its implementation waits until the
refcount of the sound card device drops to zero, however, if its file is
open, snd_card_file_add takes a reference, which can't be dropped during
the suspend, because the userspace processes are already frozen at this
point. snd_card_free waits for completion forever, leading to a hang on
suspend.

This commit fixes this deadlock condition by replacing snd_card_free
with snd_card_free_when_closed, that doesn't wait until all references
are released, allowing suspend to progress.

Fixes: 63ddf68de52e ("[media] usbtv: add audio support")
Signed-off-by: Maxim Mikityanskiy 
---
 drivers/media/usb/usbtv/usbtv-audio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/usb/usbtv/usbtv-audio.c 
b/drivers/media/usb/usbtv/usbtv-audio.c
index b57e94fb1977..333bd305a4f9 100644
--- a/drivers/media/usb/usbtv/usbtv-audio.c
+++ b/drivers/media/usb/usbtv/usbtv-audio.c
@@ -371,7 +371,7 @@ void usbtv_audio_free(struct usbtv *usbtv)
cancel_work_sync(>snd_trigger);
 
if (usbtv->snd && usbtv->udev) {
-   snd_card_free(usbtv->snd);
+   snd_card_free_when_closed(usbtv->snd);
usbtv->snd = NULL;
}
 }
-- 
2.30.0



Re: [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance

2021-01-26 Thread Maxim Mikityanskiy
On Wed, Jan 27, 2021 at 6:23 AM Bart Van Assche  wrote:
>
> On 1/26/21 11:59 AM, Maxim Mikityanskiy wrote:
> > The cited commit introduced a serious regression with SATA write speed,
> > as found by bisecting. This patch reverts this commit, which restores
> > write speed back to the values observed before this commit.
> >
> > The performance tests were done on a Helios4 NAS (2nd batch) with 4 HDDs
> > (WD8003FFBX) using dd (bs=1M count=2000). "Direct" is a test with a
> > single HDD, the rest are different RAID levels built over the first
> > partitions of 4 HDDs. Test results are in MB/s, R is read, W is write.
> >
> > | Direct | RAID0 | RAID10 f2 | RAID10 n2 | RAID6
> > ++---+---+---+
> > 9011495c9466| R:256  | R:313 | R:276 | R:313 | R:323
> > (before faulty) | W:254  | W:253 | W:195 | W:204 | W:117
> > ++---+---+---+
> > 5ff9f19231a0| R:257  | R:398 | R:312 | R:344 | R:391
> > (faulty commit) | W:154  | W:122 | W:67.7| W:66.6| W:67.2
> > ++---+---+---+
> > 5.10.10 | R:256  | R:401 | R:312 | R:356 | R:375
> > unpatched   | W:149  | W:123 | W:64  | W:64.1| W:61.5
> > ++---+---+---+
> > 5.10.10 | R:255  | R:396 | R:312 | R:340 | R:393
> > patched | W:247  | W:274 | W:220 | W:225 | W:121
> >
> > Applying this patch doesn't hurt read performance, while improves the
> > write speed by 1.5x - 3.5x (more impact on RAID tests). The write speed
> > is restored back to the state before the faulty commit, and even a bit
> > higher in RAID tests (which aren't HDD-bound on this device) - that is
> > likely related to other optimizations done between the faulty commit and
> > 5.10.10 which also improved the read speed.
> >
> > Signed-off-by: Maxim Mikityanskiy 
> > Fixes: 5ff9f19231a0 ("block: simplify set_init_blocksize")
> > Cc: Christoph Hellwig 
> > Cc: Jens Axboe 
> > ---
> >  fs/block_dev.c | 10 +-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/block_dev.c b/fs/block_dev.c
> > index 3b8963e228a1..235b5042672e 100644
> > --- a/fs/block_dev.c
> > +++ b/fs/block_dev.c
> > @@ -130,7 +130,15 @@ EXPORT_SYMBOL(truncate_bdev_range);
> >
> >  static void set_init_blocksize(struct block_device *bdev)
> >  {
> > - bdev->bd_inode->i_blkbits = 
> > blksize_bits(bdev_logical_block_size(bdev));
> > + unsigned int bsize = bdev_logical_block_size(bdev);
> > + loff_t size = i_size_read(bdev->bd_inode);
> > +
> > + while (bsize < PAGE_SIZE) {
> > + if (size & bsize)
> > + break;
> > + bsize <<= 1;
> > + }
> > + bdev->bd_inode->i_blkbits = blksize_bits(bsize);
> >  }
> >
> >  int set_blocksize(struct block_device *bdev, int size)
>
> How can this patch affect write speed? I haven't found any calls of
> set_init_blocksize() in the I/O path. Did I perhaps overlook something?

I don't know the exact mechanism how this change affects the speed,
I'm not an expert in the block device subsystem (I'm a networking
guy). This commit was found by git bisect, and my performance test
confirmed that reverting it fixes the bug.

It looks to me as this function sets the block size as part of control
flow, and this size is used later in the fast path, and the commit
that removed the loop decreased this block size.

> Bart.
>
>


[PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance

2021-01-26 Thread Maxim Mikityanskiy
The cited commit introduced a serious regression with SATA write speed,
as found by bisecting. This patch reverts this commit, which restores
write speed back to the values observed before this commit.

The performance tests were done on a Helios4 NAS (2nd batch) with 4 HDDs
(WD8003FFBX) using dd (bs=1M count=2000). "Direct" is a test with a
single HDD, the rest are different RAID levels built over the first
partitions of 4 HDDs. Test results are in MB/s, R is read, W is write.

| Direct | RAID0 | RAID10 f2 | RAID10 n2 | RAID6
++---+---+---+
9011495c9466| R:256  | R:313 | R:276 | R:313 | R:323
(before faulty) | W:254  | W:253 | W:195 | W:204 | W:117
++---+---+---+
5ff9f19231a0| R:257  | R:398 | R:312 | R:344 | R:391
(faulty commit) | W:154  | W:122 | W:67.7| W:66.6| W:67.2
++---+---+---+
5.10.10 | R:256  | R:401 | R:312 | R:356 | R:375
unpatched   | W:149  | W:123 | W:64  | W:64.1| W:61.5
++---+---+---+
5.10.10 | R:255  | R:396 | R:312 | R:340 | R:393
patched | W:247  | W:274 | W:220 | W:225 | W:121

Applying this patch doesn't hurt read performance, while improves the
write speed by 1.5x - 3.5x (more impact on RAID tests). The write speed
is restored back to the state before the faulty commit, and even a bit
higher in RAID tests (which aren't HDD-bound on this device) - that is
likely related to other optimizations done between the faulty commit and
5.10.10 which also improved the read speed.

Signed-off-by: Maxim Mikityanskiy 
Fixes: 5ff9f19231a0 ("block: simplify set_init_blocksize")
Cc: Christoph Hellwig 
Cc: Jens Axboe 
---
 fs/block_dev.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 3b8963e228a1..235b5042672e 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -130,7 +130,15 @@ EXPORT_SYMBOL(truncate_bdev_range);
 
 static void set_init_blocksize(struct block_device *bdev)
 {
-   bdev->bd_inode->i_blkbits = blksize_bits(bdev_logical_block_size(bdev));
+   unsigned int bsize = bdev_logical_block_size(bdev);
+   loff_t size = i_size_read(bdev->bd_inode);
+
+   while (bsize < PAGE_SIZE) {
+   if (size & bsize)
+   break;
+   bsize <<= 1;
+   }
+   bdev->bd_inode->i_blkbits = blksize_bits(bsize);
 }
 
 int set_blocksize(struct block_device *bdev, int size)
-- 
2.30.0



Re: [PATCH] [net-next] net/mlx5e: xsk: dynamically allocate mlx5e_channel_param

2019-07-23 Thread Maxim Mikityanskiy
On 2019-07-08 18:16, Maxim Mikityanskiy wrote:
> On 2019-07-08 15:55, Arnd Bergmann wrote:
>> -mlx5e_build_xsk_cparam(priv, params, xsk, );
>> +cparam = kzalloc(sizeof(*cparam), GFP_KERNEL);
> 
> Similar code in mlx5e_open_channels (en_main.c) uses kvzalloc. Although
> the struct is currently smaller than a page anyway, and there should be
> no difference in behavior now, I suggest using the same alloc function
> to keep code uniform.
> 
>>  /* Create a dedicated SQ for posting NOPs whenever we need an IRQ to be
>>   * triggered and NAPI to be called on the correct CPU.
>>   */
>> -err = mlx5e_open_icosq(c, params, , >xskicosq);
>> +err = mlx5e_open_icosq(c, params, >icosq, >xskicosq);
>>  if (unlikely(err))
>>  goto err_close_icocq;
>>
> 
> Here is kfree missing. It's a memory leak in the good path.

Arnd, I'm going to take over your patch and respin it, addressing my own 
comments, because it's been quite a while, and we want to have this fix.

Thanks for spotting it.


Re: [PATCH] [net-next] net/mlx5e: xsk: dynamically allocate mlx5e_channel_param

2019-07-08 Thread Maxim Mikityanskiy
On 2019-07-08 15:55, Arnd Bergmann wrote:
> The structure is too large to put on the stack, resulting in a
> warning on 32-bit ARM:
> 
> drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c:59:5: error: stack 
> frame size of 1344 bytes in function
>'mlx5e_open_xsk' [-Werror,-Wframe-larger-than=]
> 
> Use kzalloc() instead.
> 
> Fixes: a038e9794541 ("net/mlx5e: Add XSK zero-copy support")
> Signed-off-by: Arnd Bergmann 
> ---
>   .../mellanox/mlx5/core/en/xsk/setup.c | 25 ---
>   1 file changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c 
> b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
> index aaffa6f68dc0..db9bbec68dbf 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
> @@ -60,24 +60,28 @@ int mlx5e_open_xsk(struct mlx5e_priv *priv, struct 
> mlx5e_params *params,
>  struct mlx5e_xsk_param *xsk, struct xdp_umem *umem,
>  struct mlx5e_channel *c)
>   {
> - struct mlx5e_channel_param cparam = {};
> + struct mlx5e_channel_param *cparam;
>   struct dim_cq_moder icocq_moder = {};
>   int err;
>   
>   if (!mlx5e_validate_xsk_param(params, xsk, priv->mdev))
>   return -EINVAL;
>   
> - mlx5e_build_xsk_cparam(priv, params, xsk, );
> + cparam = kzalloc(sizeof(*cparam), GFP_KERNEL);

Similar code in mlx5e_open_channels (en_main.c) uses kvzalloc. Although 
the struct is currently smaller than a page anyway, and there should be 
no difference in behavior now, I suggest using the same alloc function 
to keep code uniform.

> + if (!cparam)
> + return -ENOMEM;
>   
> - err = mlx5e_open_cq(c, params->rx_cq_moderation, _cq, 
> >xskrq.cq);
> + mlx5e_build_xsk_cparam(priv, params, xsk, cparam);
> +
> + err = mlx5e_open_cq(c, params->rx_cq_moderation, >rx_cq, 
> >xskrq.cq);
>   if (unlikely(err))
> - return err;
> + goto err_kfree_cparam;
>   
> - err = mlx5e_open_rq(c, params, , xsk, umem, >xskrq);
> + err = mlx5e_open_rq(c, params, >rq, xsk, umem, >xskrq);
>   if (unlikely(err))
>   goto err_close_rx_cq;
>   
> - err = mlx5e_open_cq(c, params->tx_cq_moderation, _cq, 
> >xsksq.cq);
> + err = mlx5e_open_cq(c, params->tx_cq_moderation, >tx_cq, 
> >xsksq.cq);
>   if (unlikely(err))
>   goto err_close_rq;
>   
> @@ -87,18 +91,18 @@ int mlx5e_open_xsk(struct mlx5e_priv *priv, struct 
> mlx5e_params *params,
>* is disabled and then reenabled, but the SQ continues receiving CQEs
>* from the old UMEM.
>*/
> - err = mlx5e_open_xdpsq(c, params, _sq, umem, >xsksq, 
> true);
> + err = mlx5e_open_xdpsq(c, params, >xdp_sq, umem, >xsksq, 
> true);
>   if (unlikely(err))
>   goto err_close_tx_cq;
>   
> - err = mlx5e_open_cq(c, icocq_moder, _cq, >xskicosq.cq);
> + err = mlx5e_open_cq(c, icocq_moder, >icosq_cq, >xskicosq.cq);
>   if (unlikely(err))
>   goto err_close_sq;
>   
>   /* Create a dedicated SQ for posting NOPs whenever we need an IRQ to be
>* triggered and NAPI to be called on the correct CPU.
>*/
> - err = mlx5e_open_icosq(c, params, , >xskicosq);
> + err = mlx5e_open_icosq(c, params, >icosq, >xskicosq);
>   if (unlikely(err))
>   goto err_close_icocq;
>   

Here is kfree missing. It's a memory leak in the good path.

Thanks!

> @@ -123,6 +127,9 @@ int mlx5e_open_xsk(struct mlx5e_priv *priv, struct 
> mlx5e_params *params,
>   err_close_rx_cq:
>   mlx5e_close_cq(>xskrq.cq);
>   
> +err_kfree_cparam:
> + kfree(cparam);
> +
>   return err;
>   }
>   
> 



Re: [PATCH 1/1] Address regression in inet6_validate_link_af

2019-06-13 Thread Maxim Mikityanskiy
On 2019-06-13 09:45, Jonas Bonn wrote:
> Hi Max,
> 
> On 12/06/2019 12:42, Maxim Mikityanskiy wrote:
>> On 2019-06-11 13:03, Jonas Bonn wrote:
>>> Patch 7dc2bccab0ee37ac28096b8fcdc390a679a15841 introduces a regression
>>> with systemd 241.  In that revision, systemd-networkd fails to pass the
>>> required flags early enough.  This appears to be addressed in later
>>> versions of systemd, but for users of version 241 where systemd-networkd
>>> nonetheless worked with earlier kernels, the strict check introduced by
>>> the patch causes a regression in behaviour.
>>>
>>> This patch converts the failure to supply the required flags from an
>>> error into a warning.
>> The purpose of my patch was to prevent a partial configuration update on
>> invalid input. -EINVAL was returned both before and after my patch, the
>> difference is that before my patch there was a partial update and a 
>> warning.
>>
>> Your patch basically makes mine pointless, because you revert the fix,
>> and now we'll have the same partial update and two warnings.
> 
> Unfortunately, yes...

So what you propose is a degradation.

>>
>> One more thing is that after applying your patch on top of mine, the
>> kernel won't return -EINVAL anymore on invalid input. Returning -EINVAL
>> is what happened before my patch, and also after my patch.
> 
> Yes, you're right, it would probably be better revert the entire patch 
> because the checks in set_link_af have been dropped on the assumption 
> that validate_link_af catches the badness.

We shouldn't introduce workarounds in the kernel for some temporary bugs 
in old userspace. A regression was introduced in systemd: it started 
sending invalid messages, didn't check the return code properly and 
relied on an undefined behavior. It was then fixed in systemd. If the 
kernel had all kinds of workarounds for all buggy software ever existed, 
it would be a complete mess. The software used the API in a wrong way, 
the expected behavior is failure, so it shouldn't expect anything else. 
For me, the trade-off between fixing the kernel behavior and supporting 
some old buggy software while keeping a UB in the kernel forever has an 
obvious resolution.

>>
>> Regarding the systemd issue, I don't think we should change the kernel
>> to adapt to bugs in systemd. systemd didn't have this bug from day one,
>> it was a regression introduced in [1]. The kernel has always returned
>> -EINVAL here, but the behavior before my patch was basically a UB, and
>> after the patch it's well-defined. If systemd saw EINVAL and relied on
>> the UB that came with it, it can't be a reason enough to break the 
>> kernel.
>>
>> Moreover, the bug looks fixed in systemd's master, so what you suggest
>> is to insert a kernel-side workaround for an old version of software
>> when there is a fixed one.
> 
> I agree, systemd is buggy here.  Probably what happens is:
> 
> i)  systemd tries to set the link up
> ii)  it ends up doing a "partial" modification of the link state; 
> critically, though, enough to actually effect the link state changing to UP
> iii)  systemd sees the -EINVAL error and decides it probably failed to 
> bring up the link
> iv)  systemd then gets a notification that the link is up and runs a 
> DHCP client on it
> 
> I haven't noticed any "partial modification" warnings in the kernel log 
> but I wasn't looking for them, either...
> 
> With the new behaviour in 5.2, step ii) above results in no "partial 
> modification" so the link remains down and systemd is forever unable to 
> bring it up.
> 
> Anyway, for the record, the error is:
> 
> systemd:  Could not bring up interface... (invalid parameter)
> 
> And the solution is:  Linux >= 5.2 requires systemd != v241.
> 
> If nobody else notices, that's good enough for me.
> 
>>
>> Please correct me if anything I say is wrong.
> 
> Nothing wrong, but it's still a regression.
> 
> /Jonas
> 
>>
>> Thanks,
>> Max
>>
>> [1]:
>> https://github.com/systemd/systemd/commit/0e2fdb83bb5e22047e0c7cc058b415d0e93f02cf
>>  
>>
>>
>>> With this, systemd-networkd version 241 once
>>> again is able to bring up the link, albeit not quite as intended and
>>> thereby with a warning in the kernel log.
>>>
>>> CC: Maxim Mikityanskiy 
>>> CC: David S. Miller 
>>> CC: Alexey Kuznetsov 
>>> CC: Hideaki YOSHIFUJI 
>>> Signed-off-by: Jonas Bonn 
>>> ---
>>>    net/ipv6/addrconf.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>

Re: [PATCH 1/1] Address regression in inet6_validate_link_af

2019-06-12 Thread Maxim Mikityanskiy
On 2019-06-11 13:03, Jonas Bonn wrote:
> Patch 7dc2bccab0ee37ac28096b8fcdc390a679a15841 introduces a regression
> with systemd 241.  In that revision, systemd-networkd fails to pass the
> required flags early enough.  This appears to be addressed in later
> versions of systemd, but for users of version 241 where systemd-networkd
> nonetheless worked with earlier kernels, the strict check introduced by
> the patch causes a regression in behaviour.
> 
> This patch converts the failure to supply the required flags from an
> error into a warning.
The purpose of my patch was to prevent a partial configuration update on 
invalid input. -EINVAL was returned both before and after my patch, the 
difference is that before my patch there was a partial update and a warning.

Your patch basically makes mine pointless, because you revert the fix, 
and now we'll have the same partial update and two warnings.

One more thing is that after applying your patch on top of mine, the 
kernel won't return -EINVAL anymore on invalid input. Returning -EINVAL 
is what happened before my patch, and also after my patch.

Regarding the systemd issue, I don't think we should change the kernel 
to adapt to bugs in systemd. systemd didn't have this bug from day one, 
it was a regression introduced in [1]. The kernel has always returned 
-EINVAL here, but the behavior before my patch was basically a UB, and 
after the patch it's well-defined. If systemd saw EINVAL and relied on 
the UB that came with it, it can't be a reason enough to break the kernel.

Moreover, the bug looks fixed in systemd's master, so what you suggest 
is to insert a kernel-side workaround for an old version of software 
when there is a fixed one.

Please correct me if anything I say is wrong.

Thanks,
Max

[1]: 
https://github.com/systemd/systemd/commit/0e2fdb83bb5e22047e0c7cc058b415d0e93f02cf

> With this, systemd-networkd version 241 once
> again is able to bring up the link, albeit not quite as intended and
> thereby with a warning in the kernel log.
> 
> CC: Maxim Mikityanskiy 
> CC: David S. Miller 
> CC: Alexey Kuznetsov 
> CC: Hideaki YOSHIFUJI 
> Signed-off-by: Jonas Bonn 
> ---
>   net/ipv6/addrconf.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index 081bb517e40d..e2477bf92e12 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -5696,7 +5696,8 @@ static int inet6_validate_link_af(const struct 
> net_device *dev,
>   return err;
>   
>   if (!tb[IFLA_INET6_TOKEN] && !tb[IFLA_INET6_ADDR_GEN_MODE])
> - return -EINVAL;
> + net_warn_ratelimited(
> + "required link flag omitted: TOKEN/ADDR_GEN_MODE\n");
>   
>   if (tb[IFLA_INET6_ADDR_GEN_MODE]) {
>   u8 mode = nla_get_u8(tb[IFLA_INET6_ADDR_GEN_MODE]);
> 



Re: [PATCH] platform/x86: intel_int0002_vgpio: Only implement irq_set_wake on Bay Trail

2019-03-04 Thread Maxim Mikityanskiy
On Tue, Feb 5, 2019 at 8:20 PM Andy Shevchenko
 wrote:
>
> On Sun, Feb 3, 2019 at 11:42 AM Hans de Goede  wrote:
> >
> > Commit c3b8e884defa ("platform/x86: intel_int0002_vgpio: Implement
> > irq_set_wake"), was written to fix some wakeup issues on Bay Trail (BYT)
> > devices.
> >
> > We've received a bug report that this causes a suspend regression on some
> > Cherry Trail (CHT) based devices.
> >
> > To fix the issues this causes on CHT devices, this commit modifies the
> > irq_set_wake support so that we only implement irq_set_wake on BYT devices,
> >
>
> Pushed to my review and testing queue, thanks!

Kernel 5.0 was released without this patch, is it still planned to merge it?

> > Fixes: c3b8e884defa ("platform/x86: intel_int0002_vgpio: ... irq_set_wake")
> > Reported-and-tested-by: Maxim Mikityanskiy 
> > Signed-off-by: Hans de Goede 
> > ---
> >  drivers/platform/x86/intel_int0002_vgpio.c | 32 ++
> >  1 file changed, 26 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/platform/x86/intel_int0002_vgpio.c 
> > b/drivers/platform/x86/intel_int0002_vgpio.c
> > index 4b8f7305fc8a..78787934b572 100644
> > --- a/drivers/platform/x86/intel_int0002_vgpio.c
> > +++ b/drivers/platform/x86/intel_int0002_vgpio.c
> > @@ -51,11 +51,14 @@
> >  #define GPE0A_STS_PORT 0x420
> >  #define GPE0A_EN_PORT  0x428
> >
> > -#define ICPU(model){ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, }
> > +#define BAYTRAIL   0x01
> > +#define CHERRYTRAIL0x02
> > +
> > +#define ICPU(model, data) { X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, 
> > data}
> >
> >  static const struct x86_cpu_id int0002_cpu_ids[] = {
> > -   ICPU(INTEL_FAM6_ATOM_SILVERMONT),   /* Valleyview, Bay Trail  */
> > -   ICPU(INTEL_FAM6_ATOM_AIRMONT),  /* Braswell, Cherry Trail */
> > +   ICPU(INTEL_FAM6_ATOM_SILVERMONT, BAYTRAIL), /* Valleyview, Bay 
> > Trail  */
> > +   ICPU(INTEL_FAM6_ATOM_AIRMONT, CHERRYTRAIL), /* Braswell, Cherry 
> > Trail */
> > {}
> >  };
> >
> > @@ -135,7 +138,7 @@ static irqreturn_t int0002_irq(int irq, void *data)
> > return IRQ_HANDLED;
> >  }
> >
> > -static struct irq_chip int0002_irqchip = {
> > +static struct irq_chip int0002_byt_irqchip = {
> > .name   = DRV_NAME,
> > .irq_ack= int0002_irq_ack,
> > .irq_mask   = int0002_irq_mask,
> > @@ -143,10 +146,22 @@ static struct irq_chip int0002_irqchip = {
> > .irq_set_wake   = int0002_irq_set_wake,
> >  };
> >
> > +static struct irq_chip int0002_cht_irqchip = {
> > +   .name   = DRV_NAME,
> > +   .irq_ack= int0002_irq_ack,
> > +   .irq_mask   = int0002_irq_mask,
> > +   .irq_unmask = int0002_irq_unmask,
> > +   /*
> > +* No set_wake, on CHT the IRQ is typically shared with the ACPI SCI
> > +* and we don't want to mess with the ACPI SCI irq settings.
> > +*/
> > +};
> > +
> >  static int int0002_probe(struct platform_device *pdev)
> >  {
> > struct device *dev = >dev;
> > const struct x86_cpu_id *cpu_id;
> > +   struct irq_chip *irq_chip;
> > struct gpio_chip *chip;
> > int irq, ret;
> >
> > @@ -195,14 +210,19 @@ static int int0002_probe(struct platform_device *pdev)
> > return ret;
> > }
> >
> > -   ret = gpiochip_irqchip_add(chip, _irqchip, 0, 
> > handle_edge_irq,
> > +   if (cpu_id->driver_data == BAYTRAIL)
> > +   irq_chip = _byt_irqchip;
> > +   else
> > +   irq_chip = _cht_irqchip;
> > +
> > +   ret = gpiochip_irqchip_add(chip, irq_chip, 0, handle_edge_irq,
> >IRQ_TYPE_NONE);
> > if (ret) {
> > dev_err(dev, "Error adding irqchip: %d\n", ret);
> > return ret;
> > }
> >
> > -   gpiochip_set_chained_irqchip(chip, _irqchip, irq, NULL);
> > +   gpiochip_set_chained_irqchip(chip, irq_chip, irq, NULL);
> >
> > return 0;
> >  }
> > --
> > 2.20.1
> >
>
>
> --
> With Best Regards,
> Andy Shevchenko


Re: [PATCH] platform/x86: intel_int0002_vgpio: Implement irq_set_wake

2019-02-02 Thread Maxim Mikityanskiy
On Sat, Feb 2, 2019 at 12:52 AM Hans de Goede  wrote:
>
> Hi,
>
> On 1/31/19 8:47 PM, Maxim Mikityanskiy wrote:
> > Hi,
> >
> > On Mon, Sep 24, 2018 at 5:37 PM Hans de Goede  wrote:
> >>
> >> We were relying on the interrupt being shared with the ACPI SCI and the
> >> ACPI core calling irq_set_wake. But that does not always happen on
> >> Bay Trail devices, so we should do it ourselves.
> >>
> >> This fixes wake from USB not working on various Bay Trail devices.
> >
> > This patch breaks suspend on ASUS E202SA (bisecting pointed me to this
> > patch, and if I revert it and build 4.20.5 without this patch,
> > everything works flawlessly).
>
> Thank you for the bug report, can you please test 4.20.5 with the attached
> patch on top? That should fix it. Once I've confirmation that this fixes
> things I will submit the patch upstream.

I've tested your patch against both 4.20.5 and 4.20.6, and it works
fine. Thank you for such a quick fix.

> Regards,
>
> Hans
>
>
> >
> > This command fails with the following message:
> >
> > # echo mem > /sys/power/state
> > Error while writing to stdout
> > write_loop: Device or resource busy
> >
> > And here is dmesg output:
> >
> > [  224.077275] PM: suspend entry (deep)
> > [  224.077286] PM: Syncing filesystems ... done.
> > [  225.495014] Freezing user space processes ... (elapsed 0.003 seconds) 
> > done.
> > [  225.498540] OOM killer disabled.
> > [  225.498543] Freezing remaining freezable tasks ... (elapsed 0.002
> > seconds) done.
> > [  225.500693] printk: Suspending console(s) (use no_console_suspend to 
> > debug)
> > [  225.502793] wlp1s0: deauthenticating from 00:03:7f:12:34:56 by
> > local choice (Reason: 3=DEAUTH_LEAVING)
> > [  225.535333] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> > [  225.535882] sd 0:0:0:0: [sda] Stopping disk
> > [  226.969070] ACPI: EC: interrupt blocked
> > [  227.002156] ACPI: Preparing to enter system sleep state S3
> > [  227.007890] ACPI: EC: event blocked
> > [  227.007895] ACPI: EC: EC stopped
> > [  227.007900] PM: Saving platform NVS memory
> > [  227.008264] Disabling non-boot CPUs ...
> > [  227.034114] smpboot: CPU 1 is now offline
> > [  227.088320] smpboot: CPU 2 is now offline
> > [  227.141513] smpboot: CPU 3 is now offline
> > [  227.147086] Enabling non-boot CPUs ...
> > [  227.147187] x86: Booting SMP configuration:
> > [  227.147190] smpboot: Booting Node 0 Processor 1 APIC 0x2
> > [  227.147916]  cache: parent cpu1 should not be sleeping
> > [  227.148354] CPU1 is up
> > [  227.148424] smpboot: Booting Node 0 Processor 2 APIC 0x4
> > [  227.149800]  cache: parent cpu2 should not be sleeping
> > [  227.151143] CPU2 is up
> > [  227.151187] smpboot: Booting Node 0 Processor 3 APIC 0x6
> > [  227.152399]  cache: parent cpu3 should not be sleeping
> > [  227.153883] CPU3 is up
> > [  227.154876] ACPI: EC: EC started
> > [  227.155282] ACPI: Waking up from system sleep state S3
> > [  227.159874] ACPI: button: The lid device is not compliant to SW_LID.
> > [  227.169441] ACPI: EC: interrupt unblocked
> > [  228.236790] ACPI: EC: event unblocked
> > [  228.241950] rtlwifi: rtlwifi: wireless switch is on
> > [  228.251865] sd 0:0:0:0: [sda] Starting disk
> > [  228.476637] usb 1-4: reset full-speed USB device number 2 using xhci_hcd
> > [  228.562879] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> > [  228.563924] ata2: SATA link down (SStatus 4 SControl 300)
> > [  228.564979] ata1.00: supports DRM functions and may not be fully 
> > accessible
> > [  228.565493] ata1.00: NCQ Send/Recv Log not supported
> > [  228.567649] ata1.00: supports DRM functions and may not be fully 
> > accessible
> > [  228.568252] ata1.00: NCQ Send/Recv Log not supported
> > [  228.570075] ata1.00: configured for UDMA/133
> > [  228.580412] ahci :00:13.0: port does not support device sleep
> > [  228.639349] Bluetooth: hci0: RTL: rtl: examining hci_ver=06
> > hci_rev=0e2f lmp_ver=06 lmp_subver=a041
> >
> > [  228.639368] Bluetooth: hci0: RTL: rtl: unknown IC info, lmp subver
> > a041, hci rev 0e2f, hci ver 0006
> > [  228.639742] acpi LNXPOWER:01: Turning OFF
> > [  228.640033] OOM killer enabled.
> > [  228.640040] Restarting tasks ... done.
> > [  228.795406] PM: suspend exit
> > [  228.800399] audit: type=1130 audit(1548962671.104:94): pid=1 uid=0
> > auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd"
> > exe="/usr/lib/systemd/s

Re: [PATCH] platform/x86: intel_int0002_vgpio: Implement irq_set_wake

2019-01-31 Thread Maxim Mikityanskiy
Hi,

On Mon, Sep 24, 2018 at 5:37 PM Hans de Goede  wrote:
>
> We were relying on the interrupt being shared with the ACPI SCI and the
> ACPI core calling irq_set_wake. But that does not always happen on
> Bay Trail devices, so we should do it ourselves.
>
> This fixes wake from USB not working on various Bay Trail devices.

This patch breaks suspend on ASUS E202SA (bisecting pointed me to this
patch, and if I revert it and build 4.20.5 without this patch,
everything works flawlessly).

This command fails with the following message:

# echo mem > /sys/power/state
Error while writing to stdout
write_loop: Device or resource busy

And here is dmesg output:

[  224.077275] PM: suspend entry (deep)
[  224.077286] PM: Syncing filesystems ... done.
[  225.495014] Freezing user space processes ... (elapsed 0.003 seconds) done.
[  225.498540] OOM killer disabled.
[  225.498543] Freezing remaining freezable tasks ... (elapsed 0.002
seconds) done.
[  225.500693] printk: Suspending console(s) (use no_console_suspend to debug)
[  225.502793] wlp1s0: deauthenticating from 00:03:7f:12:34:56 by
local choice (Reason: 3=DEAUTH_LEAVING)
[  225.535333] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  225.535882] sd 0:0:0:0: [sda] Stopping disk
[  226.969070] ACPI: EC: interrupt blocked
[  227.002156] ACPI: Preparing to enter system sleep state S3
[  227.007890] ACPI: EC: event blocked
[  227.007895] ACPI: EC: EC stopped
[  227.007900] PM: Saving platform NVS memory
[  227.008264] Disabling non-boot CPUs ...
[  227.034114] smpboot: CPU 1 is now offline
[  227.088320] smpboot: CPU 2 is now offline
[  227.141513] smpboot: CPU 3 is now offline
[  227.147086] Enabling non-boot CPUs ...
[  227.147187] x86: Booting SMP configuration:
[  227.147190] smpboot: Booting Node 0 Processor 1 APIC 0x2
[  227.147916]  cache: parent cpu1 should not be sleeping
[  227.148354] CPU1 is up
[  227.148424] smpboot: Booting Node 0 Processor 2 APIC 0x4
[  227.149800]  cache: parent cpu2 should not be sleeping
[  227.151143] CPU2 is up
[  227.151187] smpboot: Booting Node 0 Processor 3 APIC 0x6
[  227.152399]  cache: parent cpu3 should not be sleeping
[  227.153883] CPU3 is up
[  227.154876] ACPI: EC: EC started
[  227.155282] ACPI: Waking up from system sleep state S3
[  227.159874] ACPI: button: The lid device is not compliant to SW_LID.
[  227.169441] ACPI: EC: interrupt unblocked
[  228.236790] ACPI: EC: event unblocked
[  228.241950] rtlwifi: rtlwifi: wireless switch is on
[  228.251865] sd 0:0:0:0: [sda] Starting disk
[  228.476637] usb 1-4: reset full-speed USB device number 2 using xhci_hcd
[  228.562879] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  228.563924] ata2: SATA link down (SStatus 4 SControl 300)
[  228.564979] ata1.00: supports DRM functions and may not be fully accessible
[  228.565493] ata1.00: NCQ Send/Recv Log not supported
[  228.567649] ata1.00: supports DRM functions and may not be fully accessible
[  228.568252] ata1.00: NCQ Send/Recv Log not supported
[  228.570075] ata1.00: configured for UDMA/133
[  228.580412] ahci :00:13.0: port does not support device sleep
[  228.639349] Bluetooth: hci0: RTL: rtl: examining hci_ver=06
hci_rev=0e2f lmp_ver=06 lmp_subver=a041

[  228.639368] Bluetooth: hci0: RTL: rtl: unknown IC info, lmp subver
a041, hci rev 0e2f, hci ver 0006
[  228.639742] acpi LNXPOWER:01: Turning OFF
[  228.640033] OOM killer enabled.
[  228.640040] Restarting tasks ... done.
[  228.795406] PM: suspend exit
[  228.800399] audit: type=1130 audit(1548962671.104:94): pid=1 uid=0
auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd"
exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=?
res=success'
[  229.067206] wlp1s0: authenticate with 12:34:56:78:90:12
[  229.067823] wlp1s0: send auth to 12:34:56:78:90:12 (try 1/3)
[  229.070955] wlp1s0: authenticated
[  229.072395] wlp1s0: associate with 12:34:56:78:90:12 (try 1/3)
[  229.074505] wlp1s0: RX AssocResp from 12:34:56:78:90:12 (capab=0x11
status=0 aid=2)
[  229.074819] wlp1s0: associated
[  233.809200] audit: type=1131 audit(1548962676.106:95): pid=1 uid=0
auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd"
exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=?
res=success'

The laptop doesn't go to sleep, the screen turns off, and in a couple
of seconds it turns on.

Please take a look at this regression. Feel free to ask me for any
additional information you need.

Thanks,
Max

> Signed-off-by: Hans de Goede 
> ---
>  drivers/platform/x86/intel_int0002_vgpio.c | 16 
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/platform/x86/intel_int0002_vgpio.c 
> b/drivers/platform/x86/intel_int0002_vgpio.c
> index 987a3b03f225..33c3489f5bc1 100644
> --- a/drivers/platform/x86/intel_int0002_vgpio.c
> +++ b/drivers/platform/x86/intel_int0002_vgpio.c
> @@ -106,6 +106,21 @@ static void int0002_irq_mask(struct irq_data *data)
> outl(gpe_en_reg, GPE0A_EN_PORT);
>  }
>
> +static int