Re: [PATCH] net/mlx5: Fix a potential use after free in mlx5e_ktls_del_rx
On 2021-03-22 16:21, Lv Yunlong wrote: My static analyzer tool reported a potential uaf in mlx5e_ktls_del_rx. In this function, if the condition cancel_work_sync(>work) is true, and then priv_rx could be freed. But priv_rx is used later. I'm unfamiliar with how this function works. Maybe the maintainer forgot to add return after freeing priv_rx? Thanks for running a static analyzer over our code! Sadly, the fix is not correct and breaks stuff, and there is no problem with this code. First of all, mlx5e_ktls_priv_rx_put doesn't necessarily free priv_rx. It decrements the refcount and frees the object only when the refcount goes to zero. Unless there are other bugs, the refcount in this branch is not expected to go to zero, so there is no use-after-free in the code below. The corresponding elevation of the refcount happens before queue_work of resync->work. So, no, we haven't forgot to add a return, we just expect priv_rx to stay alive after this call, and we want to run the cleanup code below this `if`, while your fix skips the cleanup and skips the second mlx5e_ktls_priv_rx_put in the end of this function, leading to a memory leak. If you'd like to calm down the static analyzer, you could try to add a WARN_ON assertion to check that mlx5e_ktls_priv_rx_put returns false in that `if` (meaning that the object hasn't been freed). If would be nice to have this WARN_ON regardless of static analyzers. Fixes: b850bbff96512 ("net/mlx5e: kTLS, Use refcounts to free kTLS RX priv context") Signed-off-by: Lv Yunlong --- drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c index d06532d0baa4..54a77df42316 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c @@ -663,8 +663,10 @@ void mlx5e_ktls_del_rx(struct net_device *netdev, struct tls_context *tls_ctx) */ wait_for_completion(_rx->add_ctx); resync = _rx->resync; - if (cancel_work_sync(>work)) + if (cancel_work_sync(>work)) { mlx5e_ktls_priv_rx_put(priv_rx); + return; + } priv_rx->stats->tls_del++; if (priv_rx->rule.rule)
Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue
On 2021-03-10 19:03, Eric Dumazet wrote: On 3/10/21 3:54 PM, Maxim Mikityanskiy wrote: On 2021-03-09 17:20, Eric Dumazet wrote: On 3/9/21 4:13 PM, syzbot wrote: Hello, syzbot found the following issue on: HEAD commit: 38b5133a octeontx2-pf: Fix otx2_get_fecparam() git tree: net-next console output: https://syzkaller.appspot.com/x/log.txt?x=166288a8d0 kernel config: https://syzkaller.appspot.com/x/.config?x=dbc1ca9e55dc1f9f dashboard link: https://syzkaller.appspot.com/bug?extid=b53a709f04722ca12a3c syz repro: https://syzkaller.appspot.com/x/repro.syz?x=119454ccd0 The issue was bisected to: commit d03b195b5aa015f6c11988b86a3625f8d5dbac52 Author: Maxim Mikityanskiy Date: Tue Jan 19 12:08:13 2021 + sch_htb: Hierarchical QoS hardware offload bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13ab12ecd0 final oops: https://syzkaller.appspot.com/x/report.txt?x=106b12ecd0 console output: https://syzkaller.appspot.com/x/log.txt?x=17ab12ecd0 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+b53a709f04722ca12...@syzkaller.appspotmail.com Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") BUG: kernel NULL pointer dereference, address: #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page PGD 183fe067 P4D 183fe067 PUD 21aef067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 0 PID: 10125 Comm: syz-executor.0 Not tainted 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffd6. RSP: 0018:c9000a9c74e8 EFLAGS: 00010246 RAX: dc00 RBX: 192001538e9e RCX: RDX: c9000a9c7520 RSI: 0012 RDI: 88802d158000 RBP: 88802d158000 R08: fff1 R09: 0400 R10: 871631c4 R11: R12: 89ea6b40 R13: dc00 R14: 888012b79c00 R15: FS: 7f73f9698700() GS:8880b9c0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: ffd6 CR3: 173b5000 CR4: 001506f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: htb_offload net/sched/sch_htb.c:1011 [inline] htb_select_queue+0x17f/0x2c0 net/sched/sch_htb.c:1349 tc_modify_qdisc+0x44a/0x1a50 net/sched/sch_api.c:1657 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 sys_sendmsg+0x6e8/0x810 net/socket.c:2348 ___sys_sendmsg+0xf3/0x170 net/socket.c:2402 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x466019 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:7f73f9698188 EFLAGS: 0246 ORIG_RAX: 002e RAX: ffda RBX: 0056bf60 RCX: 00466019 RDX: RSI: 27c0 RDI: 0004 RBP: 004bd067 R08: R09: R10: R11: 0246 R12: 0056bf60 R13: 7fffefccc11f R14: 7f73f9698300 R15: 00022000 Modules linked in: CR2: ---[ end trace e1544e8206616773 ]--- RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffd6. RSP: 0018:c9000a9c74e8 EFLAGS: 00010246 RAX: dc00 RBX: 192001538e9e RCX: RDX: c9000a9c7520 RSI: 0012 RDI: 88802d158000 RBP: 88802d158000 R08: fff1 R09: 0400 R10: 871631c4 R11: R12: 89ea6b40 R13: dc00 R14: 888012b79c00 R15: FS: 7f73f9698700() GS:8880b9d0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: ffd6 CR3: 173b5000 CR4: 001506e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for ho
Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue
On 2021-03-09 17:20, Eric Dumazet wrote: On 3/9/21 4:13 PM, syzbot wrote: Hello, syzbot found the following issue on: HEAD commit:38b5133a octeontx2-pf: Fix otx2_get_fecparam() git tree: net-next console output: https://syzkaller.appspot.com/x/log.txt?x=166288a8d0 kernel config: https://syzkaller.appspot.com/x/.config?x=dbc1ca9e55dc1f9f dashboard link: https://syzkaller.appspot.com/bug?extid=b53a709f04722ca12a3c syz repro: https://syzkaller.appspot.com/x/repro.syz?x=119454ccd0 The issue was bisected to: commit d03b195b5aa015f6c11988b86a3625f8d5dbac52 Author: Maxim Mikityanskiy Date: Tue Jan 19 12:08:13 2021 + sch_htb: Hierarchical QoS hardware offload bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13ab12ecd0 final oops: https://syzkaller.appspot.com/x/report.txt?x=106b12ecd0 console output: https://syzkaller.appspot.com/x/log.txt?x=17ab12ecd0 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+b53a709f04722ca12...@syzkaller.appspotmail.com Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") BUG: kernel NULL pointer dereference, address: #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page PGD 183fe067 P4D 183fe067 PUD 21aef067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 0 PID: 10125 Comm: syz-executor.0 Not tainted 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffd6. RSP: 0018:c9000a9c74e8 EFLAGS: 00010246 RAX: dc00 RBX: 192001538e9e RCX: RDX: c9000a9c7520 RSI: 0012 RDI: 88802d158000 RBP: 88802d158000 R08: fff1 R09: 0400 R10: 871631c4 R11: R12: 89ea6b40 R13: dc00 R14: 888012b79c00 R15: FS: 7f73f9698700() GS:8880b9c0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: ffd6 CR3: 173b5000 CR4: 001506f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: htb_offload net/sched/sch_htb.c:1011 [inline] htb_select_queue+0x17f/0x2c0 net/sched/sch_htb.c:1349 tc_modify_qdisc+0x44a/0x1a50 net/sched/sch_api.c:1657 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 sys_sendmsg+0x6e8/0x810 net/socket.c:2348 ___sys_sendmsg+0xf3/0x170 net/socket.c:2402 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x466019 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:7f73f9698188 EFLAGS: 0246 ORIG_RAX: 002e RAX: ffda RBX: 0056bf60 RCX: 00466019 RDX: RSI: 27c0 RDI: 0004 RBP: 004bd067 R08: R09: R10: R11: 0246 R12: 0056bf60 R13: 7fffefccc11f R14: 7f73f9698300 R15: 00022000 Modules linked in: CR2: ---[ end trace e1544e8206616773 ]--- RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffd6. RSP: 0018:c9000a9c74e8 EFLAGS: 00010246 RAX: dc00 RBX: 192001538e9e RCX: RDX: c9000a9c7520 RSI: 0012 RDI: 88802d158000 RBP: 88802d158000 R08: fff1 R09: 0400 R10: 871631c4 R11: R12: 89ea6b40 R13: dc00 R14: 888012b79c00 R15: FS: 7f73f9698700() GS:8880b9d0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: ffd6 CR3: 173b5000 CR4: 001506e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. For information about bisection process see: https://goo.gl/tpsmEJ#bisection syzbot
[PATCH] HID: plantronics: Workaround for double volume key presses
Plantronics Blackwire 3220 Series (047f:c056) sends HID reports twice for each volume key press. This patch adds a quirk to hid-plantronics for this product ID, which will ignore the second volume key press if it happens within 5 ms from the last one that was handled. The patch was tested on the mentioned model only, it shouldn't affect other models, however, this quirk might be needed for them too. Auto-repeat (when a key is held pressed) is not affected, because the rate is about 3 times per second, which is far less frequent than once in 5 ms. Fixes: 81bb773faed7 ("HID: plantronics: Update to map volume up/down controls") Signed-off-by: Maxim Mikityanskiy --- People from Plantronics, maybe you could advise on a better fix than filtering duplicate events on driver level? Do you happen to know why they occur in the first place? Are any other headsets affected? drivers/hid/hid-ids.h | 1 + drivers/hid/hid-plantronics.c | 60 +-- include/linux/hid.h | 2 ++ 3 files changed, 61 insertions(+), 2 deletions(-) diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 5ba0aa1d2335..5ac52b4d900d 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -937,6 +937,7 @@ #define USB_DEVICE_ID_ORTEK_IHOME_IMAC_A210S 0x8003 #define USB_VENDOR_ID_PLANTRONICS 0x047f +#define USB_DEVICE_ID_PLANTRONICS_BLACKWIRE_3220_SERIES0xc056 #define USB_VENDOR_ID_PANASONIC0x04da #define USB_DEVICE_ID_PANABOARD_UBT780 0x1044 diff --git a/drivers/hid/hid-plantronics.c b/drivers/hid/hid-plantronics.c index 85b685efc12f..e81b7cec2d12 100644 --- a/drivers/hid/hid-plantronics.c +++ b/drivers/hid/hid-plantronics.c @@ -13,6 +13,7 @@ #include #include +#include #define PLT_HID_1_0_PAGE 0xffa0 #define PLT_HID_2_0_PAGE 0xffa2 @@ -36,6 +37,16 @@ #define PLT_ALLOW_CONSUMER (field->application == HID_CP_CONSUMERCONTROL && \ (usage->hid & HID_USAGE_PAGE) == HID_UP_CONSUMER) +#define PLT_QUIRK_DOUBLE_VOLUME_KEYS BIT(0) + +#define PLT_DOUBLE_KEY_TIMEOUT 5 /* ms */ + +struct plt_drv_data { + unsigned long device_type; + unsigned long last_volume_key_ts; + u32 quirks; +}; + static int plantronics_input_mapping(struct hid_device *hdev, struct hid_input *hi, struct hid_field *field, @@ -43,7 +54,8 @@ static int plantronics_input_mapping(struct hid_device *hdev, unsigned long **bit, int *max) { unsigned short mapped_key; - unsigned long plt_type = (unsigned long)hid_get_drvdata(hdev); + struct plt_drv_data *drv_data = hid_get_drvdata(hdev); + unsigned long plt_type = drv_data->device_type; /* special case for PTT products */ if (field->application == HID_GD_JOYSTICK) @@ -105,6 +117,30 @@ static int plantronics_input_mapping(struct hid_device *hdev, return 1; } +static int plantronics_event(struct hid_device *hdev, struct hid_field *field, +struct hid_usage *usage, __s32 value) +{ + struct plt_drv_data *drv_data = hid_get_drvdata(hdev); + + if (drv_data->quirks & PLT_QUIRK_DOUBLE_VOLUME_KEYS) { + unsigned long prev_ts, cur_ts; + + /* Usages are filtered in plantronics_usages. */ + + if (!value) /* Handle key presses only. */ + return 0; + + prev_ts = drv_data->last_volume_key_ts; + cur_ts = jiffies; + if (jiffies_to_msecs(cur_ts - prev_ts) <= PLT_DOUBLE_KEY_TIMEOUT) + return 1; /* Ignore the repeated key. */ + + drv_data->last_volume_key_ts = cur_ts; + } + + return 0; +} + static unsigned long plantronics_device_type(struct hid_device *hdev) { unsigned i, col_page; @@ -133,15 +169,24 @@ static unsigned long plantronics_device_type(struct hid_device *hdev) static int plantronics_probe(struct hid_device *hdev, const struct hid_device_id *id) { + struct plt_drv_data *drv_data; int ret; + drv_data = devm_kzalloc(>dev, sizeof(*drv_data), GFP_KERNEL); + if (!drv_data) + return -ENOMEM; + ret = hid_parse(hdev); if (ret) { hid_err(hdev, "parse failed\n"); goto err; } - hid_set_drvdata(hdev, (void *)plantronics_device_type(hdev)); + drv_data->device_type = plantronics_device_type(hdev); + drv_data->quirks = id->driver_data; + drv_data->last_volume_key_ts = jiffies - msecs_to_jiffies(PLT_DOUBLE_KEY_TIMEOUT); + + hid_set_drvdata(hdev, drv_data); ret = hid_hw_start(hdev, HID_CONNECT_DEFAULT | HID_CONNECT_HIDINPUT_FORCE | HID_CONNECT_HIDDEV_FORCE);
[PATCH] media: usbtv: Fix deadlock on suspend
usbtv doesn't support power management, so on system suspend the .disconnect callback of the driver is called. The teardown sequence includes a call to snd_card_free. Its implementation waits until the refcount of the sound card device drops to zero, however, if its file is open, snd_card_file_add takes a reference, which can't be dropped during the suspend, because the userspace processes are already frozen at this point. snd_card_free waits for completion forever, leading to a hang on suspend. This commit fixes this deadlock condition by replacing snd_card_free with snd_card_free_when_closed, that doesn't wait until all references are released, allowing suspend to progress. Fixes: 63ddf68de52e ("[media] usbtv: add audio support") Signed-off-by: Maxim Mikityanskiy --- drivers/media/usb/usbtv/usbtv-audio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/media/usb/usbtv/usbtv-audio.c b/drivers/media/usb/usbtv/usbtv-audio.c index b57e94fb1977..333bd305a4f9 100644 --- a/drivers/media/usb/usbtv/usbtv-audio.c +++ b/drivers/media/usb/usbtv/usbtv-audio.c @@ -371,7 +371,7 @@ void usbtv_audio_free(struct usbtv *usbtv) cancel_work_sync(>snd_trigger); if (usbtv->snd && usbtv->udev) { - snd_card_free(usbtv->snd); + snd_card_free_when_closed(usbtv->snd); usbtv->snd = NULL; } } -- 2.30.0
Re: [PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance
On Wed, Jan 27, 2021 at 6:23 AM Bart Van Assche wrote: > > On 1/26/21 11:59 AM, Maxim Mikityanskiy wrote: > > The cited commit introduced a serious regression with SATA write speed, > > as found by bisecting. This patch reverts this commit, which restores > > write speed back to the values observed before this commit. > > > > The performance tests were done on a Helios4 NAS (2nd batch) with 4 HDDs > > (WD8003FFBX) using dd (bs=1M count=2000). "Direct" is a test with a > > single HDD, the rest are different RAID levels built over the first > > partitions of 4 HDDs. Test results are in MB/s, R is read, W is write. > > > > | Direct | RAID0 | RAID10 f2 | RAID10 n2 | RAID6 > > ++---+---+---+ > > 9011495c9466| R:256 | R:313 | R:276 | R:313 | R:323 > > (before faulty) | W:254 | W:253 | W:195 | W:204 | W:117 > > ++---+---+---+ > > 5ff9f19231a0| R:257 | R:398 | R:312 | R:344 | R:391 > > (faulty commit) | W:154 | W:122 | W:67.7| W:66.6| W:67.2 > > ++---+---+---+ > > 5.10.10 | R:256 | R:401 | R:312 | R:356 | R:375 > > unpatched | W:149 | W:123 | W:64 | W:64.1| W:61.5 > > ++---+---+---+ > > 5.10.10 | R:255 | R:396 | R:312 | R:340 | R:393 > > patched | W:247 | W:274 | W:220 | W:225 | W:121 > > > > Applying this patch doesn't hurt read performance, while improves the > > write speed by 1.5x - 3.5x (more impact on RAID tests). The write speed > > is restored back to the state before the faulty commit, and even a bit > > higher in RAID tests (which aren't HDD-bound on this device) - that is > > likely related to other optimizations done between the faulty commit and > > 5.10.10 which also improved the read speed. > > > > Signed-off-by: Maxim Mikityanskiy > > Fixes: 5ff9f19231a0 ("block: simplify set_init_blocksize") > > Cc: Christoph Hellwig > > Cc: Jens Axboe > > --- > > fs/block_dev.c | 10 +- > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > diff --git a/fs/block_dev.c b/fs/block_dev.c > > index 3b8963e228a1..235b5042672e 100644 > > --- a/fs/block_dev.c > > +++ b/fs/block_dev.c > > @@ -130,7 +130,15 @@ EXPORT_SYMBOL(truncate_bdev_range); > > > > static void set_init_blocksize(struct block_device *bdev) > > { > > - bdev->bd_inode->i_blkbits = > > blksize_bits(bdev_logical_block_size(bdev)); > > + unsigned int bsize = bdev_logical_block_size(bdev); > > + loff_t size = i_size_read(bdev->bd_inode); > > + > > + while (bsize < PAGE_SIZE) { > > + if (size & bsize) > > + break; > > + bsize <<= 1; > > + } > > + bdev->bd_inode->i_blkbits = blksize_bits(bsize); > > } > > > > int set_blocksize(struct block_device *bdev, int size) > > How can this patch affect write speed? I haven't found any calls of > set_init_blocksize() in the I/O path. Did I perhaps overlook something? I don't know the exact mechanism how this change affects the speed, I'm not an expert in the block device subsystem (I'm a networking guy). This commit was found by git bisect, and my performance test confirmed that reverting it fixes the bug. It looks to me as this function sets the block size as part of control flow, and this size is used later in the fast path, and the commit that removed the loop decreased this block size. > Bart. > >
[PATCH] Revert "block: simplify set_init_blocksize" to regain lost performance
The cited commit introduced a serious regression with SATA write speed, as found by bisecting. This patch reverts this commit, which restores write speed back to the values observed before this commit. The performance tests were done on a Helios4 NAS (2nd batch) with 4 HDDs (WD8003FFBX) using dd (bs=1M count=2000). "Direct" is a test with a single HDD, the rest are different RAID levels built over the first partitions of 4 HDDs. Test results are in MB/s, R is read, W is write. | Direct | RAID0 | RAID10 f2 | RAID10 n2 | RAID6 ++---+---+---+ 9011495c9466| R:256 | R:313 | R:276 | R:313 | R:323 (before faulty) | W:254 | W:253 | W:195 | W:204 | W:117 ++---+---+---+ 5ff9f19231a0| R:257 | R:398 | R:312 | R:344 | R:391 (faulty commit) | W:154 | W:122 | W:67.7| W:66.6| W:67.2 ++---+---+---+ 5.10.10 | R:256 | R:401 | R:312 | R:356 | R:375 unpatched | W:149 | W:123 | W:64 | W:64.1| W:61.5 ++---+---+---+ 5.10.10 | R:255 | R:396 | R:312 | R:340 | R:393 patched | W:247 | W:274 | W:220 | W:225 | W:121 Applying this patch doesn't hurt read performance, while improves the write speed by 1.5x - 3.5x (more impact on RAID tests). The write speed is restored back to the state before the faulty commit, and even a bit higher in RAID tests (which aren't HDD-bound on this device) - that is likely related to other optimizations done between the faulty commit and 5.10.10 which also improved the read speed. Signed-off-by: Maxim Mikityanskiy Fixes: 5ff9f19231a0 ("block: simplify set_init_blocksize") Cc: Christoph Hellwig Cc: Jens Axboe --- fs/block_dev.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/fs/block_dev.c b/fs/block_dev.c index 3b8963e228a1..235b5042672e 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -130,7 +130,15 @@ EXPORT_SYMBOL(truncate_bdev_range); static void set_init_blocksize(struct block_device *bdev) { - bdev->bd_inode->i_blkbits = blksize_bits(bdev_logical_block_size(bdev)); + unsigned int bsize = bdev_logical_block_size(bdev); + loff_t size = i_size_read(bdev->bd_inode); + + while (bsize < PAGE_SIZE) { + if (size & bsize) + break; + bsize <<= 1; + } + bdev->bd_inode->i_blkbits = blksize_bits(bsize); } int set_blocksize(struct block_device *bdev, int size) -- 2.30.0
Re: [PATCH] [net-next] net/mlx5e: xsk: dynamically allocate mlx5e_channel_param
On 2019-07-08 18:16, Maxim Mikityanskiy wrote: > On 2019-07-08 15:55, Arnd Bergmann wrote: >> -mlx5e_build_xsk_cparam(priv, params, xsk, ); >> +cparam = kzalloc(sizeof(*cparam), GFP_KERNEL); > > Similar code in mlx5e_open_channels (en_main.c) uses kvzalloc. Although > the struct is currently smaller than a page anyway, and there should be > no difference in behavior now, I suggest using the same alloc function > to keep code uniform. > >> /* Create a dedicated SQ for posting NOPs whenever we need an IRQ to be >> * triggered and NAPI to be called on the correct CPU. >> */ >> -err = mlx5e_open_icosq(c, params, , >xskicosq); >> +err = mlx5e_open_icosq(c, params, >icosq, >xskicosq); >> if (unlikely(err)) >> goto err_close_icocq; >> > > Here is kfree missing. It's a memory leak in the good path. Arnd, I'm going to take over your patch and respin it, addressing my own comments, because it's been quite a while, and we want to have this fix. Thanks for spotting it.
Re: [PATCH] [net-next] net/mlx5e: xsk: dynamically allocate mlx5e_channel_param
On 2019-07-08 15:55, Arnd Bergmann wrote: > The structure is too large to put on the stack, resulting in a > warning on 32-bit ARM: > > drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c:59:5: error: stack > frame size of 1344 bytes in function >'mlx5e_open_xsk' [-Werror,-Wframe-larger-than=] > > Use kzalloc() instead. > > Fixes: a038e9794541 ("net/mlx5e: Add XSK zero-copy support") > Signed-off-by: Arnd Bergmann > --- > .../mellanox/mlx5/core/en/xsk/setup.c | 25 --- > 1 file changed, 16 insertions(+), 9 deletions(-) > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c > b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c > index aaffa6f68dc0..db9bbec68dbf 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c > @@ -60,24 +60,28 @@ int mlx5e_open_xsk(struct mlx5e_priv *priv, struct > mlx5e_params *params, > struct mlx5e_xsk_param *xsk, struct xdp_umem *umem, > struct mlx5e_channel *c) > { > - struct mlx5e_channel_param cparam = {}; > + struct mlx5e_channel_param *cparam; > struct dim_cq_moder icocq_moder = {}; > int err; > > if (!mlx5e_validate_xsk_param(params, xsk, priv->mdev)) > return -EINVAL; > > - mlx5e_build_xsk_cparam(priv, params, xsk, ); > + cparam = kzalloc(sizeof(*cparam), GFP_KERNEL); Similar code in mlx5e_open_channels (en_main.c) uses kvzalloc. Although the struct is currently smaller than a page anyway, and there should be no difference in behavior now, I suggest using the same alloc function to keep code uniform. > + if (!cparam) > + return -ENOMEM; > > - err = mlx5e_open_cq(c, params->rx_cq_moderation, _cq, > >xskrq.cq); > + mlx5e_build_xsk_cparam(priv, params, xsk, cparam); > + > + err = mlx5e_open_cq(c, params->rx_cq_moderation, >rx_cq, > >xskrq.cq); > if (unlikely(err)) > - return err; > + goto err_kfree_cparam; > > - err = mlx5e_open_rq(c, params, , xsk, umem, >xskrq); > + err = mlx5e_open_rq(c, params, >rq, xsk, umem, >xskrq); > if (unlikely(err)) > goto err_close_rx_cq; > > - err = mlx5e_open_cq(c, params->tx_cq_moderation, _cq, > >xsksq.cq); > + err = mlx5e_open_cq(c, params->tx_cq_moderation, >tx_cq, > >xsksq.cq); > if (unlikely(err)) > goto err_close_rq; > > @@ -87,18 +91,18 @@ int mlx5e_open_xsk(struct mlx5e_priv *priv, struct > mlx5e_params *params, >* is disabled and then reenabled, but the SQ continues receiving CQEs >* from the old UMEM. >*/ > - err = mlx5e_open_xdpsq(c, params, _sq, umem, >xsksq, > true); > + err = mlx5e_open_xdpsq(c, params, >xdp_sq, umem, >xsksq, > true); > if (unlikely(err)) > goto err_close_tx_cq; > > - err = mlx5e_open_cq(c, icocq_moder, _cq, >xskicosq.cq); > + err = mlx5e_open_cq(c, icocq_moder, >icosq_cq, >xskicosq.cq); > if (unlikely(err)) > goto err_close_sq; > > /* Create a dedicated SQ for posting NOPs whenever we need an IRQ to be >* triggered and NAPI to be called on the correct CPU. >*/ > - err = mlx5e_open_icosq(c, params, , >xskicosq); > + err = mlx5e_open_icosq(c, params, >icosq, >xskicosq); > if (unlikely(err)) > goto err_close_icocq; > Here is kfree missing. It's a memory leak in the good path. Thanks! > @@ -123,6 +127,9 @@ int mlx5e_open_xsk(struct mlx5e_priv *priv, struct > mlx5e_params *params, > err_close_rx_cq: > mlx5e_close_cq(>xskrq.cq); > > +err_kfree_cparam: > + kfree(cparam); > + > return err; > } > >
Re: [PATCH 1/1] Address regression in inet6_validate_link_af
On 2019-06-13 09:45, Jonas Bonn wrote: > Hi Max, > > On 12/06/2019 12:42, Maxim Mikityanskiy wrote: >> On 2019-06-11 13:03, Jonas Bonn wrote: >>> Patch 7dc2bccab0ee37ac28096b8fcdc390a679a15841 introduces a regression >>> with systemd 241. In that revision, systemd-networkd fails to pass the >>> required flags early enough. This appears to be addressed in later >>> versions of systemd, but for users of version 241 where systemd-networkd >>> nonetheless worked with earlier kernels, the strict check introduced by >>> the patch causes a regression in behaviour. >>> >>> This patch converts the failure to supply the required flags from an >>> error into a warning. >> The purpose of my patch was to prevent a partial configuration update on >> invalid input. -EINVAL was returned both before and after my patch, the >> difference is that before my patch there was a partial update and a >> warning. >> >> Your patch basically makes mine pointless, because you revert the fix, >> and now we'll have the same partial update and two warnings. > > Unfortunately, yes... So what you propose is a degradation. >> >> One more thing is that after applying your patch on top of mine, the >> kernel won't return -EINVAL anymore on invalid input. Returning -EINVAL >> is what happened before my patch, and also after my patch. > > Yes, you're right, it would probably be better revert the entire patch > because the checks in set_link_af have been dropped on the assumption > that validate_link_af catches the badness. We shouldn't introduce workarounds in the kernel for some temporary bugs in old userspace. A regression was introduced in systemd: it started sending invalid messages, didn't check the return code properly and relied on an undefined behavior. It was then fixed in systemd. If the kernel had all kinds of workarounds for all buggy software ever existed, it would be a complete mess. The software used the API in a wrong way, the expected behavior is failure, so it shouldn't expect anything else. For me, the trade-off between fixing the kernel behavior and supporting some old buggy software while keeping a UB in the kernel forever has an obvious resolution. >> >> Regarding the systemd issue, I don't think we should change the kernel >> to adapt to bugs in systemd. systemd didn't have this bug from day one, >> it was a regression introduced in [1]. The kernel has always returned >> -EINVAL here, but the behavior before my patch was basically a UB, and >> after the patch it's well-defined. If systemd saw EINVAL and relied on >> the UB that came with it, it can't be a reason enough to break the >> kernel. >> >> Moreover, the bug looks fixed in systemd's master, so what you suggest >> is to insert a kernel-side workaround for an old version of software >> when there is a fixed one. > > I agree, systemd is buggy here. Probably what happens is: > > i) systemd tries to set the link up > ii) it ends up doing a "partial" modification of the link state; > critically, though, enough to actually effect the link state changing to UP > iii) systemd sees the -EINVAL error and decides it probably failed to > bring up the link > iv) systemd then gets a notification that the link is up and runs a > DHCP client on it > > I haven't noticed any "partial modification" warnings in the kernel log > but I wasn't looking for them, either... > > With the new behaviour in 5.2, step ii) above results in no "partial > modification" so the link remains down and systemd is forever unable to > bring it up. > > Anyway, for the record, the error is: > > systemd: Could not bring up interface... (invalid parameter) > > And the solution is: Linux >= 5.2 requires systemd != v241. > > If nobody else notices, that's good enough for me. > >> >> Please correct me if anything I say is wrong. > > Nothing wrong, but it's still a regression. > > /Jonas > >> >> Thanks, >> Max >> >> [1]: >> https://github.com/systemd/systemd/commit/0e2fdb83bb5e22047e0c7cc058b415d0e93f02cf >> >> >> >>> With this, systemd-networkd version 241 once >>> again is able to bring up the link, albeit not quite as intended and >>> thereby with a warning in the kernel log. >>> >>> CC: Maxim Mikityanskiy >>> CC: David S. Miller >>> CC: Alexey Kuznetsov >>> CC: Hideaki YOSHIFUJI >>> Signed-off-by: Jonas Bonn >>> --- >>> net/ipv6/addrconf.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>
Re: [PATCH 1/1] Address regression in inet6_validate_link_af
On 2019-06-11 13:03, Jonas Bonn wrote: > Patch 7dc2bccab0ee37ac28096b8fcdc390a679a15841 introduces a regression > with systemd 241. In that revision, systemd-networkd fails to pass the > required flags early enough. This appears to be addressed in later > versions of systemd, but for users of version 241 where systemd-networkd > nonetheless worked with earlier kernels, the strict check introduced by > the patch causes a regression in behaviour. > > This patch converts the failure to supply the required flags from an > error into a warning. The purpose of my patch was to prevent a partial configuration update on invalid input. -EINVAL was returned both before and after my patch, the difference is that before my patch there was a partial update and a warning. Your patch basically makes mine pointless, because you revert the fix, and now we'll have the same partial update and two warnings. One more thing is that after applying your patch on top of mine, the kernel won't return -EINVAL anymore on invalid input. Returning -EINVAL is what happened before my patch, and also after my patch. Regarding the systemd issue, I don't think we should change the kernel to adapt to bugs in systemd. systemd didn't have this bug from day one, it was a regression introduced in [1]. The kernel has always returned -EINVAL here, but the behavior before my patch was basically a UB, and after the patch it's well-defined. If systemd saw EINVAL and relied on the UB that came with it, it can't be a reason enough to break the kernel. Moreover, the bug looks fixed in systemd's master, so what you suggest is to insert a kernel-side workaround for an old version of software when there is a fixed one. Please correct me if anything I say is wrong. Thanks, Max [1]: https://github.com/systemd/systemd/commit/0e2fdb83bb5e22047e0c7cc058b415d0e93f02cf > With this, systemd-networkd version 241 once > again is able to bring up the link, albeit not quite as intended and > thereby with a warning in the kernel log. > > CC: Maxim Mikityanskiy > CC: David S. Miller > CC: Alexey Kuznetsov > CC: Hideaki YOSHIFUJI > Signed-off-by: Jonas Bonn > --- > net/ipv6/addrconf.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c > index 081bb517e40d..e2477bf92e12 100644 > --- a/net/ipv6/addrconf.c > +++ b/net/ipv6/addrconf.c > @@ -5696,7 +5696,8 @@ static int inet6_validate_link_af(const struct > net_device *dev, > return err; > > if (!tb[IFLA_INET6_TOKEN] && !tb[IFLA_INET6_ADDR_GEN_MODE]) > - return -EINVAL; > + net_warn_ratelimited( > + "required link flag omitted: TOKEN/ADDR_GEN_MODE\n"); > > if (tb[IFLA_INET6_ADDR_GEN_MODE]) { > u8 mode = nla_get_u8(tb[IFLA_INET6_ADDR_GEN_MODE]); >
Re: [PATCH] platform/x86: intel_int0002_vgpio: Only implement irq_set_wake on Bay Trail
On Tue, Feb 5, 2019 at 8:20 PM Andy Shevchenko wrote: > > On Sun, Feb 3, 2019 at 11:42 AM Hans de Goede wrote: > > > > Commit c3b8e884defa ("platform/x86: intel_int0002_vgpio: Implement > > irq_set_wake"), was written to fix some wakeup issues on Bay Trail (BYT) > > devices. > > > > We've received a bug report that this causes a suspend regression on some > > Cherry Trail (CHT) based devices. > > > > To fix the issues this causes on CHT devices, this commit modifies the > > irq_set_wake support so that we only implement irq_set_wake on BYT devices, > > > > Pushed to my review and testing queue, thanks! Kernel 5.0 was released without this patch, is it still planned to merge it? > > Fixes: c3b8e884defa ("platform/x86: intel_int0002_vgpio: ... irq_set_wake") > > Reported-and-tested-by: Maxim Mikityanskiy > > Signed-off-by: Hans de Goede > > --- > > drivers/platform/x86/intel_int0002_vgpio.c | 32 ++ > > 1 file changed, 26 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/platform/x86/intel_int0002_vgpio.c > > b/drivers/platform/x86/intel_int0002_vgpio.c > > index 4b8f7305fc8a..78787934b572 100644 > > --- a/drivers/platform/x86/intel_int0002_vgpio.c > > +++ b/drivers/platform/x86/intel_int0002_vgpio.c > > @@ -51,11 +51,14 @@ > > #define GPE0A_STS_PORT 0x420 > > #define GPE0A_EN_PORT 0x428 > > > > -#define ICPU(model){ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, } > > +#define BAYTRAIL 0x01 > > +#define CHERRYTRAIL0x02 > > + > > +#define ICPU(model, data) { X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, > > data} > > > > static const struct x86_cpu_id int0002_cpu_ids[] = { > > - ICPU(INTEL_FAM6_ATOM_SILVERMONT), /* Valleyview, Bay Trail */ > > - ICPU(INTEL_FAM6_ATOM_AIRMONT), /* Braswell, Cherry Trail */ > > + ICPU(INTEL_FAM6_ATOM_SILVERMONT, BAYTRAIL), /* Valleyview, Bay > > Trail */ > > + ICPU(INTEL_FAM6_ATOM_AIRMONT, CHERRYTRAIL), /* Braswell, Cherry > > Trail */ > > {} > > }; > > > > @@ -135,7 +138,7 @@ static irqreturn_t int0002_irq(int irq, void *data) > > return IRQ_HANDLED; > > } > > > > -static struct irq_chip int0002_irqchip = { > > +static struct irq_chip int0002_byt_irqchip = { > > .name = DRV_NAME, > > .irq_ack= int0002_irq_ack, > > .irq_mask = int0002_irq_mask, > > @@ -143,10 +146,22 @@ static struct irq_chip int0002_irqchip = { > > .irq_set_wake = int0002_irq_set_wake, > > }; > > > > +static struct irq_chip int0002_cht_irqchip = { > > + .name = DRV_NAME, > > + .irq_ack= int0002_irq_ack, > > + .irq_mask = int0002_irq_mask, > > + .irq_unmask = int0002_irq_unmask, > > + /* > > +* No set_wake, on CHT the IRQ is typically shared with the ACPI SCI > > +* and we don't want to mess with the ACPI SCI irq settings. > > +*/ > > +}; > > + > > static int int0002_probe(struct platform_device *pdev) > > { > > struct device *dev = >dev; > > const struct x86_cpu_id *cpu_id; > > + struct irq_chip *irq_chip; > > struct gpio_chip *chip; > > int irq, ret; > > > > @@ -195,14 +210,19 @@ static int int0002_probe(struct platform_device *pdev) > > return ret; > > } > > > > - ret = gpiochip_irqchip_add(chip, _irqchip, 0, > > handle_edge_irq, > > + if (cpu_id->driver_data == BAYTRAIL) > > + irq_chip = _byt_irqchip; > > + else > > + irq_chip = _cht_irqchip; > > + > > + ret = gpiochip_irqchip_add(chip, irq_chip, 0, handle_edge_irq, > >IRQ_TYPE_NONE); > > if (ret) { > > dev_err(dev, "Error adding irqchip: %d\n", ret); > > return ret; > > } > > > > - gpiochip_set_chained_irqchip(chip, _irqchip, irq, NULL); > > + gpiochip_set_chained_irqchip(chip, irq_chip, irq, NULL); > > > > return 0; > > } > > -- > > 2.20.1 > > > > > -- > With Best Regards, > Andy Shevchenko
Re: [PATCH] platform/x86: intel_int0002_vgpio: Implement irq_set_wake
On Sat, Feb 2, 2019 at 12:52 AM Hans de Goede wrote: > > Hi, > > On 1/31/19 8:47 PM, Maxim Mikityanskiy wrote: > > Hi, > > > > On Mon, Sep 24, 2018 at 5:37 PM Hans de Goede wrote: > >> > >> We were relying on the interrupt being shared with the ACPI SCI and the > >> ACPI core calling irq_set_wake. But that does not always happen on > >> Bay Trail devices, so we should do it ourselves. > >> > >> This fixes wake from USB not working on various Bay Trail devices. > > > > This patch breaks suspend on ASUS E202SA (bisecting pointed me to this > > patch, and if I revert it and build 4.20.5 without this patch, > > everything works flawlessly). > > Thank you for the bug report, can you please test 4.20.5 with the attached > patch on top? That should fix it. Once I've confirmation that this fixes > things I will submit the patch upstream. I've tested your patch against both 4.20.5 and 4.20.6, and it works fine. Thank you for such a quick fix. > Regards, > > Hans > > > > > > This command fails with the following message: > > > > # echo mem > /sys/power/state > > Error while writing to stdout > > write_loop: Device or resource busy > > > > And here is dmesg output: > > > > [ 224.077275] PM: suspend entry (deep) > > [ 224.077286] PM: Syncing filesystems ... done. > > [ 225.495014] Freezing user space processes ... (elapsed 0.003 seconds) > > done. > > [ 225.498540] OOM killer disabled. > > [ 225.498543] Freezing remaining freezable tasks ... (elapsed 0.002 > > seconds) done. > > [ 225.500693] printk: Suspending console(s) (use no_console_suspend to > > debug) > > [ 225.502793] wlp1s0: deauthenticating from 00:03:7f:12:34:56 by > > local choice (Reason: 3=DEAUTH_LEAVING) > > [ 225.535333] sd 0:0:0:0: [sda] Synchronizing SCSI cache > > [ 225.535882] sd 0:0:0:0: [sda] Stopping disk > > [ 226.969070] ACPI: EC: interrupt blocked > > [ 227.002156] ACPI: Preparing to enter system sleep state S3 > > [ 227.007890] ACPI: EC: event blocked > > [ 227.007895] ACPI: EC: EC stopped > > [ 227.007900] PM: Saving platform NVS memory > > [ 227.008264] Disabling non-boot CPUs ... > > [ 227.034114] smpboot: CPU 1 is now offline > > [ 227.088320] smpboot: CPU 2 is now offline > > [ 227.141513] smpboot: CPU 3 is now offline > > [ 227.147086] Enabling non-boot CPUs ... > > [ 227.147187] x86: Booting SMP configuration: > > [ 227.147190] smpboot: Booting Node 0 Processor 1 APIC 0x2 > > [ 227.147916] cache: parent cpu1 should not be sleeping > > [ 227.148354] CPU1 is up > > [ 227.148424] smpboot: Booting Node 0 Processor 2 APIC 0x4 > > [ 227.149800] cache: parent cpu2 should not be sleeping > > [ 227.151143] CPU2 is up > > [ 227.151187] smpboot: Booting Node 0 Processor 3 APIC 0x6 > > [ 227.152399] cache: parent cpu3 should not be sleeping > > [ 227.153883] CPU3 is up > > [ 227.154876] ACPI: EC: EC started > > [ 227.155282] ACPI: Waking up from system sleep state S3 > > [ 227.159874] ACPI: button: The lid device is not compliant to SW_LID. > > [ 227.169441] ACPI: EC: interrupt unblocked > > [ 228.236790] ACPI: EC: event unblocked > > [ 228.241950] rtlwifi: rtlwifi: wireless switch is on > > [ 228.251865] sd 0:0:0:0: [sda] Starting disk > > [ 228.476637] usb 1-4: reset full-speed USB device number 2 using xhci_hcd > > [ 228.562879] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) > > [ 228.563924] ata2: SATA link down (SStatus 4 SControl 300) > > [ 228.564979] ata1.00: supports DRM functions and may not be fully > > accessible > > [ 228.565493] ata1.00: NCQ Send/Recv Log not supported > > [ 228.567649] ata1.00: supports DRM functions and may not be fully > > accessible > > [ 228.568252] ata1.00: NCQ Send/Recv Log not supported > > [ 228.570075] ata1.00: configured for UDMA/133 > > [ 228.580412] ahci :00:13.0: port does not support device sleep > > [ 228.639349] Bluetooth: hci0: RTL: rtl: examining hci_ver=06 > > hci_rev=0e2f lmp_ver=06 lmp_subver=a041 > > > > [ 228.639368] Bluetooth: hci0: RTL: rtl: unknown IC info, lmp subver > > a041, hci rev 0e2f, hci ver 0006 > > [ 228.639742] acpi LNXPOWER:01: Turning OFF > > [ 228.640033] OOM killer enabled. > > [ 228.640040] Restarting tasks ... done. > > [ 228.795406] PM: suspend exit > > [ 228.800399] audit: type=1130 audit(1548962671.104:94): pid=1 uid=0 > > auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" > > exe="/usr/lib/systemd/s
Re: [PATCH] platform/x86: intel_int0002_vgpio: Implement irq_set_wake
Hi, On Mon, Sep 24, 2018 at 5:37 PM Hans de Goede wrote: > > We were relying on the interrupt being shared with the ACPI SCI and the > ACPI core calling irq_set_wake. But that does not always happen on > Bay Trail devices, so we should do it ourselves. > > This fixes wake from USB not working on various Bay Trail devices. This patch breaks suspend on ASUS E202SA (bisecting pointed me to this patch, and if I revert it and build 4.20.5 without this patch, everything works flawlessly). This command fails with the following message: # echo mem > /sys/power/state Error while writing to stdout write_loop: Device or resource busy And here is dmesg output: [ 224.077275] PM: suspend entry (deep) [ 224.077286] PM: Syncing filesystems ... done. [ 225.495014] Freezing user space processes ... (elapsed 0.003 seconds) done. [ 225.498540] OOM killer disabled. [ 225.498543] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done. [ 225.500693] printk: Suspending console(s) (use no_console_suspend to debug) [ 225.502793] wlp1s0: deauthenticating from 00:03:7f:12:34:56 by local choice (Reason: 3=DEAUTH_LEAVING) [ 225.535333] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 225.535882] sd 0:0:0:0: [sda] Stopping disk [ 226.969070] ACPI: EC: interrupt blocked [ 227.002156] ACPI: Preparing to enter system sleep state S3 [ 227.007890] ACPI: EC: event blocked [ 227.007895] ACPI: EC: EC stopped [ 227.007900] PM: Saving platform NVS memory [ 227.008264] Disabling non-boot CPUs ... [ 227.034114] smpboot: CPU 1 is now offline [ 227.088320] smpboot: CPU 2 is now offline [ 227.141513] smpboot: CPU 3 is now offline [ 227.147086] Enabling non-boot CPUs ... [ 227.147187] x86: Booting SMP configuration: [ 227.147190] smpboot: Booting Node 0 Processor 1 APIC 0x2 [ 227.147916] cache: parent cpu1 should not be sleeping [ 227.148354] CPU1 is up [ 227.148424] smpboot: Booting Node 0 Processor 2 APIC 0x4 [ 227.149800] cache: parent cpu2 should not be sleeping [ 227.151143] CPU2 is up [ 227.151187] smpboot: Booting Node 0 Processor 3 APIC 0x6 [ 227.152399] cache: parent cpu3 should not be sleeping [ 227.153883] CPU3 is up [ 227.154876] ACPI: EC: EC started [ 227.155282] ACPI: Waking up from system sleep state S3 [ 227.159874] ACPI: button: The lid device is not compliant to SW_LID. [ 227.169441] ACPI: EC: interrupt unblocked [ 228.236790] ACPI: EC: event unblocked [ 228.241950] rtlwifi: rtlwifi: wireless switch is on [ 228.251865] sd 0:0:0:0: [sda] Starting disk [ 228.476637] usb 1-4: reset full-speed USB device number 2 using xhci_hcd [ 228.562879] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 228.563924] ata2: SATA link down (SStatus 4 SControl 300) [ 228.564979] ata1.00: supports DRM functions and may not be fully accessible [ 228.565493] ata1.00: NCQ Send/Recv Log not supported [ 228.567649] ata1.00: supports DRM functions and may not be fully accessible [ 228.568252] ata1.00: NCQ Send/Recv Log not supported [ 228.570075] ata1.00: configured for UDMA/133 [ 228.580412] ahci :00:13.0: port does not support device sleep [ 228.639349] Bluetooth: hci0: RTL: rtl: examining hci_ver=06 hci_rev=0e2f lmp_ver=06 lmp_subver=a041 [ 228.639368] Bluetooth: hci0: RTL: rtl: unknown IC info, lmp subver a041, hci rev 0e2f, hci ver 0006 [ 228.639742] acpi LNXPOWER:01: Turning OFF [ 228.640033] OOM killer enabled. [ 228.640040] Restarting tasks ... done. [ 228.795406] PM: suspend exit [ 228.800399] audit: type=1130 audit(1548962671.104:94): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 229.067206] wlp1s0: authenticate with 12:34:56:78:90:12 [ 229.067823] wlp1s0: send auth to 12:34:56:78:90:12 (try 1/3) [ 229.070955] wlp1s0: authenticated [ 229.072395] wlp1s0: associate with 12:34:56:78:90:12 (try 1/3) [ 229.074505] wlp1s0: RX AssocResp from 12:34:56:78:90:12 (capab=0x11 status=0 aid=2) [ 229.074819] wlp1s0: associated [ 233.809200] audit: type=1131 audit(1548962676.106:95): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' The laptop doesn't go to sleep, the screen turns off, and in a couple of seconds it turns on. Please take a look at this regression. Feel free to ask me for any additional information you need. Thanks, Max > Signed-off-by: Hans de Goede > --- > drivers/platform/x86/intel_int0002_vgpio.c | 16 > 1 file changed, 16 insertions(+) > > diff --git a/drivers/platform/x86/intel_int0002_vgpio.c > b/drivers/platform/x86/intel_int0002_vgpio.c > index 987a3b03f225..33c3489f5bc1 100644 > --- a/drivers/platform/x86/intel_int0002_vgpio.c > +++ b/drivers/platform/x86/intel_int0002_vgpio.c > @@ -106,6 +106,21 @@ static void int0002_irq_mask(struct irq_data *data) > outl(gpe_en_reg, GPE0A_EN_PORT); > } > > +static int