Re: [PATCH net-next 0/4] Adds support of RSS to HNS3 Driver for Rev 2(=0x21) H/W
From: Salil Mehta Date: Wed, 10 Oct 2018 20:05:33 +0100 > This patch-set mainly adds new additions related to RSS for the new > hardware Revision 0x21. It also adds support to use RSS hash value > provided by the hardware along with descriptor. Series applied.
Re: [PATCH 2/2] target: Fix target_wait_for_sess_cmds breakage with active signals
Hello MNC & Co, On Wed, 2018-10-10 at 11:58 -0500, Mike Christie wrote: > On 10/09/2018 10:23 PM, Nicholas A. Bellinger wrote: > > From: Nicholas Bellinger > > > > With the addition of commit 00d909a107 in v4.19-rc, it incorrectly assumes > > no > > signals will be pending for task_struct executing the normal session > > shutdown > > and I/O quiesce code-path. > > > > diff --git a/drivers/target/target_core_transport.c > > b/drivers/target/target_core_transport.c > > index 86c0156..fc3093d2 100644 > > --- a/drivers/target/target_core_transport.c > > +++ b/drivers/target/target_core_transport.c > > @@ -2754,7 +2754,7 @@ static void target_release_cmd_kref(struct kref *kref) > > if (se_sess) { > > spin_lock_irqsave(&se_sess->sess_cmd_lock, flags); > > list_del_init(&se_cmd->se_cmd_list); > > - if (list_empty(&se_sess->sess_cmd_list)) > > + if (se_sess->sess_tearing_down && > > list_empty(&se_sess->sess_cmd_list)) > > I think there is another issue with 00d909a107 and ibmvscsi_tgt. > > The problem is that ibmvscsi_tgt never called > target_sess_cmd_list_set_waiting. It only called > target_wait_for_sess_cmds. So before 00d909a107 there was a bug in that > driver and target_wait_for_sess_cmds never did what was intended because > sess_wait_list would always be empty. > > With 00d909a107, we no longer need to call > target_sess_cmd_list_set_waiting to wait for outstanding commands, so > ibmvscsi_tgt will now wait for commands like we wanted. However, the > commit added a WARN_ON that is hit if target_sess_cmd_list_set_waiting > is not called, so we could hit that. > > So I think we need to add a target_sess_cmd_list_set_waiting call in > ibmvscsi_tgt to go along with your patch chunk above and make sure we do > not trigger the WARN_ON. > Nice catch.
:) With target_wait_for_sess_cmds() usage pre 00d909a107 relying on the list-splice done in target_sess_cmd_list_set_waiting(), this particular usage in ibmvscsi_tgt has always seen list_empty(&se_sess->sess_wait_list) == true (i.e. no se_cmd I/O is quiesced, because no se_cmd is in sess_wait_list) since commit 712db3eb in 4.9.y code. That said, ibmvscsi_tgt usage is very similar to vhost/scsi in that the individual /sys/kernel/config/target/$FABRIC/$WWN/$TPGT/ endpoints used by VMs do not remove their I_T nexus while the VM is active. So AFAICT, ibmvscsi_tgt doesn't strictly need target_wait_for_sess_cmds() at all if this is true.
Re: [PATCH -next] phy: phy-ocelot-serdes: fix return value check in serdes_probe()
From: Wei Yongjun Date: Wed, 10 Oct 2018 02:00:24 + > In case of error, the function syscon_node_to_regmap() returns ERR_PTR() > and never returns NULL. The NULL test in the return value check should > be replaced with IS_ERR(). > > Fixes: 51f6b410fc22 ("phy: add driver for Microsemi Ocelot SerDes muxing") > Signed-off-by: Wei Yongjun Applied.
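The difference between a NULL test and an IS_ERR() test can be sketched in userspace C. This is only an illustration under stated assumptions: the ERR_PTR/PTR_ERR/IS_ERR helpers below imitate the kernel's error-pointer convention, and fake_node_to_regmap() is a hypothetical stand-in for syscon_node_to_regmap(), not the real function or the serdes_probe() code.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace imitation of the kernel's error-pointer helpers (assumption:
 * errno values are encoded in the top 4095 addresses, as in the kernel). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for a lookup that reports failure with ERR_PTR()
 * and never returns NULL. */
static int dummy_regmap;
static void *fake_node_to_regmap(int node_ok)
{
	return node_ok ? (void *)&dummy_regmap : ERR_PTR(-ENODEV);
}

/* Buggy pattern: the NULL test can never fire, so the error is missed. */
static int probe_null_check(int node_ok)
{
	void *map = fake_node_to_regmap(node_ok);

	if (!map)
		return -ENODEV;
	return 0;
}

/* Fixed pattern: test with IS_ERR() and propagate the encoded errno. */
static int probe_is_err_check(int node_ok)
{
	void *map = fake_node_to_regmap(node_ok);

	if (IS_ERR(map))
		return (int)PTR_ERR(map);
	return 0;
}
```

With a failing lookup, probe_null_check() reports success while probe_is_err_check() returns -ENODEV, which is the class of bug the patch addresses.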
[PATCH] platform/x86: thinkpad_acpi: Change the keymap for Favorites hotkey
The keycode KEY_FAVORITES(0x16c) used in thinkpad_acpi driver is too big (out of range > 255) for xorg to handle. xkeyboard-config has already mapped KEY_BOOKMARKS(156) to XF86Favorites: keycodes/evdev: <I164> = 164; // #define KEY_BOOKMARKS 156 symbols/inet: key <I164> { [ XF86Favorites ] }; So change the keymap to KEY_BOOKMARKS for Favorites hotkey. Signed-off-by: Zhang Xianwei --- drivers/platform/x86/thinkpad_acpi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c index fde08a9..a86cf47 100644 --- a/drivers/platform/x86/thinkpad_acpi.c +++ b/drivers/platform/x86/thinkpad_acpi.c @@ -3457,7 +3457,7 @@ static int __init hotkey_init(struct ibm_init_struct *iibm) KEY_UNKNOWN, KEY_UNKNOWN, KEY_UNKNOWN, KEY_UNKNOWN, KEY_UNKNOWN, - KEY_FAVORITES, /* Favorite app, 0x311 */ + KEY_BOOKMARKS, /* Favorite app, 0x311 */ KEY_RESERVED,/* Clipping tool */ KEY_CALC,/* Calculator (above numpad, P52) */ KEY_BLUETOOTH, /* Bluetooth */ -- 2.9.5
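The range argument in the patch description can be checked with a few lines of C. This is a sketch, assuming the usual evdev-to-X11 offset of 8 (consistent with the quoted xkeyboard-config line mapping 156 to 164) and the 255 limit stated above; the helper names are hypothetical and not part of thinkpad_acpi.

```c
#include <stdbool.h>

/* Key codes as given in the patch description. */
#define KEY_BOOKMARKS 156
#define KEY_FAVORITES 0x16c /* 364 decimal */

/* Assumptions: X11 keycode = evdev code + 8, and X11 keycodes end at 255. */
#define X11_KEYCODE_OFFSET 8
#define X11_KEYCODE_MAX 255

/* Hypothetical helper: translate an evdev key code to an X11 keycode. */
static int evdev_to_x11_keycode(int evdev_code)
{
	return evdev_code + X11_KEYCODE_OFFSET;
}

/* Hypothetical helper: true if the translated keycode fits the X11 range. */
static bool x11_can_represent(int evdev_code)
{
	return evdev_to_x11_keycode(evdev_code) <= X11_KEYCODE_MAX;
}
```

KEY_BOOKMARKS translates to 164 and fits, while KEY_FAVORITES translates to 372 and does not, which is why the hotkey is remapped.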
Re: [PATCH v8 0/3] x86/boot/KASLR: Parse ACPI table and limit kaslr in immovable memory
On Thu, Oct 11, 2018 at 08:29:55AM +0800, Baoquan He wrote: >On 10/10/18 at 03:44pm, Masayoshi Mizuma wrote: >> On Wed, Oct 10, 2018 at 05:30:57PM +0800, Baoquan He wrote: >> > On 10/10/18 at 11:19am, Borislav Petkov wrote: >> > > On Wed, Oct 10, 2018 at 11:14:26AM +0200, Thomas Gleixner wrote: >> > > > Yes, it's different, but if the SRAT information is available early, >> > > > then >> > > > the command line parameter can go away because then the required >> > > > information for Masa's problem is available as well. >> > > >> > > Exactly. And I'd prefer we delayed the command line parameter until we >> > > figure out we really need it and not expose it to upstream and then >> > > remove it shortly after. >> > > >> > > So I'd suggest we move Masa's patches to a separate branch and not send >> > > it up this round. >> > >> > Yes, sounds more reasonable if we can reuse functions in Chao's patch 1/3 >> > to solve the padding issue. >> >> Thanks for your comments! Yes, immovable_mem[num_immovable_mem] in Chao's >> patch may be useful for calculating the padding size. If so, we don't >> need the new kernel parameter. It's nice! >> >> Do you happen to have ideas how we access immovable_mem[num_immovable_mem] >> from arch/x86/mm/kaslr.c ? It is located to arch/x86/boot/compressed/*, so >> I suppose it is not easy to access it... >> I would appreciate if you could give some advice. > >Hmm, they are living in different life cycle and space. So we can only >reuse the code from Chao's patch, but not the variable. Means we need >go through the SRAT accessing again in arch/x86/mm/kaslr.c and fill >immovable_mem[num_immovable_mem] for mm/kaslr.c usage, if we decide to >do like that. Reading it three times is redundant, but reading it twice is needed: the regular ACPI code is very stable, but we need this information before it runs.
As for Masa's issue, I am wondering whether we can transfer the information, or only the address of the SRAT table, from the compressed-boot stage to the stage after start_kernel(). Thanks, Chao Fan > >Thanks >Baoquan > >
KASAN: use-after-free Read in __llc_lookup_established
Hello, syzbot found the following crash on: HEAD commit:3d647e62686f Merge tag 's390-4.19-4' of git://git.kernel.o.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1707d80940 kernel config: https://syzkaller.appspot.com/x/.config?x=88e9a8a39dc0be2d dashboard link: https://syzkaller.appspot.com/bug?extid=11e05f04c15e03be5254 compiler: gcc (GCC) 8.0.1 20180413 (experimental) Unfortunately, I don't have any reproducer for this crash yet. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+11e05f04c15e03be5...@syzkaller.appspotmail.com == BUG: KASAN: use-after-free in llc_estab_match net/llc/llc_conn.c:494 [inline] BUG: KASAN: use-after-free in __llc_lookup_established+0xc80/0xe10 net/llc/llc_conn.c:522 Read of size 1 at addr 8801c5794a7f by task syz-executor3/10277 CPU: 0 PID: 10277 Comm: syz-executor3 Not tainted 4.19.0-rc7+ #55 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113 print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412 net_ratelimit: 9 callbacks suppressed openvswitch: netlink: Key type 12288 is out of range max 29 __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:430 llc_estab_match net/llc/llc_conn.c:494 [inline] __llc_lookup_established+0xc80/0xe10 net/llc/llc_conn.c:522 openvswitch: netlink: Key type 12288 is out of range max 29 llc_lookup_established+0x36/0x60 net/llc/llc_conn.c:554 llc_ui_bind+0x810/0xdd0 net/llc/af_llc.c:381 __sys_bind+0x331/0x440 net/socket.c:1483 __do_sys_bind net/socket.c:1494 [inline] __se_sys_bind net/socket.c:1492 [inline] __x64_sys_bind+0x73/0xb0 net/socket.c:1492 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457579 Code: 1d b4 fb ff c3 66 
2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:7f2a18100c78 EFLAGS: 0246 ORIG_RAX: 0031 RAX: ffda RBX: 0003 RCX: 00457579 RDX: 0010 RSI: 2040 RDI: 0006 RBP: 0072bf00 R08: R09: R10: R11: 0246 R12: 7f2a181016d4 R13: 004bd718 R14: 004cbfe0 R15: Allocated by task 10278: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 __do_kmalloc mm/slab.c:3718 [inline] __kmalloc+0x14e/0x760 mm/slab.c:3727 kmalloc include/linux/slab.h:518 [inline] sk_prot_alloc+0x1b0/0x2e0 net/core/sock.c:1468 sk_alloc+0x10d/0x1690 net/core/sock.c:1522 llc_sk_alloc+0x35/0x4b0 net/llc/llc_conn.c:949 llc_ui_create+0x142/0x520 net/llc/af_llc.c:173 __sock_create+0x536/0x930 net/socket.c:1277 sock_create net/socket.c:1317 [inline] __sys_socket+0x106/0x260 net/socket.c:1347 __do_sys_socket net/socket.c:1356 [inline] __se_sys_socket net/socket.c:1354 [inline] __x64_sys_socket+0x73/0xb0 net/socket.c:1354 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 10276: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kfree+0xcf/0x230 mm/slab.c:3813 sk_prot_free net/core/sock.c:1505 [inline] __sk_destruct+0x797/0xa80 net/core/sock.c:1587 sk_destruct+0x78/0x90 net/core/sock.c:1595 __sk_free+0xcf/0x300 net/core/sock.c:1606 sk_free+0x42/0x50 net/core/sock.c:1617 sock_put include/net/sock.h:1691 [inline] llc_sk_free+0x9d/0xb0 net/llc/llc_conn.c:1017 llc_ui_release+0x161/0x2a0 net/llc/af_llc.c:218 __sock_release+0xd7/0x250 net/socket.c:579 sock_close+0x19/0x20 net/socket.c:1141 __fput+0x385/0xa30 fs/file_table.c:278 fput+0x15/0x20 fs/file_table.c:309 task_work_run+0x1e8/0x2a0 
kernel/task_work.c:113 tracehook_notify_resume include/linux/tracehook.h:193 [inline] exit_to_usermode_loop+0x318/0x380 arch/x86/entry/common.c:166 prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline] syscall_return_slowpath arch/x86/entry/common.c:268 [inline] do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at 8801c5794600 which belongs to the cache kmalloc-2048 of size 2048 The buggy address is located 1151 bytes inside of 2048-byte region [8801c5794600, 8801c5794e00) The buggy address belongs to the page:
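The "1151 bytes inside of 2048-byte region" line in the report is plain pointer arithmetic and can be reproduced as a sanity check. This is a sketch using the addresses exactly as they appear in this archived copy (the leading hex digits look truncated, which does not affect the difference); the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Addresses copied from the report as archived. */
#define REPORT_ACCESS_ADDR 0x8801c5794a7fULL /* "Read of size 1 at addr" */
#define REPORT_OBJECT_START 0x8801c5794600ULL /* start of the slab object */
#define REPORT_OBJECT_END 0x8801c5794e00ULL /* end of region (exclusive) */

/* Offset of an accessed address from the start of its slab object. */
static uint64_t offset_in_object(uint64_t addr, uint64_t obj_start)
{
	return addr - obj_start;
}
```

The offset 0x47f is 1151 decimal and the region spans 0x800 = 2048 bytes, matching the kmalloc-2048 cache named in the report.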
Re: [PATCH] mfd: remove redundant 'default n' from Kconfig
On Wed, 10 Oct 2018, Bartlomiej Zolnierkiewicz wrote: > 'default n' is the default value for any bool or tristate Kconfig > setting so there is no need to write it explicitly. > > Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO > is not set' for visible symbols") the Kconfig behavior is the same > regardless of 'default n' being present or not: > > ... > One side effect of (and the main motivation for) this change is making > the following two definitions behave exactly the same: > > config FOO > bool > > config FOO > bool > default n > > With this change, neither of these will generate a > '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied). > That might make it clearer to people that a bare 'default n' is > redundant. > ... > > Signed-off-by: Bartlomiej Zolnierkiewicz > --- > drivers/mfd/Kconfig |6 -- > 1 file changed, 6 deletions(-) The change looks okay to me, but I would also like you to include the Maintainers/Reviewers for the affected source files. Also, I assume you are not just submitting these changes to the MFD subsystem. My suggestion is to change one subsystem per patch (as you have done here), and submit them in one patch-set with each of the subsystem Maintainers included, so each of us has some visibility into how the general idea is being received. > Index: b/drivers/mfd/Kconfig > === > --- a/drivers/mfd/Kconfig 2018-10-09 15:58:40.547122978 +0200 > +++ b/drivers/mfd/Kconfig 2018-10-10 16:49:37.575915230 +0200 > @@ -8,7 +8,6 @@ menu "Multifunction device drivers" > config MFD_CORE > tristate > select IRQ_DOMAIN > - default n > > config MFD_CS5535 > tristate "AMD CS5535 and CS5536 southbridge core functions" > @@ -870,7 +869,6 @@ config MFD_VIPERBOARD > tristate "Nano River Technologies Viperboard" > select MFD_CORE > depends on USB > - default n > help > Say yes here if you want support for Nano River Technologies > Viperboard.
> @@ -1575,7 +1573,6 @@ config MFD_TWL4030_AUDIO > bool "TI TWL4030 Audio" > depends on TWL4030_CORE > select MFD_CORE > - default n > > config TWL6040_CORE > bool "TI TWL6040 audio codec" > @@ -1583,7 +1580,6 @@ config TWL6040_CORE > select MFD_CORE > select REGMAP_I2C > select REGMAP_IRQ > - default n > help > Say yes here if you want support for Texas Instruments TWL6040 audio > codec. > @@ -1605,7 +1601,6 @@ config MFD_WL1273_CORE > tristate "TI WL1273 FM radio" > depends on I2C > select MFD_CORE > - default n > help > This is the core driver for the TI WL1273 FM radio. This MFD > driver connects the radio-wl1273 V4L2 module and the wl1273 > @@ -1649,7 +1644,6 @@ config MFD_TC3589X > > config MFD_TMIO > bool > - default n > > config MFD_T7L66XB > bool "Toshiba T7L66XB" -- Lee Jones [李琼斯] Linaro Services Technical Lead Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog
Re: [PATCH net-next v7 28/28] net: WireGuard secure network tunnel
Wed, Oct 10, 2018 at 10:27:46PM CEST, ja...@zx2c4.com wrote: >Hey Jiri, > >Actually, in the end I went with the suggestion from Andrew and Lukas, >which is to follow Dan's guideline: >https://lkml.org/lkml/2016/8/22/374 . It looks like this: > >https://git.kernel.org/pub/scm/linux/kernel/git/zx2c4/linux.git/tree/drivers/net/wireguard/device.c?h=jd/wireguard#n280 I prefer: err = do_something(); if (err) goto err_do_something; But your style is also quite common. Up to you, I guess. > >Jason
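The label style Jiri describes, where each error label is named after the step that failed, can be sketched as follows. This is a minimal userspace illustration, not the WireGuard code: struct ctx, alloc_queue(), alloc_table() and the fail flags are hypothetical names introduced only to make the unwinding order visible.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical context with two resources acquired in order. */
struct ctx {
	int *queue;
	int *table;
};

/* Hypothetical setup steps; 'fail' forces an error for demonstration. */
static int alloc_queue(struct ctx *c, int fail)
{
	if (fail)
		return -ENOMEM;
	c->queue = calloc(16, sizeof(*c->queue));
	return c->queue ? 0 : -ENOMEM;
}

static int alloc_table(struct ctx *c, int fail)
{
	if (fail)
		return -ENOMEM;
	c->table = calloc(16, sizeof(*c->table));
	return c->table ? 0 : -ENOMEM;
}

/* Each label is named after the step that failed; the code under a label
 * undoes everything that succeeded before that step. */
static int ctx_setup(struct ctx *c, int fail_queue, int fail_table)
{
	int err;

	c->queue = NULL;
	c->table = NULL;

	err = alloc_queue(c, fail_queue);
	if (err)
		goto err_alloc_queue;

	err = alloc_table(c, fail_table);
	if (err)
		goto err_alloc_table;

	return 0;

err_alloc_table:
	free(c->queue);
	c->queue = NULL;
err_alloc_queue:
	return err;
}

/* Release both resources after a successful setup. */
static void ctx_teardown(struct ctx *c)
{
	free(c->table);
	free(c->queue);
	c->table = NULL;
	c->queue = NULL;
}
```

The alternative convention names each label after the cleanup it performs rather than the step that failed; the control flow is the same either way, so the choice is mostly about readability.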
Re: [PATCH 2/2] target: Fix target_wait_for_sess_cmds breakage with active signals
Hey Peter & Co, On Wed, 2018-10-10 at 10:43 +0200, Peter Zijlstra wrote: > On Wed, Oct 10, 2018 at 03:23:10AM +, Nicholas A. Bellinger wrote: > > From: Nicholas Bellinger > > > > With the addition of commit 00d909a107 in v4.19-rc, it incorrectly assumes > > no > > signals will be pending for task_struct executing the normal session > > shutdown > > and I/O quiesce code-path. > > > > For example, iscsi-target and iser-target issue SIGINT to all kthreads as > > part of session shutdown. This has been the behaviour since day one. > > Not knowing much context here; but does it make sense for those > kthreads to handle signals, ever? Most kthreads should be fine with > ignore_signals(). > iscsi-target + ib-isert uses SIGINT amongst dedicated rx/tx connection kthreads to signal connection shutdown, requiring in-flight se_cmd I/O descriptors to be quiesced before making forward progress to release se_session. By the point wait_event_lock_irq_timeout() is called in the example here, one of the two rx/tx connection kthreads has been stopped, and the other kthread is still processing shutdown. So while historically the pending SIGINTs were not cleared (or ignored) during shutdown at this point, there is no reason why they could not be ignored for iscsi-target + ib-isert. That said, pre commit 00d909a107 code always used wait_for_completion() and ignored pending signals. As-is target_wait_for_sess_cmds() is called directly from fabric driver code and in one case also from user-space via configfs_write_file(), so AFAICT it does need TASK_UNINTERRUPTIBLE.
Re: [GIT PULL] xfs: fixes for v4.19-rc7
On Thu, Oct 11, 2018 at 10:55:34AM +1100, Dave Chinner wrote: > Hi Greg, > > Can you please pull the XFS update from the tag listed below? This > contains the fix for the clone_file_range data corruption issue I > mentioned in my -rc6 pull request (zero post-eof blocks), as well as > fixes for several other equally serious problems we found while > auditing the clone/dedupe ioctls for other issues. The rest of the > problems we found (there were a *lot*) will be addressed in the 4.20 > cycle. > > Cheers, > > Dave. > > The following changes since commit e55ec4ddbef9897199c307dfb23167e3801fdaf5: > > xfs: fix error handling in xfs_bmap_extents_to_btree (2018-10-01 08:11:07 > +1000) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/fs/xfs/xfs-linux tags/xfs-fixes-for-4.19-rc7 Now merged, thanks. greg k-h
Re: [PATCH 2/2] target: Fix target_wait_for_sess_cmds breakage with active signals
Hey Peter & Co, On Wed, 2018-10-10 at 10:43 +0200, Peter Zijlstra wrote: > On Wed, Oct 10, 2018 at 03:23:10AM +, Nicholas A. Bellinger wrote: > > From: Nicholas Bellinger > > > > With the addition of commit 00d909a107 in v4.19-rc, it incorrectly assumes > > no > > signals will be pending for task_struct executing the normal session > > shutdown > > and I/O quiesce code-path. > > > > For example, iscsi-target and iser-target issue SIGINT to all kthreads as > > part of session shutdown. This has been the behaviour since day one. > > Not knowing much context here; but does it make sense for those > kthreads to handle signals, ever? Most kthreads should be fine with > ignore_signals(). > iscsi-target + ib-isert uses SIGINT amongst dedicated rx/tx connection kthreads to signal connection shutdown, requiring in-flight se_cmd I/O descriptors to be quiesced before making forward progress to release se_session. By the point wait_event_lock_irq_timeout() is called in the example here, one of the two rx/tx connection kthreads has been stopped, and the other kthread is still processing shutdown. So while historically the pending SIGINTs where not cleared (or ignored) during shutdown at this point, there is no reason why they could not be ignored for iscsi-target + ib-isert. That said, pre commit 00d909a107 code always used wait_for_completion() and ignored pending signals. As-is target_wait_for_sess_cmds() is called directly from fabric driver code and in one case also from user-space via configfs_write_file(), so AFAICT it does need TASK_UNINTERRUPTIBLE.
Re: [PATCH 05.1/16] of:overlay: missing name, phandle, linux,phandle in new nodes
On 10/10/18 14:03, Frank Rowand wrote:
> On 10/10/18 13:40, Alan Tull wrote:
>> On Wed, Oct 10, 2018 at 1:49 AM Frank Rowand wrote:
>>>
>>> On 10/09/18 23:04, frowand.l...@gmail.com wrote:
>>>> From: Frank Rowand
>>>>
>>>> "of: overlay: use prop add changeset entry for property in new nodes"
>>>> fixed a problem where an 'update property' changeset entry was
>>>> created for properties contained in nodes added by a changeset.
>>>> The fix was to use an 'add property' changeset entry.
>>>>
>>>> This exposed more bugs in the apply overlay code. The properties
>>>> 'name', 'phandle', and 'linux,phandle' were filtered out by
>>>> add_changeset_property() as special properties. Change the filter
>>>> to be only for existing nodes, not newly added nodes.
>>>>
>>>> The second bug is that the 'name' property does not exist in the
>>>> newest FDT version, and has to be constructed from the node's
>>>> full_name. Construct an 'add property' changeset entry for
>>>> newly added nodes.
>>>>
>>>> Signed-off-by: Frank Rowand
>>>> ---
>>>>
>>>> Hi Alan,
>>>>
>>>> Thanks for reporting the problem with missing node names.
>>>>
>>>> I was able to replicate the problem, and have created this preliminary
>>>> version of a patch to fix the problem.
>>>>
>>>> I have not extensively reviewed the patch yet, but would appreciate
>>>> if you can confirm this fixes your problem.
>>>>
>>>> I created this patch as patch 17 of the series, but have also
>>>> applied it as patch 05.1, immediately after patch 05/16, and
>>>> built the kernel, booted, and verified name and phandle for
>>>> one of the nodes in a unittest overlay for both cases. So
>>>> minimal testing so far on my part.
>>>>
>>>> I have not verified whether the series builds and boots after
>>>> each of patches 06..16 if this patch is applied as patch 05.1.
>>>>
>>>> There is definitely more work needed for me to complete this
>>>> patch because it allocates some more memory, but does not yet
>>>> free it when the overlay is released.
>>>>
>>>> -Frank
>>>>
>>>>  drivers/of/overlay.c | 72
>>>>  1 file changed, 67 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
>>>> index 0b0904f44bc7..9746cea2aa91 100644
>>>> --- a/drivers/of/overlay.c
>>>> +++ b/drivers/of/overlay.c
>>>> @@ -301,10 +301,11 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
>>>>  	struct property *new_prop = NULL, *prop;
>>>>  	int ret = 0;
>>>>
>>>> -	if (!of_prop_cmp(overlay_prop->name, "name") ||
>>>> -	    !of_prop_cmp(overlay_prop->name, "phandle") ||
>>>> -	    !of_prop_cmp(overlay_prop->name, "linux,phandle"))
>>>> -		return 0;
>>>> +	if (target->in_livetree)
>>>> +		if (!of_prop_cmp(overlay_prop->name, "name") ||
>>>> +		    !of_prop_cmp(overlay_prop->name, "phandle") ||
>>>> +		    !of_prop_cmp(overlay_prop->name, "linux,phandle"))
>>>> +			return 0;
>>>
>>> This is a big hammer patch.
>>>
>>> Nobody should waste time reviewing this patch.
>>
>> I wasn't clear if you still could use the testing so I did re-run my
>> test. This patch adds back some of the missing properties, but
>> the kobject names aren't set as dev_name() returns NULL:
>>
>> * without this patch some of_node properties don't show up in sysfs:
>> root@arria10:~# ls /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node
>> clocks compatible interrupt-parent interrupts reg
>>
>> * with this patch, the of_node properties phandle and name are back:
>> root@arria10:~# ls /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node
>> clocks compatible interrupt-parent interrupts
>> name phandle reg
>
> Thanks for the testing. I'll keep chasing after this problem today.
>
> This is useful data for me as I was not looking under the /sys/bus/...
> tree that you reported, but was instead looking at /proc/device-tree/...
> which showed the same type of problem since the overlay I was using
> does not show up under /sys/bus/...
>
> I'll have to create a useful overlay test case that will show up under
> /sys/bus/...
>
> In the meantime, can you send me the base FDT and the overlay FDT for
> your test case?

I now have a test case that shows the problem under /sys/bus/... so I
no longer need the base FDT and overlay FDT for your test case.

I have determined the location that sets the name to "" but do not
have the fix yet. Still working on that.

-Frank

>
> Thanks,
>
> Frank
>
>
>>
>> root@arria10:~# cat /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node/name
>> freeze_controllerroot@arria10:~# ("freeze_controller" w/o the \n so
>> the name is correct)
>>
>> * with or without the patch I see the behavior I reported yesterday,
>> kobj names are NULL.
>> root@arria10:~# ls
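The changed filter in the quoted hunk can be read as a small predicate. What follows is only a minimal userspace sketch of that condition, not code from drivers/of/overlay.c: strcmp() stands in for of_prop_cmp(), a plain bool stands in for target->in_livetree, and is_special_prop() and skip_overlay_prop() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true for the three property names that the
 * quoted hunk treats as special. strcmp() is an assumed stand-in for
 * the kernel's of_prop_cmp().
 */
static bool is_special_prop(const char *name)
{
	assert(name != NULL);
	return !strcmp(name, "name") ||
	       !strcmp(name, "phandle") ||
	       !strcmp(name, "linux,phandle");
}

/*
 * Hypothetical helper mirroring the patched condition: a special
 * property is skipped only when the target node already exists in the
 * live tree. For a node newly added by the overlay the property is
 * kept, so a changeset entry can be created for it.
 */
static bool skip_overlay_prop(const char *name, bool target_in_livetree)
{
	return target_in_livetree && is_special_prop(name);
}
```

Before the patch the equivalent predicate ignored the live-tree state, so "name" and "phandle" were dropped for new nodes as well, which matches the missing sysfs entries reported above.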
Re: fore200e DMA cleanups and fixes
From: Christoph Hellwig
Date: Tue, 9 Oct 2018 16:57:13 +0200

> The fore200e driver came up during some dma-related audits, so
> here is the fallout. Compile tested (x86 & sparc) only.

Series applied to net-next.
Re: [PATCH net-next V3] virtio_net: ethtool tx napi configuration
From: Jason Wang
Date: Tue, 9 Oct 2018 10:06:26 +0800

> Implement ethtool .set_coalesce (-C) and .get_coalesce (-c) handlers.
> Interrupt moderation is currently not supported, so these accept and
> display the default settings of 0 usec and 1 frame.
>
> Toggle tx napi through setting tx-frames. So as to not interfere
> with possible future interrupt moderation, value 1 means tx napi while
> value 0 means not.
>
> Only allow the switching when device is down for simplicity.
>
> Link: https://patchwork.ozlabs.org/patch/948149/
> Suggested-by: Jason Wang
> Signed-off-by: Willem de Bruijn
> Signed-off-by: Jason Wang
> ---
> Changes from V2:
> - only allow the switching when device is down
> - remove unnecessary global variable and initialization
> Changes from V1:
> - try to synchronize with datapath to allow changing mode when
>   interface is up.
> - use tx-frames 0 to disable tx napi and tx-frames 1 to enable tx napi

Applied, with...

> +	bool running = netif_running(dev);

this unused variable removed.
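The rules described in the commit message (tx-frames 1 enables tx napi, 0 disables it, and switching is only allowed while the device is down) can be sketched as below. This is a userspace illustration under stated assumptions, not the virtio_net handler: struct sketch_dev, sketch_set_tx_frames(), SKETCH_NAPI_WEIGHT and the specific error codes are all hypothetical and do not come from the patch text quoted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed non-zero weight meaning "tx napi enabled"; illustrative only. */
#define SKETCH_NAPI_WEIGHT 64

/* Hypothetical stand-in for the device state the handler would consult. */
struct sketch_dev {
	bool running;        /* true while the interface is up */
	int tx_napi_weight;  /* 0 means tx napi is off */
};

/*
 * Hypothetical handler: accept tx-frames 0 or 1 only, and refuse to
 * switch modes while the device is running. The choice of -EINVAL and
 * -EBUSY is an assumption made for this sketch.
 */
static int sketch_set_tx_frames(struct sketch_dev *dev, unsigned int tx_frames)
{
	int weight;

	assert(dev != NULL);
	if (tx_frames > 1)
		return -EINVAL;

	weight = tx_frames ? SKETCH_NAPI_WEIGHT : 0;
	if (weight == dev->tx_napi_weight)
		return 0;	/* nothing to switch */

	if (dev->running)
		return -EBUSY;	/* only switch while the device is down */

	dev->tx_napi_weight = weight;
	return 0;
}
```

Re-applying the current setting on a running device is treated as a no-op here, so only an actual mode change requires the interface to be down.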
Re: netconsole warning in 4.19.0-rc7
(Cc'ing Dave)

On Wed, Oct 10, 2018 at 5:14 AM Meelis Roos wrote:
>
> Thies 4.19-rc7 on a bunch of test machines and got this warning from one.
> It is reproducible and I have not noticed it before.
>
[...]
> [9.914805] WARNING: CPU: 0 PID: 0 at kernel/softirq.c:168 __local_bh_enable_ip+0x2e/0x44
> [9.914806] Modules linked in:
> [9.914808] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc7 #210
> [9.914810] Hardware name: MicroLink /D850MV , BIOS MV85010A.86A.0067.P24.0304081124 04/08/2003
> [9.914811] EIP: __local_bh_enable_ip+0x2e/0x44
> [9.914813] Code: cc 02 5f c8 a9 00 00 0f 00 75 1f 83 ea 01 f7 da 01 15 cc 02 5f c8 a1 cc 02 5f c8 a9 00 ff 1f 00 74 0c ff 0d cc 02 5f c8 5d c3 <0f> 0b eb dd 66 a1 80 cd 5e c8 66 85 c0 74 e9 e8 87 ff ff ff eb e2
> [9.914814] EAX: 80010200 EBX: f602b000 ECX: 36346270 EDX: 0200
> [9.914815] ESI: f620ecc0 EDI: f620ebac EBP: f600de40 ESP: f600de40
> [9.914816] DS: 007b ES: 007b FS: GS: 00e0 SS: 0068 EFLAGS: 00010006
> [9.914817] CR0: 80050033 CR2: b7f5f000 CR3: 36389000 CR4: 06d0
> [9.914818] Call Trace:
> [9.914819]
> [9.914820] netpoll_send_skb_on_dev+0xa5/0x1b0

This is exactly what I mentioned in my review here:
https://marc.info/?l=linux-netdev&m=153816136624679&w=2

"But irq is disabled here, so not sure if rcu_read_lock_bh()
could cause trouble... "
[PATCH v10 0/3] powerpc: Detection and scheduler optimization for POWER9 bigcore
From: "Gautham R. Shenoy"

Hi,

This is the tenth iteration of the patchset to add support for
big-core on POWER9. This patchset also optimizes the task placement on
such big-core systems.

The previous versions can be found here:
v9: https://lkml.org/lkml/2018/10/1/608
v8: https://lkml.org/lkml/2018/9/20/899
v7: https://lkml.org/lkml/2018/8/20/52
v6: https://lkml.org/lkml/2018/8/9/119
v5: https://lkml.org/lkml/2018/8/6/587
v4: https://lkml.org/lkml/2018/7/24/79
v3: https://lkml.org/lkml/2018/7/6/255
v2: https://lkml.org/lkml/2018/7/3/401
v1: https://lkml.org/lkml/2018/5/11/245

Changes :
v9 --> v10:
   - Rebased it on v4.19-rc7
   - Added a patch to report the correct shared_cpu_map for L1-caches
     on big-core systems.

Description:

IBM POWER9 SMT8 cores consist of two groups of small-cores where each
group has its own L1 cache, translation cache and instruction-data
flow. This can be discovered via the "ibm,thread-groups" CPU property
in the device tree. Furthermore, on POWER9 the thread-ids of such a
big-core are obtained by interleaving the thread-ids of the two
small-cores.

Eg: In an SMT8 core with thread ids {0,1,2,3,4,5,6,7}, the thread-ids
of the threads in the two small-cores will be {0,2,4,6} and {1,3,5,7}
respectively.

        -------------------------
        |       L1 Cache        |
     ----------------------------
     |L2|     |     |     |     |
     |  |  0  |  2  |  4  |  6  | Small Core0
     |C |     |     |     |     |
Big  |a -------------------------
Core |c |     |     |     |     |
     |h |  1  |  3  |  5  |  7  | Small Core1
     |e |     |     |     |     |
     ----------------------------
        |       L1 Cache        |
        -------------------------

On such a big-core system, when multiple tasks are scheduled to run on
the big-core, we get the best performance when the tasks are spread
across the pair of small-cores.

Eg: Suppose 4 tasks {p1, p2, p3, p4} are run on a big core, then

An Example of Optimal Task placement:
          -------------------------
          |     |     |     |     |
          |  0  |  2  |  4  |  6  | Small Core0
          | (p1)| (p2)|     |     |
Big Core  -------------------------
          |     |     |     |     |
          |  1  |  3  |  5  |  7  | Small Core1
          |     | (p3)|     | (p4)|
          -------------------------

An example of Suboptimal Task placement:
          -------------------------
          |     |     |     |     |
          |  0  |  2  |  4  |  6  | Small Core0
          | (p1)| (p2)|     | (p4)|
Big Core  -------------------------
          |     |     |     |     |
          |  1  |  3  |  5  |  7  | Small Core1
          |     | (p3)|     |     |
          -------------------------

Currently on the big-core systems, the sched domain hierarchy is:

SMT   : group of CPUs in the SMT8 core.
DIE   : groups of CPUs on the same die.
NUMA  : all the CPUs in the system.

Thus the scheduler doesn't distinguish between CPUs in the core that
share the L1-cache vs the ones that don't, resulting in a run-to-run
variance when multithreaded applications are run on an SMT8 core.

In this patch-set, we address this by defining the sched-domain on the
big-core systems to be:

SMT   : group of CPUs sharing the L1 cache
CACHE : group of CPUs in the SMT8 core.
DIE   : groups of CPUs on the same die.
NUMA  : all the CPUs in the system.

With this, the Linux Kernel load-balancer will ensure that the tasks
are spread across all the component small cores in the system, thereby
yielding optimum performance.

Furthermore, this solution works correctly across all SMT modes
(8,4,2), as the interleaved thread-ids ensure that when we go to
lower SMT modes (4,2) the threads are offlined in a descending order,
thereby leaving an equal number of threads from the component small
cores online as illustrated below.

This patchset contains three patches which, on detecting the presence
of big-cores, define the SMT level sched domain to correspond to the
threads of the small cores.

Patch 1: adds support to detect the presence of big-cores and parses
the output of the "ibm,thread-groups" device-tree property, using which
it updates a per-cpu mask named cpu_smallcore_mask

Patch 2: Defines the SMT level sched domain to correspond to the
threads of the small cores.

Patch 3: Added a patch to report the correct shared_cpu_map for
L1-caches on big-core systems.

Without patch 3:
/sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 00ff
/sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 00ff
/sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00ff
/sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00ff

With patch 3:
/sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 0055
/sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map
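The interleaving described in the cover letter also explains the 0055 value in the sysfs output and why lower SMT modes stay balanced. The following is a minimal userspace sketch of that arithmetic, assuming two small cores per big core and that thread t belongs to small core t % 2; smallcore_mask() and online_in_smallcore() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative userspace sketch, not kernel code. Assumes one SMT8
 * big core made of two small cores with interleaved thread ids, as
 * described in the cover letter.
 */
#define THREADS_PER_BIG_CORE     8u
#define SMALL_CORES_PER_BIG_CORE 2u

/* Bitmask of the threads that share an L1 cache with @thread. */
static uint8_t smallcore_mask(unsigned int thread)
{
	uint8_t mask = 0;
	unsigned int t;

	assert(thread < THREADS_PER_BIG_CORE);
	for (t = 0; t < THREADS_PER_BIG_CORE; t++)
		if (t % SMALL_CORES_PER_BIG_CORE ==
		    thread % SMALL_CORES_PER_BIG_CORE)
			mask |= (uint8_t)(1u << t);
	return mask;
}

/*
 * Number of threads in @thread's small core that remain online in a
 * given SMT mode, assuming threads are offlined in descending order so
 * that ids 0..smt_mode-1 stay online.
 */
static unsigned int online_in_smallcore(unsigned int thread,
					unsigned int smt_mode)
{
	uint8_t mask = smallcore_mask(thread);
	unsigned int t, count = 0;

	assert(smt_mode >= 1 && smt_mode <= THREADS_PER_BIG_CORE);
	for (t = 0; t < smt_mode; t++)
		if (mask & (1u << t))
			count++;
	return count;
}
```

Under these assumptions smallcore_mask(0) is 0x55 (threads 0,2,4,6) and smallcore_mask(1) is 0xaa (threads 1,3,5,7), and in SMT4 or SMT2 mode each small core keeps the same number of online threads.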
[PATCH v10 2/3] powerpc: Use cpu_smallcore_sibling_mask at SMT level on bigcores
From: "Gautham R. Shenoy"

POWER9 SMT8 cores consist of two groups of threads, where threads in
each group share an L1-cache. The scheduler is not aware of this
distinction as the current sched-domain hierarchy has all the threads
of the core defined at the SMT domain.

	SMT  [Thread siblings of the SMT8 core]
	DIE  [CPUs in the same die]
	NUMA [All the CPUs in the system]

Due to this, we can observe run-to-run variance when we run a
multi-threaded benchmark bound to a single core based on how the
scheduler spreads the software threads across the two groups in the
core.

We fix this in this patch by defining each group of threads which
share L1-cache to be the SMT level. The group of threads in the SMT8
core is defined to be the CACHE level. The sched-domain hierarchy
after this patch will be :

	SMT   [Thread siblings in the core that share L1 cache]
	CACHE [Thread siblings that are in the SMT8 core]
	DIE   [CPUs in the same die]
	NUMA  [All the CPUs in the system]

Signed-off-by: Gautham R. Shenoy
---
 arch/powerpc/kernel/smp.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 22a14a9..356751e 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1266,6 +1266,7 @@ static void add_cpu_to_masks(int cpu)
 void start_secondary(void *unused)
 {
 	unsigned int cpu = smp_processor_id();
+	struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
 
 	mmgrab(&init_mm);
 	current->active_mm = &init_mm;
@@ -1291,11 +1292,13 @@ void start_secondary(void *unused)
 	/* Update topology CPU masks */
 	add_cpu_to_masks(cpu);
 
+	if (has_big_cores)
+		sibling_mask = cpu_smallcore_mask;
 	/*
 	 * Check for any shared caches. Note that this must be done on a
 	 * per-core basis because one core in the pair might be disabled.
 	 */
-	if (!cpumask_equal(cpu_l2_cache_mask(cpu), cpu_sibling_mask(cpu)))
+	if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
 		shared_caches = true;
 
 	set_numa_node(numa_cpu_lookup_table[cpu]);
@@ -1362,6 +1365,13 @@ static const struct cpumask *shared_cache_mask(int cpu)
 	return cpu_l2_cache_mask(cpu);
 }
 
+#ifdef CONFIG_SCHED_SMT
+static const struct cpumask *smallcore_smt_mask(int cpu)
+{
+	return cpu_smallcore_mask(cpu);
+}
+#endif
+
 static struct sched_domain_topology_level power9_topology[] = {
 #ifdef CONFIG_SCHED_SMT
 	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
@@ -1389,6 +1399,13 @@ void __init smp_cpus_done(unsigned int max_cpus)
 	shared_proc_topology_init();
 	dump_numa_cpu_topology();
 
+#ifdef CONFIG_SCHED_SMT
+	if (has_big_cores) {
+		pr_info("Using small cores at SMT level\n");
+		power9_topology[0].mask = smallcore_smt_mask;
+		powerpc_topology[0].mask = smallcore_smt_mask;
+	}
+#endif
 	/*
 	 * If any CPU detects that it's sharing a cache with another CPU then
 	 * use the deeper topology that is aware of this sharing.
-- 
1.9.4
[PATCH v10 3/3] powerpc/cacheinfo: Report the correct shared_cpu_map on big-cores
From: "Gautham R. Shenoy"

Currently on POWER9 SMT8 core systems, in sysfs, we report the
shared_cpu_map for L1 caches (both data and instruction) to be the
cpu-ids of the threads in SMT8 cores. This is incorrect since on
POWER9 SMT8 cores there are two groups of threads, each of which
shares its own L1 cache.

This patch addresses this by reporting the shared_cpu_map correctly in
sysfs for L1 caches.

Before the patch
/sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 00ff
/sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 00ff
/sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00ff
/sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00ff

After the patch
/sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 0055
/sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 0055
/sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00aa
/sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00aa

Signed-off-by: Gautham R. Shenoy
---
 arch/powerpc/kernel/cacheinfo.c | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index a8f20e5..be57bd0 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -20,6 +20,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #include "cacheinfo.h"
 
@@ -627,17 +629,48 @@ static ssize_t level_show(struct kobject *k, struct kobj_attribute *attr, char *
 static struct kobj_attribute cache_level_attr =
 	__ATTR(level, 0444, level_show, NULL);
 
+static unsigned int index_dir_to_cpu(struct cache_index_dir *index)
+{
+	struct kobject *index_dir_kobj = &index->kobj;
+	struct kobject *cache_dir_kobj = index_dir_kobj->parent;
+	struct kobject *cpu_dev_kobj = cache_dir_kobj->parent;
+	struct device *dev = kobj_to_dev(cpu_dev_kobj);
+
+	return dev->id;
+}
+
+/*
+ * On big-core systems, each core has two groups of CPUs each of which
+ * has its own L1-cache. The thread-siblings which share l1-cache with
+ * @cpu can be obtained via cpu_smallcore_mask().
+ */
+static const struct cpumask *get_big_core_shared_cpu_map(int cpu, struct cache *cache)
+{
+	if (cache->level == 1)
+		return cpu_smallcore_mask(cpu);
+
+	return &cache->shared_cpu_map;
+}
+
 static ssize_t shared_cpu_map_show(struct kobject *k, struct kobj_attribute *attr, char *buf)
 {
 	struct cache_index_dir *index;
 	struct cache *cache;
-	int ret;
+	const struct cpumask *mask;
+	int ret, cpu;
 
 	index = kobj_to_cache_index_dir(k);
 	cache = index->cache;
 
+	if (has_big_cores) {
+		cpu = index_dir_to_cpu(index);
+		mask = get_big_core_shared_cpu_map(cpu, cache);
+	} else {
+		mask = &cache->shared_cpu_map;
+	}
+
 	ret = scnprintf(buf, PAGE_SIZE - 1, "%*pb\n",
-			cpumask_pr_args(&cache->shared_cpu_map));
+			cpumask_pr_args(mask));
 	buf[ret++] = '\n';
 	buf[ret] = '\0';
 	return ret;
-- 
1.9.4
[PATCH v10 1/3] powerpc: Detect the presence of big-cores via "ibm,thread-groups"
From: "Gautham R. Shenoy"

On IBM POWER9, the device tree exposes a property array identified by
"ibm,thread-groups" which will indicate which groups of threads share
a particular set of resources.

As of today we only have one form of grouping identifying the group of
threads in the core that share the L1 cache, translation cache and
instruction data flow.

This patch adds helper functions to parse the contents of
"ibm,thread-groups" and populate a per-cpu variable to cache
information about siblings of each CPU that share the L1, translation
cache and instruction data-flow.

It also defines a new global variable named "has_big_cores" which
indicates if the cores on this configuration have multiple groups of
threads that share L1 cache.

For each online CPU, it maintains a cpu_smallcore_mask, which
indicates the online siblings which share the L1-cache with it.

Signed-off-by: Gautham R. Shenoy
---
 arch/powerpc/include/asm/cputhreads.h |   2 +
 arch/powerpc/include/asm/smp.h        |  11 ++
 arch/powerpc/kernel/smp.c             | 222 ++
 3 files changed, 235 insertions(+)

diff --git a/arch/powerpc/include/asm/cputhreads.h b/arch/powerpc/include/asm/cputhreads.h
index d71a909..deb99fd 100644
--- a/arch/powerpc/include/asm/cputhreads.h
+++ b/arch/powerpc/include/asm/cputhreads.h
@@ -23,11 +23,13 @@
 extern int threads_per_core;
 extern int threads_per_subcore;
 extern int threads_shift;
+extern bool has_big_cores;
 extern cpumask_t threads_core_mask;
 #else
 #define threads_per_core	1
 #define threads_per_subcore	1
 #define threads_shift		0
+#define has_big_cores		0
 #define threads_core_mask	(*get_cpu_mask(0))
 #endif
 
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 95b66a0..4169574 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -100,6 +100,7 @@ static inline void set_hard_smp_processor_id(int cpu, int phys)
 DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
 DECLARE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
 DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
+DECLARE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
 
 static inline struct cpumask *cpu_sibling_mask(int cpu)
 {
@@ -116,6 +117,11 @@ static inline struct cpumask *cpu_l2_cache_mask(int cpu)
 	return per_cpu(cpu_l2_cache_map, cpu);
 }
 
+static inline struct cpumask *cpu_smallcore_mask(int cpu)
+{
+	return per_cpu(cpu_smallcore_map, cpu);
+}
+
 extern int cpu_to_core_id(int cpu);
 
 /* Since OpenPIC has only 4 IPIs, we use slightly different message numbers.
@@ -166,6 +172,11 @@ static inline const struct cpumask *cpu_sibling_mask(int cpu)
 	return cpumask_of(cpu);
 }
 
+static inline const struct cpumask *cpu_smallcore_mask(int cpu)
+{
+	return cpumask_of(cpu);
+}
+
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 61c1fad..22a14a9 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -74,14 +74,32 @@
 #endif
 
 struct thread_info *secondary_ti;
+bool has_big_cores;
 
 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
+DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
 
 EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
 EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
 EXPORT_PER_CPU_SYMBOL(cpu_core_map);
+EXPORT_SYMBOL_GPL(has_big_cores);
+
+#define MAX_THREAD_LIST_SIZE	8
+#define THREAD_GROUP_SHARE_L1	1
+struct thread_groups {
+	unsigned int property;
+	unsigned int nr_groups;
+	unsigned int threads_per_group;
+	unsigned int thread_list[MAX_THREAD_LIST_SIZE];
+};
+
+/*
+ * On big-cores system, cpu_l1_cache_map for each CPU corresponds to
+ * the set its siblings that share the L1-cache.
+ */
+DEFINE_PER_CPU(cpumask_var_t, cpu_l1_cache_map);
 
 /* SMP operations for this machine */
 struct smp_ops_t *smp_ops;
@@ -674,6 +692,185 @@ static void set_cpus_unrelated(int i, int j,
 }
 #endif
 
+/*
+ * parse_thread_groups: Parses the "ibm,thread-groups" device tree
+ *                      property for the CPU device node @dn and stores
+ *                      the parsed output in the thread_groups
+ *                      structure @tg if the ibm,thread-groups[0]
+ *                      matches @property.
+ *
+ * @dn: The device node of the CPU device.
+ * @tg: Pointer to a thread group structure into which the parsed
+ *      output of "ibm,thread-groups" is stored.
+ * @property: The property of the thread-group that the caller is
+ *            interested in.
+ *
+ * ibm,thread-groups[0..N-1] array defines which group of threads in
+ * the CPU-device node can be grouped together based on the property.
+ *
+ * ibm,thread-groups[0] tells us the property based on which the
+ * threads are being grouped together. If this value is 1, it implies
+ * that the threads in the same group share L1, translation
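The quoted comment is cut off before it describes the rest of the array, so the following is only a userspace sketch under an explicit assumption: that the entries after ibm,thread-groups[0] follow the field order of struct thread_groups (number of groups, threads per group, then the flattened thread list). parse_thread_groups_array() and thread_group_of() are hypothetical names; the real parse_thread_groups() reads the property from a device node rather than from a plain array.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREAD_LIST_SIZE	8
#define THREAD_GROUP_SHARE_L1	1

/* Same fields as the struct introduced by the patch. */
struct thread_groups {
	unsigned int property;
	unsigned int nr_groups;
	unsigned int threads_per_group;
	unsigned int thread_list[MAX_THREAD_LIST_SIZE];
};

/*
 * Hypothetical helper. Assumed layout (not confirmed by the truncated
 * comment above): prop[0] = property, prop[1] = number of groups,
 * prop[2] = threads per group, prop[3..] = thread ids, group by group.
 * Returns 0 on success, -1 if the array is malformed or describes a
 * different property than the caller asked for.
 */
static int parse_thread_groups_array(const unsigned int *prop, size_t len,
				     unsigned int property,
				     struct thread_groups *tg)
{
	size_t i, total;

	assert(prop != NULL && tg != NULL);
	if (len < 3 || prop[0] != property)
		return -1;
	if (prop[1] == 0 || prop[2] == 0)
		return -1;

	total = (size_t)prop[1] * prop[2];
	if (total > MAX_THREAD_LIST_SIZE || len < 3 + total)
		return -1;

	tg->property = prop[0];
	tg->nr_groups = prop[1];
	tg->threads_per_group = prop[2];
	for (i = 0; i < total; i++)
		tg->thread_list[i] = prop[3 + i];
	return 0;
}

/* Hypothetical helper: index of the group containing @thread, or -1. */
static int thread_group_of(const struct thread_groups *tg, unsigned int thread)
{
	size_t i, total;

	assert(tg != NULL);
	total = (size_t)tg->nr_groups * tg->threads_per_group;
	for (i = 0; i < total; i++)
		if (tg->thread_list[i] == thread)
			return (int)(i / tg->threads_per_group);
	return -1;
}
```

With an input of {1, 2, 4, 0, 2, 4, 6, 1, 3, 5, 7}, this layout yields two groups of four threads, which lines up with the {0,2,4,6} and {1,3,5,7} small cores described in the cover letter.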
[PATCH v10 0/3] powerpc: Detection and scheduler optimization for POWER9 bigcore
From: "Gautham R. Shenoy"

Hi,

This is the tenth iteration of the patchset to add support for
big-core on POWER9. This patchset also optimizes the task placement on
such big-core systems.

The previous versions can be found here:
v9: https://lkml.org/lkml/2018/10/1/608
v8: https://lkml.org/lkml/2018/9/20/899
v7: https://lkml.org/lkml/2018/8/20/52
v6: https://lkml.org/lkml/2018/8/9/119
v5: https://lkml.org/lkml/2018/8/6/587
v4: https://lkml.org/lkml/2018/7/24/79
v3: https://lkml.org/lkml/2018/7/6/255
v2: https://lkml.org/lkml/2018/7/3/401
v1: https://lkml.org/lkml/2018/5/11/245

Changes :
v9 --> v10:
 - Rebased it on v4.19-rc7
 - Added a patch to report the correct shared_cpu_map for L1-caches
   on big-core systems.

Description:

An IBM POWER9 SMT8 core consists of two groups of threads
(small-cores), where each group has its own L1 cache, translation
cache and instruction-data flow. This can be discovered via the
"ibm,thread-groups" CPU property in the device tree.

Furthermore, on POWER9 the thread-ids of such a big-core are obtained
by interleaving the thread-ids of the two small-cores.

Eg: In an SMT8 core with thread ids {0,1,2,3,4,5,6,7}, the thread-ids
of the threads in the two small-cores will be {0,2,4,6} and {1,3,5,7}
respectively.

          -------------------------
          |       L1 Cache        |
       ----------------------------
       |L2|     |     |     |     |
       |  |  0  |  2  |  4  |  6  |  Small Core0
       |C |     |     |     |     |
Big    |a -------------------------
Core   |c |     |     |     |     |
       |h |  1  |  3  |  5  |  7  |  Small Core1
       |e |     |     |     |     |
       ----------------------------
          |       L1 Cache        |
          -------------------------

On such a big-core system, when multiple tasks are scheduled to run on
the big-core, we get the best performance when the tasks are spread
across the pair of small-cores.

Eg: Suppose 4 tasks {p1, p2, p3, p4} are run on a big core, then

An example of optimal task placement:

         -------------------------
         |     |     |     |     |
         |  0  |  2  |  4  |  6  |  Small Core0
         | (p1)| (p2)|     |     |
Big Core -------------------------
         |     |     |     |     |
         |  1  |  3  |  5  |  7  |  Small Core1
         |     | (p3)|     | (p4)|
         -------------------------

An example of suboptimal task placement:

         -------------------------
         |     |     |     |     |
         |  0  |  2  |  4  |  6  |  Small Core0
         | (p1)| (p2)|     | (p4)|
Big Core -------------------------
         |     |     |     |     |
         |  1  |  3  |  5  |  7  |  Small Core1
         |     | (p3)|     |     |
         -------------------------

Currently on the big-core systems, the sched domain hierarchy is:

SMT   : group of CPUs in the SMT8 core.
DIE   : groups of CPUs on the same die.
NUMA  : all the CPUs in the system.

Thus the scheduler doesn't distinguish between CPUs in the core that
share the L1-cache and the ones that don't, resulting in run-to-run
variance when multithreaded applications are run on an SMT8 core.

In this patch-set, we address this by defining the sched-domain
hierarchy on the big-core systems to be:

SMT   : group of CPUs sharing the L1 cache
CACHE : group of CPUs in the SMT8 core.
DIE   : groups of CPUs on the same die.
NUMA  : all the CPUs in the system.

With this, the Linux Kernel load-balancer will ensure that the tasks
are spread across all the component small cores in the system, thereby
yielding optimum performance.

Furthermore, this solution works correctly across all SMT modes
(8,4,2), as the interleaved thread-ids ensure that when we go to lower
SMT modes (4,2) the threads are offlined in a descending order,
thereby leaving an equal number of threads from the component small
cores online.

This patchset contains three patches which, on detecting the presence
of big-cores, define the SMT level sched domain to correspond to the
threads of the small cores.

Patch 1: Adds support to detect the presence of big-cores and parses
the output of the "ibm,thread-groups" device-tree property, using
which it updates a per-cpu mask named cpu_smallcore_mask.

Patch 2: Defines the SMT level sched domain to correspond to the
threads of the small cores.

Patch 3: Reports the correct shared_cpu_map for L1-caches on big-core
systems.

Without patch 3:
/sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 00ff
/sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 00ff
/sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00ff
/sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00ff

With patch 3:
/sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 0055
/sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map
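The SMT-mode argument in the cover letter above can be checked with a small user-space sketch. This is not code from the series: it assumes that SMT mode N keeps threads 0..N-1 of the big core online (threads offlined in descending order), and the helper names smallcore_of(), online_in_smallcore() and print_smt_balance() are hypothetical.

```c
#include <stdio.h>

#define THREADS_PER_BIG_CORE 8

/*
 * Hypothetical helper: with interleaved thread-ids, even threads
 * {0,2,4,6} sit in small core 0 and odd threads {1,3,5,7} in small
 * core 1.
 */
static unsigned int smallcore_of(unsigned int thread)
{
	return thread & 1;
}

/*
 * Hypothetical helper: count the threads of small core @core that stay
 * online in SMT mode @smt_mode, assuming threads smt_mode..7 are
 * offline.
 */
static unsigned int online_in_smallcore(unsigned int smt_mode,
					unsigned int core)
{
	unsigned int t, n = 0;

	for (t = 0; t < smt_mode && t < THREADS_PER_BIG_CORE; t++)
		if (smallcore_of(t) == core)
			n++;
	return n;
}

/* Print the per-small-core online thread counts for SMT 8, 4 and 2. */
static void print_smt_balance(void)
{
	unsigned int mode;

	for (mode = 8; mode >= 2; mode /= 2)
		printf("SMT%u: small core0=%u small core1=%u\n", mode,
		       online_in_smallcore(mode, 0),
		       online_in_smallcore(mode, 1));
}
```

Under the same assumption, a non-interleaved numbering ({0,1,2,3} and {4,5,6,7}) would leave all four online threads of SMT4 in one small core, which is the imbalance the interleaving avoids.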
[PATCH v10 2/3] powerpc: Use cpu_smallcore_sibling_mask at SMT level on bigcores
From: "Gautham R. Shenoy"

POWER9 SMT8 cores consist of two groups of threads, where threads in
each group share L1-cache. The scheduler is not aware of this
distinction as the current sched-domain hierarchy has all the threads
of the core defined at the SMT domain.

	SMT  [Thread siblings of the SMT8 core]
	DIE  [CPUs in the same die]
	NUMA [All the CPUs in the system]

Due to this, we can observe run-to-run variance when we run a
multi-threaded benchmark bound to a single core based on how the
scheduler spreads the software threads across the two groups in the
core.

We fix this in this patch by defining each group of threads which
share L1-cache to be the SMT level. The group of threads in the SMT8
core is defined to be the CACHE level. The sched-domain hierarchy
after this patch will be :

	SMT   [Thread siblings in the core that share L1 cache]
	CACHE [Thread siblings that are in the SMT8 core]
	DIE   [CPUs in the same die]
	NUMA  [All the CPUs in the system]

Signed-off-by: Gautham R. Shenoy
---
 arch/powerpc/kernel/smp.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 22a14a9..356751e 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1266,6 +1266,7 @@ static void add_cpu_to_masks(int cpu)
 void start_secondary(void *unused)
 {
 	unsigned int cpu = smp_processor_id();
+	struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
 
 	mmgrab(&init_mm);
 	current->active_mm = &init_mm;
@@ -1291,11 +1292,13 @@ void start_secondary(void *unused)
 	/* Update topology CPU masks */
 	add_cpu_to_masks(cpu);
 
+	if (has_big_cores)
+		sibling_mask = cpu_smallcore_mask;
 	/*
 	 * Check for any shared caches. Note that this must be done on a
 	 * per-core basis because one core in the pair might be disabled.
 	 */
-	if (!cpumask_equal(cpu_l2_cache_mask(cpu), cpu_sibling_mask(cpu)))
+	if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
 		shared_caches = true;
 
 	set_numa_node(numa_cpu_lookup_table[cpu]);
@@ -1362,6 +1365,13 @@ static const struct cpumask *shared_cache_mask(int cpu)
 	return cpu_l2_cache_mask(cpu);
 }
 
+#ifdef CONFIG_SCHED_SMT
+static const struct cpumask *smallcore_smt_mask(int cpu)
+{
+	return cpu_smallcore_mask(cpu);
+}
+#endif
+
 static struct sched_domain_topology_level power9_topology[] = {
 #ifdef CONFIG_SCHED_SMT
 	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
@@ -1389,6 +1399,13 @@ void __init smp_cpus_done(unsigned int max_cpus)
 	shared_proc_topology_init();
 	dump_numa_cpu_topology();
 
+#ifdef CONFIG_SCHED_SMT
+	if (has_big_cores) {
+		pr_info("Using small cores at SMT level\n");
+		power9_topology[0].mask = smallcore_smt_mask;
+		powerpc_topology[0].mask = smallcore_smt_mask;
+	}
+#endif
 	/*
 	 * If any CPU detects that it's sharing a cache with another CPU then
 	 * use the deeper topology that is aware of this sharing.
-- 
1.9.4
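The level-0 override done in smp_cpus_done() above can be modelled in user space roughly as follows. This is only a sketch under assumptions: toy_mask_t is an 8-bit stand-in for struct cpumask covering one big core, the thread-ids are assumed to be interleaved, and every toy_* name is hypothetical rather than a kernel symbol.

```c
/* Toy 8-bit CPU mask covering one big core; not the kernel's struct cpumask. */
typedef unsigned int toy_mask_t;

struct toy_topology_level {
	toy_mask_t (*mask)(int cpu);	/* CPUs grouped with @cpu at this level */
	const char *name;
};

/* All eight threads of the SMT8 core, whichever thread asks. */
static toy_mask_t toy_bigcore_mask(int cpu)
{
	(void)cpu;
	return 0xff;
}

/* Threads that share an L1 with @cpu, assuming interleaved thread-ids. */
static toy_mask_t toy_smallcore_mask(int cpu)
{
	return (cpu & 1) ? 0xaa : 0x55;
}

/* Level 0 plays the role of SMT, level 1 the role of CACHE. */
static struct toy_topology_level toy_topology[] = {
	{ toy_bigcore_mask, "SMT" },
	{ toy_bigcore_mask, "CACHE" },
};

/* Retarget level 0 to the small-core mask when big cores are present. */
static void toy_topology_init(int has_big_cores)
{
	if (has_big_cores)
		toy_topology[0].mask = toy_smallcore_mask;
}
```

Swapping only the mask callback keeps the table layout unchanged, so the level above it still describes the whole SMT8 core.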
[PATCH v10 3/3] powerpc/cacheinfo: Report the correct shared_cpu_map on big-cores
From: "Gautham R. Shenoy"

Currently on POWER9 SMT8 core systems, in sysfs, we report the
shared_cpu_map for L1 caches (both data and instruction) to be the
cpu-ids of the threads in SMT8 cores. This is incorrect since on
POWER9 SMT8 cores there are two groups of threads, each of which
shares its own L1 cache.

This patch addresses this by reporting the shared_cpu_map correctly in
sysfs for L1 caches.

Before the patch
 /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 00ff
 /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 00ff
 /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00ff
 /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00ff

After the patch
 /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 0055
 /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 0055
 /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00aa
 /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00aa

Signed-off-by: Gautham R. Shenoy
---
 arch/powerpc/kernel/cacheinfo.c | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index a8f20e5..be57bd0 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -20,6 +20,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #include "cacheinfo.h"
 
@@ -627,17 +629,48 @@ static ssize_t level_show(struct kobject *k, struct kobj_attribute *attr, char *
 static struct kobj_attribute cache_level_attr =
 	__ATTR(level, 0444, level_show, NULL);
 
+static unsigned int index_dir_to_cpu(struct cache_index_dir *index)
+{
+	struct kobject *index_dir_kobj = &index->kobj;
+	struct kobject *cache_dir_kobj = index_dir_kobj->parent;
+	struct kobject *cpu_dev_kobj = cache_dir_kobj->parent;
+	struct device *dev = kobj_to_dev(cpu_dev_kobj);
+
+	return dev->id;
+}
+
+/*
+ * On big-core systems, each core has two groups of CPUs each of which
+ * has its own L1-cache. The thread-siblings which share l1-cache with
+ * @cpu can be obtained via cpu_smallcore_mask().
+ */
+static const struct cpumask *get_big_core_shared_cpu_map(int cpu, struct cache *cache)
+{
+	if (cache->level == 1)
+		return cpu_smallcore_mask(cpu);
+
+	return &cache->shared_cpu_map;
+}
+
 static ssize_t shared_cpu_map_show(struct kobject *k, struct kobj_attribute *attr, char *buf)
 {
 	struct cache_index_dir *index;
 	struct cache *cache;
-	int ret;
+	const struct cpumask *mask;
+	int ret, cpu;
 
 	index = kobj_to_cache_index_dir(k);
 	cache = index->cache;
 
+	if (has_big_cores) {
+		cpu = index_dir_to_cpu(index);
+		mask = get_big_core_shared_cpu_map(cpu, cache);
+	} else {
+		mask = &cache->shared_cpu_map;
+	}
+
 	ret = scnprintf(buf, PAGE_SIZE - 1, "%*pb\n",
-			cpumask_pr_args(&cache->shared_cpu_map));
+			cpumask_pr_args(mask));
 	buf[ret++] = '\n';
 	buf[ret] = '\0';
 	return ret;
-- 
1.9.4
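The before/after sysfs values quoted in the patch description above (00ff versus 0055 and 00aa) follow from choosing the mask by cache level. A rough user-space sketch of that selection, assuming interleaved thread-ids and an 8-thread core; struct toy_cache, l1_sibling_mask(), toy_shared_cpu_map() and toy_format_map() are hypothetical names, and the plain "%04x" formatting only imitates the quoted output rather than the kernel's "%*pb" bitmap format:

```c
#include <stdio.h>

/* Toy stand-in for the kernel's struct cache; only what the sketch needs. */
struct toy_cache {
	int level;			/* 1 for L1, 2 for L2, ... */
	unsigned int shared_cpu_map;	/* mask recorded for the cache itself */
};

/* Threads sharing an L1 with @cpu, assuming interleaved thread-ids. */
static unsigned int l1_sibling_mask(int cpu)
{
	return (cpu & 1) ? 0xaa : 0x55;
}

/*
 * On big-core systems report the small-core mask for L1 caches and the
 * cache's own mask for every other level.
 */
static unsigned int toy_shared_cpu_map(int cpu, const struct toy_cache *cache,
				       int has_big_cores)
{
	if (has_big_cores && cache->level == 1)
		return l1_sibling_mask(cpu);
	return cache->shared_cpu_map;
}

/* Format a mask as four hex digits, like the values quoted above. */
static int toy_format_map(char *buf, size_t len, unsigned int mask)
{
	return snprintf(buf, len, "%04x", mask);
}
```

For cpu0 and an L1 cache this yields 0055, for cpu1 it yields 00aa, and an L2 cache keeps reporting the full 00ff.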
[PATCH v10 1/3] powerpc: Detect the presence of big-cores via "ibm,thread-groups"
From: "Gautham R. Shenoy"

On IBM POWER9, the device tree exposes a property array identified by
"ibm,thread-groups" which will indicate which groups of threads share
a particular set of resources.

As of today we only have one form of grouping identifying the group of
threads in the core that share the L1 cache, translation cache and
instruction data flow.

This patch adds helper functions to parse the contents of
"ibm,thread-groups" and populate a per-cpu variable to cache
information about siblings of each CPU that share the L1, translation
cache and instruction data-flow.

It also defines a new global variable named "has_big_cores" which
indicates if the cores on this configuration have multiple groups of
threads that share L1 cache.

For each online CPU, it maintains a cpu_smallcore_mask, which
indicates the online siblings which share the L1-cache with it.

Signed-off-by: Gautham R. Shenoy
---
 arch/powerpc/include/asm/cputhreads.h |   2 +
 arch/powerpc/include/asm/smp.h        |  11 ++
 arch/powerpc/kernel/smp.c             | 222 ++
 3 files changed, 235 insertions(+)

diff --git a/arch/powerpc/include/asm/cputhreads.h b/arch/powerpc/include/asm/cputhreads.h
index d71a909..deb99fd 100644
--- a/arch/powerpc/include/asm/cputhreads.h
+++ b/arch/powerpc/include/asm/cputhreads.h
@@ -23,11 +23,13 @@
 extern int threads_per_core;
 extern int threads_per_subcore;
 extern int threads_shift;
+extern bool has_big_cores;
 extern cpumask_t threads_core_mask;
 #else
 #define threads_per_core	1
 #define threads_per_subcore	1
 #define threads_shift		0
+#define has_big_cores		0
 #define threads_core_mask	(*get_cpu_mask(0))
 #endif
 
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 95b66a0..4169574 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -100,6 +100,7 @@ static inline void set_hard_smp_processor_id(int cpu, int phys)
 DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
 DECLARE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
 DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
+DECLARE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
 
 static inline struct cpumask *cpu_sibling_mask(int cpu)
 {
@@ -116,6 +117,11 @@ static inline struct cpumask *cpu_l2_cache_mask(int cpu)
 	return per_cpu(cpu_l2_cache_map, cpu);
 }
 
+static inline struct cpumask *cpu_smallcore_mask(int cpu)
+{
+	return per_cpu(cpu_smallcore_map, cpu);
+}
+
 extern int cpu_to_core_id(int cpu);
 
 /* Since OpenPIC has only 4 IPIs, we use slightly different message numbers.
@@ -166,6 +172,11 @@ static inline const struct cpumask *cpu_sibling_mask(int cpu)
 	return cpumask_of(cpu);
 }
 
+static inline const struct cpumask *cpu_smallcore_mask(int cpu)
+{
+	return cpumask_of(cpu);
+}
+
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 61c1fad..22a14a9 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -74,14 +74,32 @@
 #endif
 
 struct thread_info *secondary_ti;
+bool has_big_cores;
 
 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
+DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
 
 EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
 EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
 EXPORT_PER_CPU_SYMBOL(cpu_core_map);
+EXPORT_SYMBOL_GPL(has_big_cores);
+
+#define MAX_THREAD_LIST_SIZE	8
+#define THREAD_GROUP_SHARE_L1   1
+struct thread_groups {
+	unsigned int property;
+	unsigned int nr_groups;
+	unsigned int threads_per_group;
+	unsigned int thread_list[MAX_THREAD_LIST_SIZE];
+};
+
+/*
+ * On big-cores system, cpu_l1_cache_map for each CPU corresponds to
+ * the set its siblings that share the L1-cache.
+ */
+DEFINE_PER_CPU(cpumask_var_t, cpu_l1_cache_map);
 
 /* SMP operations for this machine */
 struct smp_ops_t *smp_ops;
@@ -674,6 +692,185 @@ static void set_cpus_unrelated(int i, int j,
 }
 #endif
 
+/*
+ * parse_thread_groups: Parses the "ibm,thread-groups" device tree
+ *                      property for the CPU device node @dn and stores
+ *                      the parsed output in the thread_groups
+ *                      structure @tg if the ibm,thread-groups[0]
+ *                      matches @property.
+ *
+ * @dn: The device node of the CPU device.
+ * @tg: Pointer to a thread group structure into which the parsed
+ *      output of "ibm,thread-groups" is stored.
+ * @property: The property of the thread-group that the caller is
+ *            interested in.
+ *
+ * ibm,thread-groups[0..N-1] array defines which group of threads in
+ * the CPU-device node can be grouped together based on the property.
+ *
+ * ibm,thread-groups[0] tells us the property based on which the
+ * threads are being grouped together. If this value is 1, it implies
+ * that the threads in the same group share L1, translation
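The quoted comment is cut off before it describes the rest of the array, so the following user-space sketch rests on an assumption: that the array is laid out as [property, nr_groups, threads_per_group, thread ids...], which is inferred from the fields of struct thread_groups in the patch rather than stated in the text above. parse_thread_groups_array() and group_of_thread() are hypothetical helpers and do not read a real device tree.

```c
#include <stddef.h>

#define MAX_THREAD_LIST_SIZE 8

/* Same fields as struct thread_groups in the quoted patch. */
struct thread_groups {
	unsigned int property;
	unsigned int nr_groups;
	unsigned int threads_per_group;
	unsigned int thread_list[MAX_THREAD_LIST_SIZE];
};

/*
 * Hypothetical parser over an in-memory copy of the property.
 * Assumed layout: prop[0] = property, prop[1] = number of groups,
 * prop[2] = threads per group, prop[3..] = thread ids, group by group.
 * Returns 0 on success, -1 if the array is malformed or prop[0] does
 * not match @property.
 */
static int parse_thread_groups_array(const unsigned int *prop, size_t len,
				     struct thread_groups *tg,
				     unsigned int property)
{
	size_t i, total;

	if (len < 3 || prop[0] != property)
		return -1;

	total = (size_t)prop[1] * prop[2];
	if (total > MAX_THREAD_LIST_SIZE || len < 3 + total)
		return -1;

	tg->property = prop[0];
	tg->nr_groups = prop[1];
	tg->threads_per_group = prop[2];
	for (i = 0; i < total; i++)
		tg->thread_list[i] = prop[3 + i];
	return 0;
}

/* Index of the group that contains @thread, or -1 if it is not listed. */
static int group_of_thread(const struct thread_groups *tg, unsigned int thread)
{
	unsigned int i, total = tg->nr_groups * tg->threads_per_group;

	for (i = 0; i < total; i++)
		if (tg->thread_list[i] == thread)
			return (int)(i / tg->threads_per_group);
	return -1;
}
```

With the interleaved example from the cover letter, the input {1, 2, 4, 0, 2, 4, 6, 1, 3, 5, 7} places thread 4 in group 0 and thread 5 in group 1.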
Re: [PATCH] isdn/hisax: amd7930_fn: Remove unnecessary parentheses
From: Nathan Chancellor
Date: Mon, 8 Oct 2018 15:59:05 -0700

> Clang warns when multiple sets of parentheses are used for a single
> conditional statement.
>
> drivers/isdn/hisax/amd7930_fn.c:628:32: warning: equality comparison
> with extraneous parentheses [-Wparentheses-equality]
>         if ((cs->dc.amd7930.ph_state == 8)) {
>                                      ^~~~
> drivers/isdn/hisax/amd7930_fn.c:628:32: note: remove extraneous
> parentheses around the comparison to silence this warning
>         if ((cs->dc.amd7930.ph_state == 8)) {
>            ~                         ^    ~
> drivers/isdn/hisax/amd7930_fn.c:628:32: note: use '=' to turn this
> equality comparison into an assignment
>         if ((cs->dc.amd7930.ph_state == 8)) {
>                                      ^~
>                                      =
> 1 warning generated.
>
> Signed-off-by: Nathan Chancellor

Applied.
Re: [PATCH net 00/10] rxrpc: Fix packet reception code
From: David Howells
Date: Mon, 08 Oct 2018 23:47:18 +0100

> Here is a set of patches that prepare for and fix problems in rxrpc's
> packet reception code. The serious problems are:
 ...
> The second patch fixes (A) - (C); the third patch renders (B) and (C)
> non-issues by using the encap_rcv hook instead of data_ready - and the
> final patch fixes (D). That last is the most complex.
>
> The preparatory patches are:
 ...
> And then there are three main patches - note that these are mixed in with
> the preparatory patches somewhat:
 ...
> The patches are tagged here:
>
>	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
>	rxrpc-fixes-20181008

Pulled, thanks David.
linux-next: Tree for Oct 11
Hi all,

Changes since 20181010:

The crypto tree gained conflicts against the mac80211-next tree.

Non-merge commits (relative to Linus' tree): 9625
 9096 files changed, 465434 insertions(+), 197078 deletions(-)

I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one. You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source. There are also quilt-import.log and merge.log
files in the Next directory. Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 291 trees (counting Linus' and 66 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next . If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds. And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (b8db9e69dba9 Merge tag 'for-4.19/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm)
Merging fixes/master (72358c0b59b7 linux-next: build warnings from the build of Linus' tree)
Merging kbuild-current/fixes (5318321d367c samples: disable CONFIG_SAMPLES for UML)
Merging arc-current/for-curr (c58a584f05e3 ARC: clone syscall to setp r25 as thread pointer)
Merging arm-current/fixes (3a58ac65e2d7 ARM: 8799/1: mm: fix pci_ioremap_io() offset check)
Merging arm64-fixes/for-next/fixes (2a3f93459d68 arm64: KVM: Sanitize PSTATE.M when being set from userspace)
Merging m68k-current/for-linus (0986b16ab49b m68k/mac: Use correct PMU response format)
Merging powerpc-fixes/fixes (ac1788cc7da4 powerpc/numa: Skip onlining a offline node in kdump path)
Merging sparc/master (ff5d1a42096c sunvdc: Remove VLA usage)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (52b5d6f5dcf0 net: make skb_partial_csum_set() more robust against overflows)
Merging bpf/master (262f9d811c76 bpf: do not blindly change rlimit in reuseport net selftest)
Merging ipsec/master (4da402597c2b xfrm: fix gro_cells leak when remove virtual xfrm interfaces)
Merging netfilter/master (1ad98e9d1bdf tcp/dccp: fix lockdep issue when SYN is backlogged)
Merging ipvs/master (feb9f55c33e5 netfilter: nft_dynset: allow dynamic updates of non-anonymous set)
Merging wireless-drivers/master (3baafeffa48a iwlwifi: 1000: set the TFD queue size)
Merging mac80211/master (8d0be26c781a mac80211_hwsim: fix module init error paths for netlink)
Merging rdma-fixes/for-rc (5c5702e259dc RDMA/core: Set right entry state before releasing reference)
Merging sound-current/for-linus (709ae62e8e6d ALSA: hda/realtek - Cannot adjust speaker's volume on Dell XPS 27 7760)
Merging sound-asoc-fixes/for-linus (296a42942aa3 Merge branch 'asoc-4.19' into asoc-linus)
Merging regmap-fixes/for-linus (7876320f8880 Linux 4.19-rc4)
Merging regulator-fixes/for-linus (0238df646e62 Linux 4.19-rc7)
Merging spi-fixes/for-linus (0238df646e62 Linux 4.19-rc7)
Merging pci-current/for-linus (2edab4df98d9 PCI: Expand the "PF" acronym in Kconfig help text)
Merging driver-core.current/driver-core-linus (7876320f8880 Linux 4.19-rc4)
Merging tty.current/tty-linus (0238df646e62 Linux 4.19-rc7)
Merging usb.current/usb-linus (0238df646e62 Linux 4.19-rc7)
Merging usb-gadget-fixes/fixes (d9707490077b usb: dwc2: Fix call location of dwc2_check_core_endianness)
Merging usb-serial-fixes/usb-linus (0238df646e62 Linux 4.19-rc7)
Merging usb-chipidea-fixes/ci-for-usb-stable (a930d8bd94d8 usb: chipidea: Always build ULPI code)
Merging phy/fixes (5b394b2ddf03 Linux 4.19-rc1)
Merging staging.current/staging-linus (7876320f8880 Linux 4.19-rc4)
Merging char-misc.current/char-misc-linus (0238df646e62 Linux 4.19-rc7)
Merging soundwire-fixes/fixes (8d6ccf5cebbc soundwire: Fix acquiring bus loc
Re: [PATCH] net: aquantia: remove some redundant variable initializations
From: Colin King
Date: Mon, 8 Oct 2018 14:35:58 +0100

> From: Colin Ian King
>
> There are several variables being initialized that are being set later
> and hence the initialization is redundant and can be removed. Remove
> them.
>
> Signed-off-by: Colin Ian King

Applied to net-next.
Re: [PATCH] mm: Speed up mremap on large regions
On Wed, Oct 10, 2018 at 05:46:18PM -0700, Joel Fernandes wrote:
> diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> index 391ed2c3b697..8a33f2044923 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> @@ -192,14 +192,12 @@ static inline pgtable_t pmd_pgtable(pmd_t pmd)
>  	return (pgtable_t)pmd_page_vaddr(pmd);
>  }
>  
> -static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
> -					  unsigned long address)
> +static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
>  {
>  	return (pte_t *)pte_fragment_alloc(mm, address, 1);
>  }

This is obviously broken.

-- 
 Kirill A. Shutemov
Re: [Xen-devel] [PATCH] xen: drop writing error messages to xenstore
On 10/10/2018 18:57, Boris Ostrovsky wrote:
> On 10/10/18 11:53 AM, Juergen Gross wrote:
>> On 10/10/2018 17:09, Joao Martins wrote:
>>> On 10/09/2018 05:09 PM, Juergen Gross wrote:
>>>> xenbus_va_dev_error() will try to write error messages to Xenstore
>>>> under the error//error node (with something like
>>>> "device/vbd/51872"). This will fail normally and another message
>>>> about this failure is added to dmesg.
>>>>
>>>> I believe this is a remnant from very ancient times, as it was added
>>>> in the first pvops rush of commits in 2007.
>>>>
>>>> So remove the additional message when writing to Xenstore failed as
>>>> a minimum step.
>>>>
>>>> Signed-off-by: Juergen Gross
>>>> ---
>>>> I am considering removing the Xenstore write altogether, but I'm not
>>>> sure it isn't needed e.g. by xend based installations. So please
>>>> speak up in case you know why this write is there.
>>> So this:
>>>
>>> "This will fail normally and another message about this failure is added to
>>> dmesg."
>>>
>>> Brings me to the question: What about {stub,driver}domains? Ideally you
>>> shouldn't be looking at domU's dmesg as a control domain no? I can't remember
>>> any other error node, but if something fails e.g. netfront fails to allocate an
>>> unbound event channel - how do you know the cause from the control domain
>>> perspective?
>>>
>>> Irrespective of xend or not: isn't this 'error' node the only one that
>>> propagates error causes per device from domU?
>> What does it help you in dom0 if you have an error message in Xenstore
>> if a frontend driver couldn't do its job? Is there anything you can do
>> as a Xen admin?
>
> The admin may want to know, for example, that a hotplug in the guest failed.

Shouldn't he ask the guest for that? There are dozens of other possible
problems letting hotplug fail which won't write anything to Xenstore.

This might be interesting for development/test purposes, but I really
question it to stay in mature code.

Juergen
[PATCH 0/2] docs: memory-hotplug: add details about locking internals
Hi,

As discussed at [1], the latest updates to memory hotplug documentation
are causing a conflict between docs and mmotm trees. These patches resolve
the conflict.

[1] https://lkml.org/lkml/2018/10/8/227

David Hildenbrand (1):
  docs/core-api: memory-hotplug: add some details about locking internals

Mike Rapoport (1):
  docs/core-api: rename memory-hotplug-notifier to memory-hotplug

 Documentation/core-api/index.rst                   |   2 +-
 Documentation/core-api/memory-hotplug-notifier.rst |  84 --
 Documentation/core-api/memory-hotplug.rst          | 125 +
 3 files changed, 126 insertions(+), 85 deletions(-)
 delete mode 100644 Documentation/core-api/memory-hotplug-notifier.rst
 create mode 100644 Documentation/core-api/memory-hotplug.rst

-- 
2.7.4
[PATCH 1/2] docs/core-api: rename memory-hotplug-notifier to memory-hotplug
From: Mike Rapoport

to allow additions of new documentation about memory hotplug under the same
roof.

Signed-off-by: Mike Rapoport
---
 Documentation/core-api/index.rst                   |  2 +-
 Documentation/core-api/memory-hotplug-notifier.rst | 84 -
 Documentation/core-api/memory-hotplug.rst          | 87 ++
 3 files changed, 88 insertions(+), 85 deletions(-)
 delete mode 100644 Documentation/core-api/memory-hotplug-notifier.rst
 create mode 100644 Documentation/core-api/memory-hotplug.rst

diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index 4f8a426..29c790f 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -32,7 +32,7 @@ Core utilities
    gfp_mask-from-fs-io
    timekeeping
    boot-time-mm
-   memory-hotplug-notifier
+   memory-hotplug

 Interfaces for kernel debugging

diff --git a/Documentation/core-api/memory-hotplug-notifier.rst b/Documentation/core-api/memory-hotplug-notifier.rst
deleted file mode 100644
index 35347cc..000
--- a/Documentation/core-api/memory-hotplug-notifier.rst
+++ /dev/null
@@ -1,84 +0,0 @@
-.. _memory_hotplug_notifier:
-
-=============================
-Memory hotplug event notifier
-=============================
-
-Hotplugging events are sent to a notification queue.
-
-There are six types of notification defined in ``include/linux/memory.h``:
-
-MEM_GOING_ONLINE
-  Generated before new memory becomes available in order to be able to
-  prepare subsystems to handle memory. The page allocator is still unable
-  to allocate from the new memory.
-
-MEM_CANCEL_ONLINE
-  Generated if MEM_GOING_ONLINE fails.
-
-MEM_ONLINE
-  Generated when memory has successfully brought online. The callback may
-  allocate pages from the new memory.
-
-MEM_GOING_OFFLINE
-  Generated to begin the process of offlining memory. Allocations are no
-  longer possible from the memory but some of the memory to be offlined
-  is still in use. The callback can be used to free memory known to a
-  subsystem from the indicated memory block.
-
-MEM_CANCEL_OFFLINE
-  Generated if MEM_GOING_OFFLINE fails. Memory is available again from
-  the memory block that we attempted to offline.
-
-MEM_OFFLINE
-  Generated after offlining memory is complete.
-
-A callback routine can be registered by calling::
-
-  hotplug_memory_notifier(callback_func, priority)
-
-Callback functions with higher values of priority are called before callback
-functions with lower values.
-
-A callback function must have the following prototype::
-
-  int callback_func(
-    struct notifier_block *self, unsigned long action, void *arg);
-
-The first argument of the callback function (self) is a pointer to the block
-of the notifier chain that points to the callback function itself.
-The second argument (action) is one of the event types described above.
-The third argument (arg) passes a pointer of struct memory_notify::
-
-	struct memory_notify {
-		unsigned long start_pfn;
-		unsigned long nr_pages;
-		int status_change_nid_normal;
-		int status_change_nid_high;
-		int status_change_nid;
-	}
-
-- start_pfn is start_pfn of online/offline memory.
-- nr_pages is # of pages of online/offline memory.
-- status_change_nid_normal is set node id when N_NORMAL_MEMORY of nodemask
-  is (will be) set/clear, if this is -1, then nodemask status is not changed.
-- status_change_nid_high is set node id when N_HIGH_MEMORY of nodemask
-  is (will be) set/clear, if this is -1, then nodemask status is not changed.
-- status_change_nid is set node id when N_MEMORY of nodemask is (will be)
-  set/clear. It means a new(memoryless) node gets new memory by online and a
-  node loses all memory. If this is -1, then nodemask status is not changed.
-
-  If status_changed_nid* >= 0, callback should create/discard structures for the
-  node if necessary.
-
-The callback routine shall return one of the values
-NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD, NOTIFY_STOP
-defined in ``include/linux/notifier.h``
-
-NOTIFY_DONE and NOTIFY_OK have no effect on the further processing.
-
-NOTIFY_BAD is used as response to the MEM_GOING_ONLINE, MEM_GOING_OFFLINE,
-MEM_ONLINE, or MEM_OFFLINE action to cancel hotplugging. It stops
-further processing of the notification queue.
-
-NOTIFY_STOP stops further processing of the notification queue.
diff --git a/Documentation/core-api/memory-hotplug.rst b/Documentation/core-api/memory-hotplug.rst
new file mode 100644
index 000..a99f2f2
--- /dev/null
+++ b/Documentation/core-api/memory-hotplug.rst
@@ -0,0 +1,87 @@
+.. _memory_hotplug:
+
+==============
+Memory hotplug
+==============
+
+Memory hotplug event notifier
+=============================
+
+Hotplugging events are sent to a notification queue.
+
+There are six types of notification defined in ``include/linux/memory.h``:
+
+MEM_GOING_ONLINE
+  Generated before new memory becomes
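
The ordering and return-value rules in the notifier text of this patch (higher priority first, NOTIFY_BAD and NOTIFY_STOP end processing of the queue) can be illustrated with a small user-space model. This is only a sketch in Python, not kernel code: `NotifierChain`, `register` and `call_chain` are hypothetical names, and the return codes are symbolic placeholders rather than the values from ``include/linux/notifier.h``.

```python
# User-space model of the notifier rules quoted in the patch; not kernel code.
# The constants are symbolic placeholders, not the kernel's numeric values.
NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD, NOTIFY_STOP = "done", "ok", "bad", "stop"


class NotifierChain:
    """Hypothetical stand-in for the memory hotplug notification queue."""

    def __init__(self):
        self._entries = []  # list of (priority, callback)

    def register(self, callback, priority):
        self._entries.append((priority, callback))
        # Callbacks with higher priority values are called first.
        self._entries.sort(key=lambda entry: -entry[0])

    def call_chain(self, action, arg):
        """Run callbacks in priority order; return (last_result, names_called)."""
        called = []
        result = NOTIFY_DONE
        for _priority, callback in self._entries:
            result = callback(action, arg)
            called.append(callback.__name__)
            if result in (NOTIFY_BAD, NOTIFY_STOP):
                break  # both values stop further processing of the queue
        return result, called


def low_prio(action, arg):
    return NOTIFY_OK


def vetoing(action, arg):
    # Model a subsystem that refuses to let its memory go offline.
    return NOTIFY_BAD if action == "MEM_GOING_OFFLINE" else NOTIFY_OK


chain = NotifierChain()
chain.register(low_prio, priority=0)
chain.register(vetoing, priority=10)
```

In this model a NOTIFY_BAD from the higher-priority callback means the lower-priority one never sees the MEM_GOING_OFFLINE event, which matches the "stops further processing" wording above.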
[PATCH 2/2] docs/core-api: memory-hotplug: add some details about locking internals
From: David Hildenbrand

Let's document the magic a bit, especially why device_hotplug_lock is
required when adding/removing memory and how it all plays together with
requests to online/offline memory from user space.

[ rppt: moved the text to Documentation/core-api/memory-hotplug.rst ]

Link: http://lkml.kernel.org/r/20180925091457.28651-7-da...@redhat.com
Signed-off-by: David Hildenbrand
Reviewed-by: Pavel Tatashin
Reviewed-by: Rashmica Gupta
Cc: Jonathan Corbet
Cc: Michal Hocko
Cc: Balbir Singh
Cc: Benjamin Herrenschmidt
Cc: Boris Ostrovsky
Cc: Dan Williams
Cc: Greg Kroah-Hartman
Cc: Haiyang Zhang
Cc: Heiko Carstens
Cc: John Allen
Cc: Joonsoo Kim
Cc: Juergen Gross
Cc: Kate Stewart
Cc: "K. Y. Srinivasan"
Cc: Len Brown
Cc: Martin Schwidefsky
Cc: Mathieu Malaterre
Cc: Michael Ellerman
Cc: Michael Neuling
Cc: Nathan Fontenot
Cc: Oscar Salvador
Cc: Paul Mackerras
Cc: Philippe Ombredanne
Cc: Rafael J. Wysocki
Cc: "Rafael J. Wysocki"
Cc: Stephen Hemminger
Cc: Thomas Gleixner
Cc: Vlastimil Babka
Cc: YASUAKI ISHIMATSU
Signed-off-by: Andrew Morton
Signed-off-by: Mike Rapoport
---
 Documentation/core-api/memory-hotplug.rst | 38 +++
 1 file changed, 38 insertions(+)

diff --git a/Documentation/core-api/memory-hotplug.rst b/Documentation/core-api/memory-hotplug.rst
index a99f2f2..de7467e 100644
--- a/Documentation/core-api/memory-hotplug.rst
+++ b/Documentation/core-api/memory-hotplug.rst
@@ -85,3 +85,41 @@ MEM_ONLINE, or MEM_OFFLINE action to cancel hotplugging. It stops
 further processing of the notification queue.

 NOTIFY_STOP stops further processing of the notification queue.
+
+Locking Internals
+=================
+
+When adding/removing memory that uses memory block devices (i.e. ordinary RAM),
+the device_hotplug_lock should be held to:
+
+- synchronize against online/offline requests (e.g. via sysfs). This way, memory
+  block devices can only be accessed (.online/.state attributes) by user
+  space once memory has been fully added. And when removing memory, we
+  know nobody is in critical sections.
+- synchronize against CPU hotplug and similar (e.g. relevant for ACPI and PPC)
+
+Especially, there is a possible lock inversion that is avoided using
+device_hotplug_lock when adding memory and user space tries to online that
+memory faster than expected:
+
+- device_online() will first take the device_lock(), followed by
+  mem_hotplug_lock
+- add_memory_resource() will first take the mem_hotplug_lock, followed by
+  the device_lock() (while creating the devices, during bus_add_device()).
+
+As the device is visible to user space before taking the device_lock(), this
+can result in a lock inversion.
+
+onlining/offlining of memory should be done via device_online()/
+device_offline() - to make sure it is properly synchronized to actions
+via sysfs. Holding device_hotplug_lock is advised (to e.g. protect online_type)
+
+When adding/removing/onlining/offlining memory or adding/removing
+heterogeneous/device memory, we should always hold the mem_hotplug_lock in
+write mode to serialise memory hotplug (e.g. access to global/zone
+variables).
+
+In addition, mem_hotplug_lock (in contrast to device_hotplug_lock) in read
+mode allows for a quite efficient get_online_mems/put_online_mems
+implementation, so code accessing memory can protect from that memory
+vanishing.
--
2.7.4
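
The lock inversion described in this patch can be checked mechanically with a small user-space model. The sketch below is Python, not kernel code: each code path is reduced to the ordered list of locks it takes, and `inverted_pairs` and `can_deadlock` are hypothetical helper names. It assumes that both the sysfs online path and the add-memory path take device_hotplug_lock first when the rule in the text is followed.

```python
from itertools import combinations


def inverted_pairs(path_a, path_b):
    """Return lock pairs that the two paths acquire in opposite order."""
    pos_b = {lock: i for i, lock in enumerate(path_b)}
    shared = [lock for lock in path_a if lock in pos_b]
    pairs = []
    # combinations() keeps path_a order, so x is taken before y in path_a.
    for x, y in combinations(shared, 2):
        if pos_b[x] > pos_b[y]:
            pairs.append((x, y))
    return pairs


def can_deadlock(path_a, path_b):
    """True if an inverted pair is not guarded by a common outer lock."""
    pos_a = {lock: i for i, lock in enumerate(path_a)}
    pos_b = {lock: i for i, lock in enumerate(path_b)}
    for x, y in inverted_pairs(path_a, path_b):
        # Locks each path already holds before touching either x or y.
        held_a = set(path_a[:min(pos_a[x], pos_a[y])])
        held_b = set(path_b[:min(pos_b[x], pos_b[y])])
        if not (held_a & held_b):
            return True
    return False


# Lock orders as described in the patch text.
ONLINE_PATH = ["device_lock", "mem_hotplug_lock"]   # device_online()
ADD_PATH = ["mem_hotplug_lock", "device_lock"]      # add_memory_resource()
GUARD = "device_hotplug_lock"
```

Calling `can_deadlock(ONLINE_PATH, ADD_PATH)` reports the inversion, while prefixing both paths with `GUARD` makes the same inversion harmless in the model, because the two paths can no longer interleave.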
Re: [PATCH v3 3/3] iio: magnetometer: Add driver support for PNI RM3100
On 2018-10-07 23:44, Jonathan Cameron wrote:

On Tue, 2 Oct 2018 22:38:12 +0800 Song Qiang wrote:

PNI RM3100 is a high resolution, large signal immunity magnetometer,
composed of 3 single sensors and a processing chip with a MagI2C
interface.

Following functions are available:
 - Single-shot measurement from
   /sys/bus/iio/devices/iio:deviceX/in_magn_{axis}_raw
 - Triggered buffer measurement.
 - Both i2c and spi interface are supported.
 - Both interrupt and polling measurement is supported, depends on if
   the 'interrupts' in DT is declared.

Signed-off-by: Song Qiang

I realise now that I should have read the datasheet properly. Sorry about
that. What we have here is a hybrid of polled and continuous measurement.

If you are using the dataready as a trigger it is fine to support
continuous measurement, but you aren't doing that here.

The single shot measurement should be done with the method described in
the datasheet where you write a POLL command and then wait for the single
interrupt. There is no problem with racing and that interrupt is a high
level one and can be handled as such.

We should not do it by waiting for the next continuous measurement to
happen after clearing the status register, which is what I think is
happening here.

If you want to use it in continuous mode, you should provide a trigger.
That trigger will be fired by the dataready signal and the discussion I
put in the earlier reply becomes relevant.

Doing both of these options requires the interrupt handler to know which
mode you are in, but that is straightforward to implement and is done in
a number of other drivers.

Sorry again that I failed to identify this issue earlier. Thanks to Phil
as his question on the interrupt type got me thinking about how you were
handling the interrupts.

Jonathan

Hi Jonathan,

I learned the way of handling single shot from the driver of hmc5843,
seems like it needs changing, too.

There were some problems with my computer. Lenovo updates told me to
update BIOS and it went dead. I didn't write any code the past few days,
just got it fixed today.

yours,
Song Qiang

---
 MAINTAINERS                            |   7 +
 drivers/iio/magnetometer/Kconfig       |  29 ++
 drivers/iio/magnetometer/Makefile      |   4 +
 drivers/iio/magnetometer/rm3100-core.c | 539 +
 drivers/iio/magnetometer/rm3100-i2c.c  |  58 +++
 drivers/iio/magnetometer/rm3100-spi.c  |  64 +++
 drivers/iio/magnetometer/rm3100.h      |  17 +
 7 files changed, 718 insertions(+)
 create mode 100644 drivers/iio/magnetometer/rm3100-core.c
 create mode 100644 drivers/iio/magnetometer/rm3100-i2c.c
 create mode 100644 drivers/iio/magnetometer/rm3100-spi.c
 create mode 100644 drivers/iio/magnetometer/rm3100.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 967ce8cdd1cc..14eeeb072403 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11393,6 +11393,13 @@ M:	"Rafael J. Wysocki"
 S:	Maintained
 F:	drivers/pnp/

+PNI RM3100 IIO DRIVER
+M:	Song Qiang
+L:	linux-...@vger.kernel.org
+S:	Maintained
+F:	drivers/iio/magnetometer/rm3100*
+F:	Documentation/devicetree/bindings/iio/magnetometer/pni,rm3100.txt
+
 POSIX CLOCKS and TIMERS
 M:	Thomas Gleixner
 L:	linux-kernel@vger.kernel.org
diff --git a/drivers/iio/magnetometer/Kconfig b/drivers/iio/magnetometer/Kconfig
index ed9d776d01af..8a63cbbca4b7 100644
--- a/drivers/iio/magnetometer/Kconfig
+++ b/drivers/iio/magnetometer/Kconfig
@@ -175,4 +175,33 @@ config SENSORS_HMC5843_SPI
 	  - hmc5843_core (core functions)
 	  - hmc5843_spi (support for HMC5983)

+config SENSORS_RM3100
+	tristate
+	select IIO_BUFFER
+	select IIO_TRIGGERED_BUFFER
+
+config SENSORS_RM3100_I2C
+	tristate "PNI RM3100 3-Axis Magnetometer (I2C)"
+	depends on I2C
+	select SENSORS_RM3100
+	select REGMAP_I2C
+	help
+	  Say Y here to add support for the PNI RM3100 3-Axis Magnetometer.
+
+	  This driver can also be compiled as a module.
+	  To compile this driver as a module, choose M here: the module
+	  will be called rm3100-i2c.
+
+config SENSORS_RM3100_SPI
+	tristate "PNI RM3100 3-Axis Magnetometer (SPI)"
+	depends on SPI_MASTER
+	select SENSORS_RM3100
+	select REGMAP_SPI
+	help
+	  Say Y here to add support for the PNI RM3100 3-Axis Magnetometer.
+
+	  This driver can also be compiled as a module.
+	  To compile this driver as a module, choose M here: the module
+	  will be called rm3100-spi.
+
 endmenu
diff --git a/drivers/iio/magnetometer/Makefile b/drivers/iio/magnetometer/Makefile
index 664b2f866472..ba1bc34b82fa 100644
--- a/drivers/iio/magnetometer/Makefile
+++ b/drivers/iio/magnetometer/Makefile
@@ -24,3 +24,7 @@ obj-$(CONFIG_IIO_ST_MAGN_SPI_3AXIS) += st_magn_spi.o
 obj-$(CONFIG_SENSORS_HMC5843)		+= hmc5843_core.o
 obj-$(CONFIG_SENSORS_HMC5843_I2C)	+= hmc5843_i2c.o
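
Jonathan's point in the review, that the interrupt handler has to know whether a single-shot read or a continuous (triggered) capture is in progress, can be sketched as a tiny state machine. This is a user-space Python model under stated assumptions, not driver code: `Rm3100Model` and all of its attributes are hypothetical, `measurement_done` stands in for whatever a real driver would use to wake the waiting reader, and `trigger_fired` stands in for firing an IIO trigger.

```python
from enum import Enum


class Mode(Enum):
    SINGLE_SHOT = 1
    CONTINUOUS = 2


class Rm3100Model:
    """Hypothetical model of the two measurement modes discussed in the review."""

    def __init__(self):
        self.mode = Mode.SINGLE_SHOT
        self.measurement_done = False  # stands in for waking a waiting reader
        self.trigger_fired = 0         # stands in for firing a data-ready trigger
        self.commands = []             # commands "written" to the device

    def read_raw(self):
        # Single shot: request exactly one measurement, then wait for one IRQ.
        self.mode = Mode.SINGLE_SHOT
        self.measurement_done = False
        self.commands.append("POLL")

    def enable_buffer(self):
        # Continuous mode: every data-ready interrupt should fire the trigger.
        self.mode = Mode.CONTINUOUS

    def data_ready_irq(self):
        # The handler dispatches on the current mode.
        if self.mode is Mode.SINGLE_SHOT:
            self.measurement_done = True
        else:
            self.trigger_fired += 1
```

The design choice illustrated here is that one handler serves both paths and the mode flag decides who consumes the data-ready event, so a single-shot read never depends on the timing of the continuous measurement cycle.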
Re: [RFC PATCH 2/2] net/ncsi: Configure multi-package, multi-channel modes with failover
On Wed, 2018-10-10 at 22:36 +, justin.l...@dell.com wrote:
> Hi Samuel,
>
> I am still testing your change and have some comments below.
>
> Thanks,
> Justin
>
>
> > This patch extends the ncsi-netlink interface with two new commands and
> > three new attributes to configure multiple packages and/or channels at
> > once, and configure specific failover modes.
> >
> > NCSI_CMD_SET_PACKAGE_MASK and NCSI_CMD_SET_CHANNEL_MASK set a whitelist
> > of packages or channels allowed to be configured with the
> > NCSI_ATTR_PACKAGE_MASK and NCSI_ATTR_CHANNEL_MASK attributes
> > respectively. If one of these whitelists is set only packages or
> > channels matching the whitelist are considered for the channel queue in
> > ncsi_choose_active_channel().
> >
> > These commands may also use the NCSI_ATTR_MULTI_FLAG to signal that
> > multiple packages or channels may be configured simultaneously. NCSI
> > hardware arbitration (HWA) must be available in order to enable
> > multi-package mode. Multi-channel mode is always available.
> >
> > If the NCSI_ATTR_CHANNEL_ID attribute is present in the
> > NCSI_CMD_SET_CHANNEL_MASK command then it sets the preferred channel as
> > with the NCSI_CMD_SET_INTERFACE command. The combination of preferred
> > channel and channel whitelist defines a primary channel and the allowed
> > failover channels.
> > If the NCSI_ATTR_MULTI_FLAG attribute is also present then the preferred
> > channel is configured for Tx/Rx and the other channels are enabled only
> > for Rx.
> >
> > Signed-off-by: Samuel Mendoza-Jonas
> > ---
> >  include/uapi/linux/ncsi.h |  16 +++
> >  net/ncsi/internal.h       |  11 +-
> >  net/ncsi/ncsi-aen.c       |   2 +-
> >  net/ncsi/ncsi-manage.c    | 138
> >  net/ncsi/ncsi-netlink.c   | 217 +-
> >  net/ncsi/ncsi-rsp.c       |   2 +-
> >  6 files changed, 312 insertions(+), 74 deletions(-)
> >
> > diff --git a/include/uapi/linux/ncsi.h b/include/uapi/linux/ncsi.h
> > index 4c292ecbb748..035fba1693f9 100644
> > --- a/include/uapi/linux/ncsi.h
> > +++ b/include/uapi/linux/ncsi.h
> > @@ -23,6 +23,13 @@
> >   *	optionally the preferred NCSI_ATTR_CHANNEL_ID.
> >   * @NCSI_CMD_CLEAR_INTERFACE: clear any preferred package/channel
> > combination.
> >   *	Requires NCSI_ATTR_IFINDEX.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > + *	Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > + *	Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
> > + *	NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
> > + *	the primary channel.
> >   * @NCSI_CMD_MAX: highest command number
> >   */
>
> There are some typos in the description.
>  * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
>  *	Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
>  * @NCSI_CMD_SET_CHANNEL_MASK: set a whitelist of allowed channels.
>  *	Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
>  *	NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
>  *	the primary channel.

Haha, yes I threw that in at the end, thanks for catching it.

>
> >  enum ncsi_nl_commands {
> > @@ -30,6 +37,8 @@ enum ncsi_nl_commands {
> >  	NCSI_CMD_PKG_INFO,
> >  	NCSI_CMD_SET_INTERFACE,
> >  	NCSI_CMD_CLEAR_INTERFACE,
> > +	NCSI_CMD_SET_PACKAGE_MASK,
> > +	NCSI_CMD_SET_CHANNEL_MASK,
> >
> >  	__NCSI_CMD_AFTER_LAST,
> >  	NCSI_CMD_MAX = __NCSI_CMD_AFTER_LAST - 1
> > @@ -43,6 +52,10 @@ enum ncsi_nl_commands {
> >   * @NCSI_ATTR_PACKAGE_LIST: nested array of NCSI_PKG_ATTR attributes
> >   * @NCSI_ATTR_PACKAGE_ID: package ID
> >   * @NCSI_ATTR_CHANNEL_ID: channel ID
> > + * @NCSI_ATTR_MULTI_FLAG: flag to signal that multi-mode should be enabled
> > with
> > + *	NCSI_CMD_SET_PACKAGE_MASK or NCSI_CMD_SET_CHANNEL_MASK.
> > + * @NCSI_ATTR_PACKAGE_MASK: 32-bit mask of allowed packages.
> > + * @NCSI_ATTR_CHANNEL_MASK: 32-bit mask of allowed channels.
> >   * @NCSI_ATTR_MAX: highest attribute number
> >   */
> >  enum ncsi_nl_attrs {
> > @@ -51,6 +64,9 @@ enum ncsi_nl_attrs {
> >  	NCSI_ATTR_PACKAGE_LIST,
> >  	NCSI_ATTR_PACKAGE_ID,
> >  	NCSI_ATTR_CHANNEL_ID,
> > +	NCSI_ATTR_MULTI_FLAG,
> > +	NCSI_ATTR_PACKAGE_MASK,
> > +	NCSI_ATTR_CHANNEL_MASK,
>
> Is there a case that we might set these two masks at the same time?
> If not, maybe we can just have one generic MASK attribute.
>

I thought of this too: not yet, but I wonder if we might in the future.
I'll have a think about it.

>
> >
> >  	__NCSI_ATTR_AFTER_LAST,
> >  	NCSI_ATTR_MAX = __NCSI_ATTR_AFTER_LAST - 1
> > diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
> > index 3d0a33b874f5..8437474d0a78 100644
> > --- a/net/ncsi/internal.h
> > +++ b/net/ncsi/internal.h
> > @@ -213,6 +213,10 @@ struct ncsi_package {
> >  	unsigned int
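
The whitelist and failover rules from the quoted commit message can be illustrated with a short user-space model. This is a Python sketch, not the kernel implementation: `channels_in_mask` and `plan_channels` are hypothetical helpers, the role strings are invented labels, and two things are assumptions rather than facts from the patch: that bit N of the 32-bit mask corresponds to channel ID N, and that a preferred channel outside the whitelist is rejected.

```python
def channels_in_mask(mask, width=32):
    """Return the channel IDs whose bit is set (assumes bit N == channel N)."""
    return [ch for ch in range(width) if mask & (1 << ch)]


def plan_channels(channel_mask, preferred, multi):
    """Map each whitelisted channel to a role, following the commit message.

    preferred: channel ID given via NCSI_ATTR_CHANNEL_ID (modelled as required)
    multi:     True when NCSI_ATTR_MULTI_FLAG is present
    """
    allowed = channels_in_mask(channel_mask)
    if preferred not in allowed:
        # Assumption of this model; the patch text does not cover this case.
        raise ValueError("preferred channel is not in the whitelist")
    roles = {}
    for ch in allowed:
        if ch == preferred:
            roles[ch] = "tx/rx"      # primary channel
        elif multi:
            roles[ch] = "rx-only"    # multi-channel mode: others receive only
        else:
            roles[ch] = "failover"   # allowed failover candidates
    return roles
```

For example, `plan_channels(0b0111, preferred=1, multi=True)` marks channel 1 as the Tx/Rx channel and channels 0 and 2 as Rx-only, while the same call with `multi=False` leaves channels 0 and 2 as failover candidates.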
Re: [PATCH] kbuild: Fail the build early when no lz4 present
Hi Borislav,

On Thu, Oct 11, 2018 at 7:23 AM Borislav Petkov wrote:
>
> From: Borislav Petkov
>
> When building randconfigs, the build fails at kernel compression stage
> due to missing lz4 on the system but CONFIG_KERNEL_LZ4 has been selected
> by randconfig. The result looks something like this:
>
> (cat arch/x86/boot/compressed/vmlinux.bin
> arch/x86/boot/compressed/vmlinux.relocs | lz4c -l -c1 stdin stdout && printf
> \334\141\301\001) > arch/x86/boot/compressed/vmlinux.bin.lz4 || (rm -f
> arch/x86/boot/compressed/vmlinux.bin.lz4 ; false)
> /bin/sh: 1: lz4c: not found

So, the cause of the failure is clear enough from the build log.

It is weird to check only lz4c. If CONFIG_KERNEL_LZO is enabled, but
lzop is not installed, I see this log:

  LZO     arch/x86/boot/compressed/vmlinux.bin.lzo
/bin/sh: 1: lzop: not found

It is still clear what to do, though.

> make[2]: *** [arch/x86/boot/compressed/Makefile:143:
> arch/x86/boot/compressed/vmlinux.bin.lz4] Error 1
> make[1]: *** [arch/x86/boot/Makefile:112: arch/x86/boot/compressed/vmlinux]
> Error 2
> make: *** [arch/x86/Makefile:290: bzImage] Error 2
>
> Fail the build much earlier by checking for lz4c presence before doing
> anything else.

Is it necessary to check this earlier?

If you get this error, you just need to install the tool.
Then, you can re-run the incremental build.

BTW, this patch has a drawback.

[1] Enable CONFIG_KERNEL_LZ4 on the system without lz4c installed

[2] Run 'make' and you will get the error
    "lz4 tool not found on this system but CONFIG_KERNEL_LZ4 enabled"

[3] Run 'make menuconfig' and switch from CONFIG_KERNEL_LZ4 to
    CONFIG_KERNEL_GZIP

[4] Run 'make' and you will still get the same error
    even after you have chosen to use GZIP instead of LZ4.

> Signed-off-by: Borislav Petkov
> Cc: Masahiro Yamada
> Cc: Michal Marek
> Cc: linux-kbu...@vger.kernel.org
> ---
> Makefile | 6 ++
> 1 file changed, 6 insertions(+)
>
> diff --git a/Makefile b/Makefile
> index a0c32de80482..f91de649234b 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -788,6 +788,12 @@ KBUILD_CFLAGS_KERNEL += -ffunction-sections
> -fdata-sections
>  LDFLAGS_vmlinux += --gc-sections
>  endif
>
> +ifdef CONFIG_KERNEL_LZ4
> +ifeq ($(CONFIG_SHELL which lz4c),)
> +$(error "lz4 tool not found on this system but CONFIG_KERNEL_LZ4 enabled")
> +endif
> +endif
> +
>  # arch Makefile may override CC so keep this after arch Makefile is included
>  NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
>
> --
> 2.19.0.271.gfe8321ec057f
>

--
Best Regards
Masahiro Yamada
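For illustration only, the point about lz4c not being the only tool that can be missing could be sketched as a small shell helper that reads the configuration each time it runs, so switching from LZ4 to GZIP takes effect on the next check. This is not part of kbuild and was not proposed in the thread: the function names compressor_for_config and check_compressor are hypothetical, and the gzip, bzip2 and xz tool names are assumptions (only lz4c and lzop appear in the logs above).

```shell
#!/bin/sh
# Hypothetical helper, not part of kbuild.

# compressor_for_config CONFIG_FILE
# Print the tool assumed to be needed by the first CONFIG_KERNEL_<ALGO>=y
# line in CONFIG_FILE; return 1 if no known option is selected.
compressor_for_config() {
    while IFS= read -r line; do
        case "$line" in
            CONFIG_KERNEL_LZ4=y)   echo lz4c;  return 0 ;;
            CONFIG_KERNEL_LZO=y)   echo lzop;  return 0 ;;
            CONFIG_KERNEL_GZIP=y)  echo gzip;  return 0 ;;  # assumed tool name
            CONFIG_KERNEL_BZIP2=y) echo bzip2; return 0 ;;  # assumed tool name
            CONFIG_KERNEL_XZ=y)    echo xz;    return 0 ;;  # assumed tool name
        esac
    done < "$1"
    return 1
}

# check_compressor CONFIG_FILE
# Succeed if the selected compressor is installed or nothing is selected;
# otherwise print a message on stderr and fail.
check_compressor() {
    tool=$(compressor_for_config "$1") || return 0
    if command -v "$tool" >/dev/null 2>&1; then
        return 0
    fi
    echo "$tool not found on this system but selected in $1" >&2
    return 1
}
```

A caller would run something like `check_compressor .config` before starting the build; because the file is re-read on every call, a stale selection cannot trigger the error.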
[PATCH v3 7/7] ia64: wire up system calls
Wire up the perf_event_open, seccomp, pkey_mprotect, pkey_alloc,
pkey_free, statx, io_pgetevents and rseq system calls.

These require an architecture-specific implementation, as none is
present now.

Signed-off-by: Firoz Khan
---
 arch/ia64/kernel/syscalls/syscall.tbl | 16
 1 file changed, 16 insertions(+)

diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index 6b64f60..1f42b60 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -335,3 +335,19 @@
 323 common copy_file_range sys_copy_file_range
 324 common preadv2         sys_preadv2
 325 common pwritev2        sys_pwritev2
+# perf_event_open requires an architecture specific implementation
+326 common perf_event_open sys_perf_event_open
+# seccomp requires an architecture specific implementation
+327 common seccomp         sys_seccomp
+# pkey_mprotect requires an architecture specific implementation
+328 common pkey_mprotect   sys_pkey_mprotect
+# pkey_alloc requires an architecture specific implementation
+329 common pkey_alloc      sys_pkey_alloc
+# pkey_free requires an architecture specific implementation
+330 common pkey_free       sys_pkey_free
+# statx requires an architecture specific implementation
+331 common statx           sys_statx
+# io_pgetevents requires an architecture specific implementation
+332 common io_pgetevents   sys_io_pgetevents
+# rseq requires an architecture specific implementation
+333 common rseq            sys_rseq
--
1.9.1
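As a rough illustration of how a row in syscall.tbl maps to the number user space sees, the sketch below looks up a name in a table file and adds the ia64 base of 1024 (the value of __NR_Linux in this series). The function name syscall_nr is hypothetical and is not one of the scripts in the series.

```shell
#!/bin/sh
# Hypothetical lookup helper, not part of the patch series.

# syscall_nr TABLE NAME
# Print the absolute ia64 system call number for NAME, i.e. the table
# number plus 1024. Prints nothing if NAME is not in TABLE.
syscall_nr() {
    # Only rows starting with a digit are table entries; '#' lines are comments.
    grep -E '^[0-9]' "$1" | while read -r nr abi name entry; do
        if [ "$name" = "$2" ]; then
            echo $((nr + 1024))
        fi
    done
}
```

For example, with the rows added above, `syscall_nr syscall.tbl rseq` would print 1357 (333 + 1024).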
[PATCH v3 6/7] ia64: uapi header and system call table file generation
The system call table generation script must be run to generate the
unistd_64.h and syscall_table.h files. This patch adds the changes that
invoke the script.

With this patch, unistd_64.h and syscall_table.h are generated by the
syscall table generation script, which is invoked from
arch/ia64/Makefile. The generated files are identical to the content
removed by this patch.

The generated uapi header file is included in uapi/asm/unistd.h, and
the generated system call table support file is included by
ia64/kernel/syscall_table.S.

Signed-off-by: Firoz Khan
---
 arch/ia64/Makefile                  |   3 +
 arch/ia64/include/asm/Kbuild        |   1 +
 arch/ia64/include/uapi/asm/Kbuild   |   1 +
 arch/ia64/include/uapi/asm/unistd.h | 332 +---
 arch/ia64/kernel/syscall_table.S    | 331 +--
 5 files changed, 9 insertions(+), 659 deletions(-)

diff --git a/arch/ia64/Makefile b/arch/ia64/Makefile
index 45f5980..320d86f 100644
--- a/arch/ia64/Makefile
+++ b/arch/ia64/Makefile
@@ -80,6 +80,9 @@ unwcheck: vmlinux
 archclean:
 	$(Q)$(MAKE) $(clean)=$(boot)
 
+archheaders:
+	$(Q)$(MAKE) $(build)=arch/ia64/kernel/syscalls all
+
 CLEAN_FILES += vmlinux.gz bootloader
 
 boot: lib/lib.a vmlinux
diff --git a/arch/ia64/include/asm/Kbuild b/arch/ia64/include/asm/Kbuild
index 557bbc8..5b17695 100644
--- a/arch/ia64/include/asm/Kbuild
+++ b/arch/ia64/include/asm/Kbuild
@@ -7,3 +7,4 @@ generic-y += preempt.h
 generic-y += trace_clock.h
 generic-y += vtime.h
 generic-y += word-at-a-time.h
+generic-y += syscall_table.h
diff --git a/arch/ia64/include/uapi/asm/Kbuild b/arch/ia64/include/uapi/asm/Kbuild
index 3982e67..5c30543 100644
--- a/arch/ia64/include/uapi/asm/Kbuild
+++ b/arch/ia64/include/uapi/asm/Kbuild
@@ -8,3 +8,4 @@ generic-y += msgbuf.h
 generic-y += poll.h
 generic-y += sembuf.h
 generic-y += shmbuf.h
+generic-y += unistd_64.h
diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h
index bd2575f..286349b 100644
--- a/arch/ia64/include/uapi/asm/unistd.h
+++ b/arch/ia64/include/uapi/asm/unistd.h
@@ -13,336 +13,6 @@
 #define __BREAK_SYSCALL         __IA64_BREAK_SYSCALL
 #define __NR_Linux              1024
-#define __NR_ni_syscall         (__NR_Linux + 0)
-#define __NR_exit               (__NR_Linux + 1)
-#define __NR_read               (__NR_Linux + 2)
-#define __NR_write              (__NR_Linux + 3)
-#define __NR_open               (__NR_Linux + 4)
-#define __NR_close              (__NR_Linux + 5)
-#define __NR_creat              (__NR_Linux + 6)
-#define __NR_link               (__NR_Linux + 7)
-#define __NR_unlink             (__NR_Linux + 8)
-#define __NR_execve             (__NR_Linux + 9)
-#define __NR_chdir              (__NR_Linux + 10)
-#define __NR_fchdir             (__NR_Linux + 11)
-#define __NR_utimes             (__NR_Linux + 12)
-#define __NR_mknod              (__NR_Linux + 13)
-#define __NR_chmod              (__NR_Linux + 14)
-#define __NR_chown              (__NR_Linux + 15)
-#define __NR_lseek              (__NR_Linux + 16)
-#define __NR_getpid             (__NR_Linux + 17)
-#define __NR_getppid            (__NR_Linux + 18)
-#define __NR_mount              (__NR_Linux + 19)
-#define __NR_umount             (__NR_Linux + 20)
-#define __NR_setuid             (__NR_Linux + 21)
-#define __NR_getuid             (__NR_Linux + 22)
-#define __NR_geteuid            (__NR_Linux + 23)
-#define __NR_ptrace             (__NR_Linux + 24)
-#define __NR_access             (__NR_Linux + 25)
-#define __NR_sync               (__NR_Linux + 26)
-#define __NR_fsync              (__NR_Linux + 27)
-#define __NR_fdatasync          (__NR_Linux + 28)
-#define __NR_kill               (__NR_Linux + 29)
-#define __NR_rename             (__NR_Linux + 30)
-#define __NR_mkdir              (__NR_Linux + 31)
-#define __NR_rmdir              (__NR_Linux + 32)
-#define __NR_dup                (__NR_Linux + 33)
-#define __NR_pipe               (__NR_Linux + 34)
-#define __NR_times              (__NR_Linux + 35)
-#define __NR_brk                (__NR_Linux + 36)
-#define __NR_setgid             (__NR_Linux + 37)
-#define __NR_getgid             (__NR_Linux + 38)
-#define __NR_getegid            (__NR_Linux + 39)
-#define __NR_acct               (__NR_Linux + 40)
-#define __NR_ioctl              (__NR_Linux + 41)
-#define __NR_fcntl              (__NR_Linux + 42)
-#define __NR_umask              (__NR_Linux + 43)
-#define __NR_chroot             (__NR_Linux + 44)
-#define __NR_ustat              (__NR_Linux + 45)
-#define __NR_dup2               (__NR_Linux + 46)
-#define __NR_setreuid           (__NR_Linux + 47)
-#define __NR_setregid           (__NR_Linux + 48)
-#define __NR_getresuid          (__NR_Linux + 49)
-#define __NR_setresuid          (__NR_Linux + 50)
-#define __NR_getresgid          (__NR_Linux + 51)
-#define __NR_setresgid          (__NR_Linux + 52)
-#define __NR_getgroups          (__NR_Linux + 53)
-#define __NR_setgroups          (__NR_Linux + 54)
-#define __NR_getpgid            (__NR_Linux + 55)
-#define __NR_setpgid            (__NR_Linux + 56)
-#define __NR_setsid             (__NR_Linux + 57)
-#define __NR_getsid             (__NR_Linux + 58)
-#define __NR_sethostname        (__NR_Linux + 59)
-#define __NR_setrlimit          (__NR_Linux + 60)
-#define __NR_getrlimit          (__NR_Linux + 61)
-#define __NR_getrusage          (__NR_Linux + 62)
-#define __NR_gettimeofday       (__NR_Linux + 63)
-#define __NR_settimeofday       (__NR_Linux + 64)
-#define __NR_select
[PATCH v3 4/7] ia64: replace the system call table entries from entry.S
On IA64, the system call table entries are part of the entry.S file.
They need to live in a separate file so that the system call table
generation script, which is added by one of the patches in this series,
can handle the table entries separately.

Move the system call table from entry.S to a new file, syscall_table.S.

This change unifies the implementation across all architectures and
simplifies generating the system call table with the script.

Signed-off-by: Firoz Khan
---
 arch/ia64/kernel/entry.S         | 333 +-
 arch/ia64/kernel/syscall_table.S | 334 +++
 2 files changed, 335 insertions(+), 332 deletions(-)
 create mode 100644 arch/ia64/kernel/syscall_table.S

diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S
index 68362b3..249b2e9 100644
--- a/arch/ia64/kernel/entry.S
+++ b/arch/ia64/kernel/entry.S
@@ -1426,335 +1426,4 @@ END(ftrace_stub)
 #endif /* CONFIG_FUNCTION_TRACER */
-	.rodata
-	.align 8
-	.globl sys_call_table
-sys_call_table:
-	data8 sys_ni_syscall		// This must be sys_ni_syscall! See ivt.S.
-	data8 sys_exit			// 1025
-	data8 sys_read
-	data8 sys_write
-	data8 sys_open
-	data8 sys_close
-	data8 sys_creat			// 1030
-	data8 sys_link
-	data8 sys_unlink
-	data8 ia64_execve
-	data8 sys_chdir
-	data8 sys_fchdir		// 1035
-	data8 sys_utimes
-	data8 sys_mknod
-	data8 sys_chmod
-	data8 sys_chown
-	data8 sys_lseek			// 1040
-	data8 sys_getpid
-	data8 sys_getppid
-	data8 sys_mount
-	data8 sys_umount
-	data8 sys_setuid		// 1045
-	data8 sys_getuid
-	data8 sys_geteuid
-	data8 sys_ptrace
-	data8 sys_access
-	data8 sys_sync			// 1050
-	data8 sys_fsync
-	data8 sys_fdatasync
-	data8 sys_kill
-	data8 sys_rename
-	data8 sys_mkdir			// 1055
-	data8 sys_rmdir
-	data8 sys_dup
-	data8 sys_ia64_pipe
-	data8 sys_times
-	data8 ia64_brk			// 1060
-	data8 sys_setgid
-	data8 sys_getgid
-	data8 sys_getegid
-	data8 sys_acct
-	data8 sys_ioctl			// 1065
-	data8 sys_fcntl
-	data8 sys_umask
-	data8 sys_chroot
-	data8 sys_ustat
-	data8 sys_dup2			// 1070
-	data8 sys_setreuid
-	data8 sys_setregid
-	data8 sys_getresuid
-	data8 sys_setresuid
-	data8 sys_getresgid		// 1075
-	data8 sys_setresgid
-	data8 sys_getgroups
-	data8 sys_setgroups
-	data8 sys_getpgid
-	data8 sys_setpgid		// 1080
-	data8 sys_setsid
-	data8 sys_getsid
-	data8 sys_sethostname
-	data8 sys_setrlimit
-	data8 sys_getrlimit		// 1085
-	data8 sys_getrusage
-	data8 sys_gettimeofday
-	data8 sys_settimeofday
-	data8 sys_select
-	data8 sys_poll			// 1090
-	data8 sys_symlink
-	data8 sys_readlink
-	data8 sys_uselib
-	data8 sys_swapon
-	data8 sys_swapoff		// 1095
-	data8 sys_reboot
-	data8 sys_truncate
-	data8 sys_ftruncate
-	data8 sys_fchmod
-	data8 sys_fchown		// 1100
-	data8 ia64_getpriority
-	data8 sys_setpriority
-	data8 sys_statfs
-	data8 sys_fstatfs
-	data8 sys_gettid		// 1105
-	data8 sys_semget
-	data8 sys_semop
-	data8 sys_semctl
-	data8 sys_msgget
-	data8 sys_msgsnd		// 1110
-	data8 sys_msgrcv
-	data8 sys_msgctl
-	data8 sys_shmget
-	data8 sys_shmat
-	data8 sys_shmdt			// 1115
-	data8 sys_shmctl
-	data8 sys_syslog
-	data8 sys_setitimer
-	data8 sys_getitimer
-	data8 sys_ni_syscall		// 1120 /* was: ia64_oldstat */
-	data8 sys_ni_syscall		/* was: ia64_oldlstat */
-	data8 sys_ni_syscall		/* was: ia64_oldfstat */
-	data8 sys_vhangup
-	data8 sys_lchown
-	data8 sys_remap_file_pages	// 1125
-	data8 sys_wait4
-	data8 sys_sysinfo
-	data8 sys_clone
-	data8 sys_setdomainname
-	data8 sys_newuname		// 1130
-	data8 sys_adjtimex
-	data8 sys_ni_syscall		/* was: ia64_create_module */
-	data8 sys_init_module
-	data8 sys_delete_module
-	data8 sys_ni_syscall		// 1135
[PATCH v3 5/7] ia64: add system call table generation support
The system call tables are in a different format on every architecture,
which makes it difficult to add or modify system calls manually in the
respective files. To make this easier, keep a script that generates the
header file and the syscall table file; this change will unify them
across all architectures.

The system call table generation scripts are added in the syscalls
directory. It contains the scripts that generate the uapi header file
and the system call table file, and the syscall.tbl file that is the
input for the scripts.

syscall.tbl contains the list of available system calls along with the
system call number and the corresponding entry point. A new system call
is added on this architecture by adding a new entry to syscall.tbl. A
table entry consists of:
- System call number.
- ABI.
- System call name.
- Entry point name.

syscallhdr.sh and syscalltbl.sh generate the uapi header unistd_64.h
and the syscall_table.h file, respectively. syscall_table.h is included
by syscall_table.S, the real system call table. Both .sh files parse
the content of syscall.tbl to generate the header and table files.

The ARM, s390 and x86 architectures have similar support. I leveraged
their implementation to come up with a generic solution.

This is also the groundwork for the y2038 issue. We need to change two
dozen system call implementations, and this work reduces the effort to
modifying two dozen entries in syscall.tbl.
Signed-off-by: Firoz Khan
---
 arch/ia64/kernel/syscalls/Makefile      |  39
 arch/ia64/kernel/syscalls/syscall.tbl   | 337
 arch/ia64/kernel/syscalls/syscallhdr.sh |  35
 arch/ia64/kernel/syscalls/syscalltbl.sh |  37
 4 files changed, 448 insertions(+)
 create mode 100644 arch/ia64/kernel/syscalls/Makefile
 create mode 100644 arch/ia64/kernel/syscalls/syscall.tbl
 create mode 100644 arch/ia64/kernel/syscalls/syscallhdr.sh
 create mode 100644 arch/ia64/kernel/syscalls/syscalltbl.sh

diff --git a/arch/ia64/kernel/syscalls/Makefile b/arch/ia64/kernel/syscalls/Makefile
new file mode 100644
index 000..011cf31
--- /dev/null
+++ b/arch/ia64/kernel/syscalls/Makefile
@@ -0,0 +1,39 @@
+# SPDX-License-Identifier: GPL-2.0
+kapi := arch/$(SRCARCH)/include/generated/asm
+uapi := arch/$(SRCARCH)/include/generated/uapi/asm
+
+_dummy := $(shell [ -d '$(uapi)' ] || mkdir -p '$(uapi)') \
+	  $(shell [ -d '$(kapi)' ] || mkdir -p '$(kapi)')
+
+syscall := $(srctree)/$(src)/syscall.tbl
+syshdr := $(srctree)/$(src)/syscallhdr.sh
+systbl := $(srctree)/$(src)/syscalltbl.sh
+
+quiet_cmd_syshdr = SYSHDR $@
+      cmd_syshdr = $(CONFIG_SHELL) '$(syshdr)' '$<' '$@' \
+		   '$(syshdr_abi_$(basetarget))' \
+		   '$(syshdr_pfx_$(basetarget))' \
+		   '$(syshdr_offset_$(basetarget))'
+
+quiet_cmd_systbl = SYSTBL $@
+      cmd_systbl = $(CONFIG_SHELL) '$(systbl)' '$<' '$@' \
+		   '$(systbl_abi_$(basetarget))' \
+		   '$(systbl_offset_$(basetarget))'
+
+syshdr_offset_unistd_64 := __NR_Linux
+$(uapi)/unistd_64.h: $(syscall) $(syshdr)
+	$(call if_changed,syshdr)
+
+systbl_offset_syscall_table := 1024
+$(kapi)/syscall_table.h: $(syscall) $(systbl)
+	$(call if_changed,systbl)
+
+uapisyshdr-y += unistd_64.h
+kapisyshdr-y += syscall_table.h
+
+targets += $(uapisyshdr-y) $(kapisyshdr-y)
+
+PHONY += all
+all: $(addprefix $(uapi)/,$(uapisyshdr-y))
+all: $(addprefix $(kapi)/,$(kapisyshdr-y))
+	@:
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
new file mode 100644
index 000..6b64f60
--- /dev/null
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -0,0 +1,337 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# Linux system call numbers and entry vectors for IA64
+#
+# The format is:
+# <number> <abi> <name> <entry point>
+#
+# Add 1024 to <number> will get the actual system call number
+#
+# The <abi> is always "common" for this file
+#
+0  common ni_syscall sys_ni_syscall
+1  common exit       sys_exit
+2  common read       sys_read
+3  common write      sys_write
+4  common open       sys_open
+5  common close      sys_close
+6  common creat      sys_creat
+7  common link       sys_link
+8  common unlink     sys_unlink
+9  common execve     ia64_execve
+10 common chdir      sys_chdir
+11 common fchdir     sys_fchdir
+12 common utimes     sys_utimes
+13 common mknod      sys_mknod
+14 common chmod      sys_chmod
+15 common chown
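To make the idea behind the header generation concrete, here is a minimal sketch of turning table rows into `__NR_` defines in the same `(__NR_Linux + n)` form that the hand-written ia64 header used. It is not the syscallhdr.sh from this patch, whose contents are not shown here; the function name gen_unistd_defines is hypothetical, and the real script takes additional arguments (ABI, prefix, offset) that this sketch ignores.

```shell
#!/bin/sh
# Hypothetical sketch, not the syscallhdr.sh added by the patch.

# gen_unistd_defines TABLE
# For every "<number> <abi> <name> <entry point>" row in TABLE, print
#   #define __NR_<name> (__NR_Linux + <number>)
# in ascending numeric order. Comment lines are skipped.
gen_unistd_defines() {
    grep -E '^[0-9]' "$1" | sort -n | while read -r nr abi name entry; do
        printf '#define __NR_%s (__NR_Linux + %s)\n' "$name" "$nr"
    done
}
```

Keeping the offset symbolic (`__NR_Linux`) rather than folding in 1024 mirrors the `syshdr_offset_unistd_64 := __NR_Linux` setting in the Makefile above.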
[PATCH v3 6/7] ia64: uapi header and system call table file generation
System call table generation script must be run to generate unistd_64.h and syscall_table.h files. This patch will have changes which will invokes the script. This patch will generate unistd_64.h and syscall_table.h files by the syscall table generation script invoked by arch/ia64/Makefile and the generated files against the removed files will be identical. The generated uapi header file will be included in uapi/asm/unistd.h and generated system call table support file will be included by ia64/kernel/syscall_table.S file. Signed-off-by: Firoz Khan --- arch/ia64/Makefile | 3 + arch/ia64/include/asm/Kbuild| 1 + arch/ia64/include/uapi/asm/Kbuild | 1 + arch/ia64/include/uapi/asm/unistd.h | 332 +--- arch/ia64/kernel/syscall_table.S| 331 +-- 5 files changed, 9 insertions(+), 659 deletions(-) diff --git a/arch/ia64/Makefile b/arch/ia64/Makefile index 45f5980..320d86f 100644 --- a/arch/ia64/Makefile +++ b/arch/ia64/Makefile @@ -80,6 +80,9 @@ unwcheck: vmlinux archclean: $(Q)$(MAKE) $(clean)=$(boot) +archheaders: + $(Q)$(MAKE) $(build)=arch/ia64/kernel/syscalls all + CLEAN_FILES += vmlinux.gz bootloader boot: lib/lib.a vmlinux diff --git a/arch/ia64/include/asm/Kbuild b/arch/ia64/include/asm/Kbuild index 557bbc8..5b17695 100644 --- a/arch/ia64/include/asm/Kbuild +++ b/arch/ia64/include/asm/Kbuild @@ -7,3 +7,4 @@ generic-y += preempt.h generic-y += trace_clock.h generic-y += vtime.h generic-y += word-at-a-time.h +generic-y += syscall_table.h diff --git a/arch/ia64/include/uapi/asm/Kbuild b/arch/ia64/include/uapi/asm/Kbuild index 3982e67..5c30543 100644 --- a/arch/ia64/include/uapi/asm/Kbuild +++ b/arch/ia64/include/uapi/asm/Kbuild @@ -8,3 +8,4 @@ generic-y += msgbuf.h generic-y += poll.h generic-y += sembuf.h generic-y += shmbuf.h +generic-y += unistd_64.h diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h index bd2575f..286349b 100644 --- a/arch/ia64/include/uapi/asm/unistd.h +++ b/arch/ia64/include/uapi/asm/unistd.h @@ -13,336 +13,6 @@ 
#define __BREAK_SYSCALL__IA64_BREAK_SYSCALL #define __NR_Linux 1024 -#define __NR_ni_syscall(__NR_Linux + 0) -#define __NR_exit (__NR_Linux + 1) -#define __NR_read (__NR_Linux + 2) -#define __NR_write (__NR_Linux + 3) -#define __NR_open (__NR_Linux + 4) -#define __NR_close (__NR_Linux + 5) -#define __NR_creat (__NR_Linux + 6) -#define __NR_link (__NR_Linux + 7) -#define __NR_unlink(__NR_Linux + 8) -#define __NR_execve(__NR_Linux + 9) -#define __NR_chdir (__NR_Linux + 10) -#define __NR_fchdir(__NR_Linux + 11) -#define __NR_utimes(__NR_Linux + 12) -#define __NR_mknod (__NR_Linux + 13) -#define __NR_chmod (__NR_Linux + 14) -#define __NR_chown (__NR_Linux + 15) -#define __NR_lseek (__NR_Linux + 16) -#define __NR_getpid(__NR_Linux + 17) -#define __NR_getppid (__NR_Linux + 18) -#define __NR_mount (__NR_Linux + 19) -#define __NR_umount(__NR_Linux + 20) -#define __NR_setuid(__NR_Linux + 21) -#define __NR_getuid(__NR_Linux + 22) -#define __NR_geteuid (__NR_Linux + 23) -#define __NR_ptrace(__NR_Linux + 24) -#define __NR_access(__NR_Linux + 25) -#define __NR_sync (__NR_Linux + 26) -#define __NR_fsync (__NR_Linux + 27) -#define __NR_fdatasync (__NR_Linux + 28) -#define __NR_kill (__NR_Linux + 29) -#define __NR_rename(__NR_Linux + 30) -#define __NR_mkdir (__NR_Linux + 31) -#define __NR_rmdir (__NR_Linux + 32) -#define __NR_dup (__NR_Linux + 33) -#define __NR_pipe (__NR_Linux + 34) -#define __NR_times (__NR_Linux + 35) -#define __NR_brk (__NR_Linux + 36) -#define __NR_setgid(__NR_Linux + 37) -#define __NR_getgid(__NR_Linux + 38) -#define __NR_getegid (__NR_Linux + 39) -#define __NR_acct (__NR_Linux + 40) -#define __NR_ioctl (__NR_Linux + 41) -#define __NR_fcntl (__NR_Linux + 42) -#define __NR_umask (__NR_Linux + 43) -#define __NR_chroot(__NR_Linux + 44) -#define __NR_ustat (__NR_Linux + 45) -#define __NR_dup2 (__NR_Linux + 46) -#define __NR_setreuid (__NR_Linux + 47) -#define __NR_setregid (__NR_Linux + 48) -#define __NR_getresuid (__NR_Linux + 49) -#define __NR_setresuid 
(__NR_Linux + 50) -#define __NR_getresgid (__NR_Linux + 51) -#define __NR_setresgid (__NR_Linux + 52) -#define __NR_getgroups (__NR_Linux + 53) -#define __NR_setgroups (__NR_Linux + 54) -#define __NR_getpgid (__NR_Linux + 55) -#define __NR_setpgid (__NR_Linux + 56) -#define __NR_setsid(__NR_Linux + 57) -#define __NR_getsid(__NR_Linux + 58) -#define __NR_sethostname (__NR_Linux + 59) -#define __NR_setrlimit (__NR_Linux + 60) -#define __NR_getrlimit (__NR_Linux + 61) -#define __NR_getrusage (__NR_Linux + 62) -#define __NR_gettimeofday (__NR_Linux + 63) -#define __NR_settimeofday (__NR_Linux + 64) -#define __NR_select
[PATCH v3 4/7] ia64: replace the system call table entries from entry.S
In IA64, system call table entries are the part of entry.S file. We need to keep it in a separate file so that one of the patch in this patch series contains a system call table generation script which can separately handle system call table entries. Replaced the system call table from entry.S to syscall_table.S, this is a new file. This change will unify the implementation across all the architecture and to simplify the implementation for system call table generation using the script. Signed-off-by: Firoz Khan --- arch/ia64/kernel/entry.S | 333 +- arch/ia64/kernel/syscall_table.S | 334 +++ 2 files changed, 335 insertions(+), 332 deletions(-) create mode 100644 arch/ia64/kernel/syscall_table.S diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S index 68362b3..249b2e9 100644 --- a/arch/ia64/kernel/entry.S +++ b/arch/ia64/kernel/entry.S @@ -1426,335 +1426,4 @@ END(ftrace_stub) #endif /* CONFIG_FUNCTION_TRACER */ - .rodata - .align 8 - .globl sys_call_table -sys_call_table: - data8 sys_ni_syscall// This must be sys_ni_syscall! See ivt.S. 
- data8 sys_exit // 1025 - data8 sys_read - data8 sys_write - data8 sys_open - data8 sys_close - data8 sys_creat // 1030 - data8 sys_link - data8 sys_unlink - data8 ia64_execve - data8 sys_chdir - data8 sys_fchdir// 1035 - data8 sys_utimes - data8 sys_mknod - data8 sys_chmod - data8 sys_chown - data8 sys_lseek // 1040 - data8 sys_getpid - data8 sys_getppid - data8 sys_mount - data8 sys_umount - data8 sys_setuid// 1045 - data8 sys_getuid - data8 sys_geteuid - data8 sys_ptrace - data8 sys_access - data8 sys_sync // 1050 - data8 sys_fsync - data8 sys_fdatasync - data8 sys_kill - data8 sys_rename - data8 sys_mkdir // 1055 - data8 sys_rmdir - data8 sys_dup - data8 sys_ia64_pipe - data8 sys_times - data8 ia64_brk // 1060 - data8 sys_setgid - data8 sys_getgid - data8 sys_getegid - data8 sys_acct - data8 sys_ioctl // 1065 - data8 sys_fcntl - data8 sys_umask - data8 sys_chroot - data8 sys_ustat - data8 sys_dup2 // 1070 - data8 sys_setreuid - data8 sys_setregid - data8 sys_getresuid - data8 sys_setresuid - data8 sys_getresgid // 1075 - data8 sys_setresgid - data8 sys_getgroups - data8 sys_setgroups - data8 sys_getpgid - data8 sys_setpgid // 1080 - data8 sys_setsid - data8 sys_getsid - data8 sys_sethostname - data8 sys_setrlimit - data8 sys_getrlimit // 1085 - data8 sys_getrusage - data8 sys_gettimeofday - data8 sys_settimeofday - data8 sys_select - data8 sys_poll // 1090 - data8 sys_symlink - data8 sys_readlink - data8 sys_uselib - data8 sys_swapon - data8 sys_swapoff // 1095 - data8 sys_reboot - data8 sys_truncate - data8 sys_ftruncate - data8 sys_fchmod - data8 sys_fchown// 1100 - data8 ia64_getpriority - data8 sys_setpriority - data8 sys_statfs - data8 sys_fstatfs - data8 sys_gettid// 1105 - data8 sys_semget - data8 sys_semop - data8 sys_semctl - data8 sys_msgget - data8 sys_msgsnd// 1110 - data8 sys_msgrcv - data8 sys_msgctl - data8 sys_shmget - data8 sys_shmat - data8 sys_shmdt // 1115 - data8 sys_shmctl - data8 sys_syslog - data8 sys_setitimer - data8 sys_getitimer - 
data8 sys_ni_syscall// 1120 /* was: ia64_oldstat */ - data8 sys_ni_syscall/* was: ia64_oldlstat */ - data8 sys_ni_syscall/* was: ia64_oldfstat */ - data8 sys_vhangup - data8 sys_lchown - data8 sys_remap_file_pages // 1125 - data8 sys_wait4 - data8 sys_sysinfo - data8 sys_clone - data8 sys_setdomainname - data8 sys_newuname // 1130 - data8 sys_adjtimex - data8 sys_ni_syscall/* was: ia64_create_module */ - data8 sys_init_module - data8 sys_delete_module - data8 sys_ni_syscall// 1135
[PATCH v3 5/7] ia64: add system call table generation support
The system call tables are in different format in all architecture and it will be difficult to manually add or modify the system calls in the respective files. To make it easy by keeping a script and which'll generate the header file and syscall table file so this change will unify them across all architectures. The system call table generation script is added in syscalls directory which contain the script to generate both uapi header file system call table generation file and syscall.tbl file which'll be the input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. syscallhdr.sh and syscalltbl.sh will generate uapi header- unistd_64.h and syscall_table.h files respectively. File syscall_table.h is included by syscall_table.S - the real system call table. Both .sh files will parse the content syscall.tbl to generate the header and table files. ARM, s390 and x86 architecuture does have the similar support. I leverage their implementation to come up with a generic solution. And this is the ground work for y2038 issue. We need to change two dozons of system call implementation and this work will reduce the effort by simply modify two dozon entries in syscall.tbl. 
Signed-off-by: Firoz Khan
---
 arch/ia64/kernel/syscalls/Makefile      |  39
 arch/ia64/kernel/syscalls/syscall.tbl   | 337
 arch/ia64/kernel/syscalls/syscallhdr.sh |  35
 arch/ia64/kernel/syscalls/syscalltbl.sh |  37
 4 files changed, 448 insertions(+)
 create mode 100644 arch/ia64/kernel/syscalls/Makefile
 create mode 100644 arch/ia64/kernel/syscalls/syscall.tbl
 create mode 100644 arch/ia64/kernel/syscalls/syscallhdr.sh
 create mode 100644 arch/ia64/kernel/syscalls/syscalltbl.sh

diff --git a/arch/ia64/kernel/syscalls/Makefile b/arch/ia64/kernel/syscalls/Makefile
new file mode 100644
index 000..011cf31
--- /dev/null
+++ b/arch/ia64/kernel/syscalls/Makefile
@@ -0,0 +1,39 @@
+# SPDX-License-Identifier: GPL-2.0
+kapi := arch/$(SRCARCH)/include/generated/asm
+uapi := arch/$(SRCARCH)/include/generated/uapi/asm
+
+_dummy := $(shell [ -d '$(uapi)' ] || mkdir -p '$(uapi)') \
+	  $(shell [ -d '$(kapi)' ] || mkdir -p '$(kapi)')
+
+syscall := $(srctree)/$(src)/syscall.tbl
+syshdr := $(srctree)/$(src)/syscallhdr.sh
+systbl := $(srctree)/$(src)/syscalltbl.sh
+
+quiet_cmd_syshdr = SYSHDR $@
+      cmd_syshdr = $(CONFIG_SHELL) '$(syshdr)' '$<' '$@' \
+		   '$(syshdr_abi_$(basetarget))' \
+		   '$(syshdr_pfx_$(basetarget))' \
+		   '$(syshdr_offset_$(basetarget))'
+
+quiet_cmd_systbl = SYSTBL $@
+      cmd_systbl = $(CONFIG_SHELL) '$(systbl)' '$<' '$@' \
+		   '$(systbl_abi_$(basetarget))' \
+		   '$(systbl_offset_$(basetarget))'
+
+syshdr_offset_unistd_64 := __NR_Linux
+$(uapi)/unistd_64.h: $(syscall) $(syshdr)
+	$(call if_changed,syshdr)
+
+systbl_offset_syscall_table := 1024
+$(kapi)/syscall_table.h: $(syscall) $(systbl)
+	$(call if_changed,systbl)
+
+uapisyshdr-y += unistd_64.h
+kapisyshdr-y += syscall_table.h
+
+targets+= $(uapisyshdr-y) $(kapisyshdr-y)
+
+PHONY += all
+all: $(addprefix $(uapi)/,$(uapisyshdr-y))
+all: $(addprefix $(kapi)/,$(kapisyshdr-y))
+	@:
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
new file mode 100644
index 000..6b64f60
--- /dev/null
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -0,0 +1,337 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# Linux system call numbers and entry vectors for IA64
+#
+# The format is:
+#
+#
+# Add 1024 to will get the actual system call number
+#
+# The is always "common" for this file
+#
+0	common	ni_syscall	sys_ni_syscall
+1	common	exit	sys_exit
+2	common	read	sys_read
+3	common	write	sys_write
+4	common	open	sys_open
+5	common	close	sys_close
+6	common	creat	sys_creat
+7	common	link	sys_link
+8	common	unlink	sys_unlink
+9	common	execve	ia64_execve
+10	common	chdir	sys_chdir
+11	common	fchdir	sys_fchdir
+12	common	utimes	sys_utimes
+13	common	mknod	sys_mknod
+14	common	chmod	sys_chmod
+15	common	chown
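For illustration only, here is a minimal sketch of how a header could be derived from rows in the syscall.tbl layout described in the commit message (number, ABI, name, entry point). It is not the syscallhdr.sh added by this patch; the function name gen_unistd and the way the offset is passed are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the syscallhdr.sh added by this patch.
# Input rows: number, ABI, name, entry point; lines starting with '#'
# are comments and are skipped.

# gen_unistd OFFSET: read table rows on stdin and print one
# "#define __NR_<name> (OFFSET + <number>)" line per row.
gen_unistd() {
	offset="$1"
	grep -E '^[0-9]+[[:space:]]' | sort -n |
	while read -r nr abi name entry; do
		printf '#define __NR_%s (%s + %s)\n' "$name" "$offset" "$nr"
	done
}

# Sample rows taken from the table in this patch.
printf '%s\n' \
	'# comment lines are skipped' \
	'1 common exit sys_exit' \
	'2 common read sys_read' \
	'3 common write sys_write' |
	gen_unistd __NR_Linux
```

With a symbolic offset such as __NR_Linux, the table itself can be numbered from 0 while the generated defines keep the values user space already sees.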
[PATCH v3 2/7] ia64: replace NR_syscalls macro from asm/unistd.h
The NR_syscalls macro holds the number of system calls that exist on the IA64 architecture. This macro is currently part of the asm/unistd.h file, and we have to change its value whenever we add or delete a system call.

One of the patches in this series has a script that generates a uapi header based on the syscall.tbl file. The syscall.tbl file contains the system call number information. So we have two options for updating the NR_syscalls value:

1. Update NR_syscalls in asm/unistd.h manually by counting the number of system calls. NR_syscalls needs no update until we either add a new system call or delete an existing one.

2. Keep this feature in the above-mentioned script, which will count the number of syscalls and keep the result in a generated file. In this case we don't need to explicitly update NR_syscalls in the asm/unistd.h file.

The second option is the recommended one. For that, I came up with another macro, __NR_syscalls, which will be updated by the script and will be present in uapi/asm/unistd.h. The macro name is changed from NR_syscalls to __NR_syscalls to keep the naming convention the same across all architectures. While __NR_syscalls isn't strictly part of the uapi, having it as part of the generated header simplifies the implementation. We also need to enclose this macro in #ifdef __KERNEL__ to avoid side effects.
Signed-off-by: Firoz Khan
---
 arch/ia64/include/asm/unistd.h      | 4 +---
 arch/ia64/include/uapi/asm/unistd.h | 4
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/include/asm/unistd.h b/arch/ia64/include/asm/unistd.h
index ffb705d..397b143 100644
--- a/arch/ia64/include/asm/unistd.h
+++ b/arch/ia64/include/asm/unistd.h
@@ -10,9 +10,7 @@
 #include
-
-
-#define NR_syscalls 326 /* length of syscall table */
+#define NR_syscalls __NR_syscalls /* length of syscall table */

 /*
  * The following defines stop scripts/checksyscalls.sh from complaining about
diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h
index 4d590c9..4186dc2 100644
--- a/arch/ia64/include/uapi/asm/unistd.h
+++ b/arch/ia64/include/uapi/asm/unistd.h
@@ -341,4 +341,8 @@
 #define __NR_preadv2 1348
 #define __NR_pwritev2 1349

+#ifdef __KERNEL__
+#define __NR_syscalls 326
+#endif
+
 #endif /* _UAPI_ASM_IA64_UNISTD_H */
-- 
1.9.1
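As a sketch of the second option from the commit message, the count can be derived from the table itself. The snippet below is only an assumption about one way to do that (highest table number plus one, with numbering starting at 0); it is not the script from this series, emit_nr_syscalls is a name invented for the example, and the entry point names in the sample rows are illustrative.

```shell
#!/bin/sh
# Hypothetical sketch: derive __NR_syscalls from syscall.tbl-style rows.
# Assumes the table is numbered from 0, so the table length is the
# highest number plus one.
emit_nr_syscalls() {
	awk '
		/^[0-9]+[ \t]/ { if ($1 + 1 > n) n = $1 + 1 }
		END {
			print "#ifdef __KERNEL__"
			printf "#define __NR_syscalls %d\n", n
			print "#endif"
		}'
}

# 1348 and 1349 in the current header correspond to 324 and 325 once
# the 1024 offset is taken out.
printf '%s\n' \
	'0 common ni_syscall sys_ni_syscall' \
	'324 common preadv2 sys_preadv2' \
	'325 common pwritev2 sys_pwritev2' |
	emit_nr_syscalls
```

For these rows the middle line of the output is "#define __NR_syscalls 326", which is the value this patch hard-codes for now.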
[PATCH v3 1/7] ia64: add __NR_old_getpagesize in uapi/asm/unistd.h
The sys_getpagesize entry is present in the entry.S file to support the old user interface, so we need to add a uapi entry for it too. Add __NR_old_getpagesize in order not to break old user space, as this number is reserved for backwards compatibility with the old __NR_getpagesize.

Signed-off-by: Firoz Khan
---
 arch/ia64/include/uapi/asm/unistd.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h
index 5fe71d4..4d590c9 100644
--- a/arch/ia64/include/uapi/asm/unistd.h
+++ b/arch/ia64/include/uapi/asm/unistd.h
@@ -161,7 +161,7 @@
 #define __NR_nanosleep 1168
 #define __NR_nfsservctl 1169
 #define __NR_prctl 1170
-/* 1171 is reserved for backwards compatibility with old __NR_getpagesize */
+#define __NR_old_getpagesize 1171
 #define __NR_mmap2 1172
 #define __NR_pciconfig_read 1173
 #define __NR_pciconfig_write 1174
-- 
1.9.1
[PATCH v3 3/7] ia64: add an offset for system call number
System call numbers on the IA64 architecture start at 1024, but on most other architectures they start at 0. In order to come up with a common implementation to generate the uapi header, we need to add an offset, __NR_Linux, with a value of 1024.

One of the patches in this series has a script to generate the uapi header from the syscall.tbl file, which contains the system call numbers. With the use of __NR_Linux, we can start the numbering in syscall.tbl from 0 instead of 1024.

Signed-off-by: Firoz Khan
---
 arch/ia64/include/uapi/asm/unistd.h | 658 ++--
 1 file changed, 329 insertions(+), 329 deletions(-)

diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h
index 4186dc2..bd2575f 100644
--- a/arch/ia64/include/uapi/asm/unistd.h
+++ b/arch/ia64/include/uapi/asm/unistd.h
@@ -8,338 +8,338 @@
 #ifndef _UAPI_ASM_IA64_UNISTD_H
 #define _UAPI_ASM_IA64_UNISTD_H
-
 #include
-#define __BREAK_SYSCALL __IA64_BREAK_SYSCALL
+#define __BREAK_SYSCALL __IA64_BREAK_SYSCALL
-#define __NR_ni_syscall 1024
-#define __NR_exit 1025
-#define __NR_read 1026
-#define __NR_write 1027
-#define __NR_open 1028
-#define __NR_close 1029
-#define __NR_creat 1030
-#define __NR_link 1031
-#define __NR_unlink 1032
-#define __NR_execve 1033
-#define __NR_chdir 1034
-#define __NR_fchdir 1035
-#define __NR_utimes 1036
-#define __NR_mknod 1037
-#define __NR_chmod 1038
-#define __NR_chown 1039
-#define __NR_lseek 1040
-#define __NR_getpid 1041
-#define __NR_getppid 1042
-#define __NR_mount 1043
-#define __NR_umount 1044
-#define __NR_setuid 1045
-#define __NR_getuid 1046
-#define __NR_geteuid 1047
-#define __NR_ptrace 1048
-#define __NR_access 1049
-#define __NR_sync 1050
-#define __NR_fsync 1051
-#define __NR_fdatasync 1052
-#define __NR_kill 1053
-#define __NR_rename 1054
-#define __NR_mkdir 1055
-#define __NR_rmdir 1056
-#define __NR_dup 1057
-#define __NR_pipe 1058
-#define __NR_times 1059
-#define __NR_brk 1060
-#define __NR_setgid 1061
-#define __NR_getgid 1062
-#define __NR_getegid 1063
-#define __NR_acct 1064
-#define __NR_ioctl 1065
-#define __NR_fcntl 1066
-#define __NR_umask 1067
-#define __NR_chroot 1068
-#define __NR_ustat 1069
-#define __NR_dup2 1070
-#define __NR_setreuid 1071
-#define __NR_setregid 1072
-#define __NR_getresuid 1073
-#define __NR_setresuid 1074
-#define __NR_getresgid 1075
-#define __NR_setresgid 1076
-#define __NR_getgroups 1077
-#define __NR_setgroups 1078
-#define __NR_getpgid 1079
-#define __NR_setpgid 1080
-#define __NR_setsid 1081
-#define __NR_getsid 1082
-#define __NR_sethostname 1083
-#define __NR_setrlimit 1084
-#define __NR_getrlimit 1085
-#define __NR_getrusage 1086
-#define __NR_gettimeofday 1087
-#define __NR_settimeofday 1088
-#define __NR_select 1089
-#define __NR_poll 1090
-#define __NR_symlink 1091
-#define __NR_readlink 1092
-#define __NR_uselib 1093
-#define __NR_swapon 1094
-#define __NR_swapoff 1095
-#define __NR_reboot 1096
-#define __NR_truncate 1097
-#define __NR_ftruncate 1098
-#define __NR_fchmod 1099
-#define __NR_fchown 1100
-#define __NR_getpriority 1101
-#define __NR_setpriority 1102
-#define __NR_statfs 1103
-#define __NR_fstatfs 1104
-#define __NR_gettid 1105
-#define __NR_semget 1106
-#define __NR_semop 1107
-#define __NR_semctl 1108
-#define __NR_msgget 1109
-#define __NR_msgsnd 1110
-#define
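To make the offset arithmetic concrete: with __NR_Linux set to 1024, a table number maps straight back to the number user space already uses. The helper name abs_nr below is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Sketch of the offset arithmetic; NR_LINUX mirrors __NR_Linux (1024).
NR_LINUX=1024

# abs_nr N: print the actual syscall number for table number N.
abs_nr() {
	echo $((NR_LINUX + $1))
}

abs_nr 0   # ni_syscall, defined as 1024 in the current header
abs_nr 3   # write, defined as 1027 in the current header
```

The table rows can therefore be numbered from 0 on every architecture, with only the offset differing.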
[PATCH v3 0/7] ia64: system call table generation support
The purpose of this patch series is:

1. We can easily add/modify/delete a system call by changing an entry in the syscall.tbl file. No need to manually edit many files.

2. It is easy to unify the system call implementation across all the architectures.

The system call tables are in a different format on every architecture, which makes it difficult to add or modify system calls by hand in the respective files. To make this easier, keep a script that generates the header file and the syscall table file; this change will also help unify them across all architectures.

syscall.tbl contains the list of available system calls along with the system call number and the corresponding entry point. A new system call can be added on this architecture by adding a new entry to the syscall.tbl file. A table entry consists of:
	- System call number.
	- ABI.
	- System call name.
	- Entry point name.
	- Compat entry name, if required.

ARM, s390 and x86 already have similar support. I leveraged their implementation to come up with a generic solution.

I have done the same work for alpha, microblaze, sparc, mips, parisc, powerpc, sh, and xtensa, but I started by sending the patches for one architecture for review. The git repository mentioned below contains more details.

Git repo: https://github.com/frzkhn/system_call_table_generator/

In the v3 patch series, I wired up the perf_event_open, seccomp, pkey_mprotect, pkey_alloc, pkey_free, statx, io_pgetevents and rseq system calls. This requires an architecture-specific implementation, as it is not present now.

Finally, this is the groundwork for solving the Y2038 issue. We need to add/change about two dozen system calls to solve the Y2038 issue, so this patch series will make it easy to move from the existing system calls to Y2038-compatible ones.
Firoz Khan (7):
  ia64: add __NR_old_getpagesize in uapi/asm/unistd.h
  ia64: replace NR_syscalls macro from asm/unistd.h
  ia64: add an offset for system call number
  ia64: replace the system call table entries from entry.S
  ia64: add system call table generation support
  ia64: uapi header and system call table file generation
  ia64: wire up system calls

 arch/ia64/Makefile                      |   3 +
 arch/ia64/include/asm/Kbuild            |   1 +
 arch/ia64/include/asm/unistd.h          |   4 +-
 arch/ia64/include/uapi/asm/Kbuild       |   1 +
 arch/ia64/include/uapi/asm/unistd.h     | 332 +-
 arch/ia64/kernel/entry.S                | 333 +-
 arch/ia64/kernel/syscall_table.S        |   9 +
 arch/ia64/kernel/syscalls/Makefile      |  39
 arch/ia64/kernel/syscalls/syscall.tbl   | 353
 arch/ia64/kernel/syscalls/syscallhdr.sh |  35
 arch/ia64/kernel/syscalls/syscalltbl.sh |  37
 11 files changed, 483 insertions(+), 664 deletions(-)
 create mode 100644 arch/ia64/kernel/syscall_table.S
 create mode 100644 arch/ia64/kernel/syscalls/Makefile
 create mode 100644 arch/ia64/kernel/syscalls/syscall.tbl
 create mode 100644 arch/ia64/kernel/syscalls/syscallhdr.sh
 create mode 100644 arch/ia64/kernel/syscalls/syscalltbl.sh
-- 
1.9.1
[PATCH v15 1/2] leds: core: Introduce LED pattern trigger
This patch adds a new LED trigger with which an LED device can employ a software or hardware pattern engine.

Consumers can write the 'pattern' file to enable the software pattern, which alters the brightness for the specified duration with one software timer.

Moreover, consumers can write the 'hw_pattern' file to enable the hardware pattern for some LED controllers which can autonomously control brightness over time, according to some preprogrammed hardware patterns.

Signed-off-by: Raphael Teysseyre
Signed-off-by: Baolin Wang
---
Changes from v14:
 - Improve the commit message and ABI documentation.
 - Fix some coding style issues.
 - Do not require the tuple's duration to be larger than 50 ms; treat a shorter duration as a step change of brightness.

Changes from v13:
 - Add duration validation for gradual dimming.
 - Coding style optimization.

Changes from v12:
 - Add gradual dimming support for software pattern.

Changes from v11:
 - Make -1 mean repeat indefinitely.

Changes from v10:
 - Change 'int' to 'u32' for delta_t field.

Changes from v9:
 - None.

Changes from v8:
 - None.

Changes from v7:
 - Move the SC27XX hardware patterns description into its own ABI file.

Changes from v6:
 - Improve commit message.
 - Optimize the description of the hw_pattern file.
 - Simplify some logic.

Changes from v5:
 - Add one 'hw_pattern' file for hardware patterns.

Changes from v4:
 - Change the repeat file to return the originally written number.
 - Improve comments.
 - Fix some build warnings.

Changes from v3:
 - Reset pattern number to 0 if user provides incorrect pattern string.
 - Support one pattern.

Changes from v2:
 - Remove hardware_pattern boolean.
 - Change the pattern string format.

Changes from v1:
 - Use ATTRIBUTE_GROUPS() to define attributes.
 - Introduce hardware_pattern flag to determine if software pattern or hardware pattern.
 - Re-implement pattern_trig_store_pattern() function.
 - Remove pattern_get() interface.
 - Improve comments.
 - Other small optimization.
---
 .../ABI/testing/sysfs-class-led-trigger-pattern |  82
 drivers/leds/trigger/Kconfig                    |   7 +
 drivers/leds/trigger/Makefile                   |   1 +
 drivers/leds/trigger/ledtrig-pattern.c          | 411
 include/linux/leds.h                            |  15 +
 5 files changed, 516 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-led-trigger-pattern
 create mode 100644 drivers/leds/trigger/ledtrig-pattern.c

diff --git a/Documentation/ABI/testing/sysfs-class-led-trigger-pattern b/Documentation/ABI/testing/sysfs-class-led-trigger-pattern
new file mode 100644
index 000..fb3d1e0
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-led-trigger-pattern
@@ -0,0 +1,82 @@
+What:		/sys/class/leds//pattern
+Date:		September 2018
+KernelVersion:	4.20
+Description:
+		Specify a software pattern for the LED, that supports altering
+		the brightness for the specified duration with one software
+		timer. It can do gradual dimming and step change of brightness.
+
+		The pattern is given by a series of tuples, of brightness and
+		duration (ms). The LED is expected to traverse the series and
+		each brightness value for the specified duration. Duration of
+		0 means brightness should immediately change to new value, and
+		writing malformed pattern deactivates any active one.
+
+		1. For gradual dimming, the dimming interval now is set as 50
+		milliseconds. So the tuple with duration less than dimming
+		interval (50ms) is treated as a step change of brightness,
+		i.e. the subsequent brightness will be applied without adding
+		intervening dimming intervals.
+
+		The gradual dimming format of the software pattern values should be:
+		"brightness_1 duration_1 brightness_2 duration_2 brightness_3
+		duration_3 ...". For example:
+
+		echo 0 1000 255 2000 > pattern
+
+		It will make the LED go gradually from zero-intensity to max (255)
+		intensity in 1000 milliseconds, then back to zero intensity in 2000
+		milliseconds:
+
+		[ASCII plot: LED brightness (0 to 255) against time (s), 0 to 6; a repeating ramp that rises for 1 s and falls for 2 s]
+
+		2. To make the LED go instantly from one brightness value to another,
+		we should use zero-time lengths (the brightness must be same as
+		the previous tuple's). So the format should be:
+		"brightness_1
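As a usage sketch, a small shell helper can sanity-check a pattern string before it is written to the sysfs file. The helper names are invented for this example, and the check is only a userspace convenience under the assumption that a pattern is a list of brightness/duration pairs of non-negative integers; the trigger does its own validation.

```shell
#!/bin/sh
# pattern_ok V...: succeed if the arguments form brightness/duration
# pairs of non-negative integers (hypothetical userspace pre-check).
pattern_ok() {
	[ $# -gt 0 ] || return 1
	[ $(( $# % 2 )) -eq 0 ] || return 1
	for v in "$@"; do
		case "$v" in
		''|*[!0-9]*) return 1 ;;
		esac
	done
	return 0
}

# pattern_period_ms V...: print the sum of all durations, in milliseconds.
pattern_period_ms() {
	total=0
	while [ $# -ge 2 ]; do
		total=$((total + $2))
		shift 2
	done
	echo "$total"
}

if pattern_ok 0 1000 255 2000; then
	echo "one pass takes $(pattern_period_ms 0 1000 255 2000) ms"
	# On real hardware the same values would then be written with:
	#   echo 0 1000 255 2000 > pattern
fi
```

For the documented example the durations add up to 3000 ms, which matches the 1000 ms rise followed by the 2000 ms fall.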
[PATCH v15 2/2] leds: sc27xx: Add pattern_set/clear interfaces for LED controller
This patch implements the 'pattern_set' and 'pattern_clear' interfaces to
support SC27XX LED breathing mode.

Signed-off-by: Baolin Wang
Acked-by: Pavel Machek
---
Changes from v14:
 - None.

Changes from v13:
 - None.

Changes from v12:
 - None.

Changes from v11:
 - None.

Changes from v10:
 - Add duration alignment function suggested by Jacek.
 - Add acked tag from Pavel.

Changes from v9:
 - Optimize the ABI documentation file.
 - Update the brightness value in hardware pattern mode.

Changes from v8:
 - Optimize the ABI documentation file.

Changes from v7:
 - Add its own ABI documentation file.

Changes from v6:
 - None.

Changes from v5:
 - None.

Changes from v4:
 - None.

Changes from v3:
 - None.

Changes from v2:
 - None.

Changes from v1:
 - Remove pattern_get interface.
---
 .../ABI/testing/sysfs-class-led-driver-sc27xx |  22
 drivers/leds/leds-sc27xx-bltc.c               | 121
 2 files changed, 143 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-led-driver-sc27xx

diff --git a/Documentation/ABI/testing/sysfs-class-led-driver-sc27xx b/Documentation/ABI/testing/sysfs-class-led-driver-sc27xx
new file mode 100644
index 000..45b1e60
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-led-driver-sc27xx
@@ -0,0 +1,22 @@
+What:		/sys/class/leds//hw_pattern
+Date:		September 2018
+KernelVersion:	4.20
+Description:
+		Specify a hardware pattern for the SC27XX LED. For the SC27XX
+		LED controller, it only supports 4 stages to make a single
+		hardware pattern, which is used to configure the rise time,
+		high time, fall time and low time for the breathing mode.
+
+		For the breathing mode, the SC27XX LED only expects one brightness
+		for the high stage. To be compatible with the hardware pattern
+		format, we should set brightness as 0 for rise stage, fall
+		stage and low stage.
+
+		Min stage duration: 125 ms
+		Max stage duration: 31875 ms
+
+		Since the stage duration step is 125 ms, the duration should be
+		a multiplier of 125, like 125ms, 250ms, 375ms, 500ms ... 31875ms.
+
+		Thus the format of the hardware pattern values should be:
+		"0 rise_duration brightness high_duration 0 fall_duration 0 low_duration".
diff --git a/drivers/leds/leds-sc27xx-bltc.c b/drivers/leds/leds-sc27xx-bltc.c
index 9d9b7aa..fecf27f 100644
--- a/drivers/leds/leds-sc27xx-bltc.c
+++ b/drivers/leds/leds-sc27xx-bltc.c
@@ -32,8 +32,18 @@
 #define SC27XX_DUTY_MASK	GENMASK(15, 0)
 #define SC27XX_MOD_MASK		GENMASK(7, 0)
 
+#define SC27XX_CURVE_SHIFT	8
+#define SC27XX_CURVE_L_MASK	GENMASK(7, 0)
+#define SC27XX_CURVE_H_MASK	GENMASK(15, 8)
+
 #define SC27XX_LEDS_OFFSET	0x10
 #define SC27XX_LEDS_MAX		3
+#define SC27XX_LEDS_PATTERN_CNT	4
+/* Stage duration step, in milliseconds */
+#define SC27XX_LEDS_STEP	125
+/* Minimum and maximum duration, in milliseconds */
+#define SC27XX_DELTA_T_MIN	SC27XX_LEDS_STEP
+#define SC27XX_DELTA_T_MAX	(SC27XX_LEDS_STEP * 255)
 
 struct sc27xx_led {
 	char name[LED_MAX_NAME_SIZE];
@@ -122,6 +132,113 @@ static int sc27xx_led_set(struct led_classdev *ldev, enum led_brightness value)
 	return err;
 }
 
+static void sc27xx_led_clamp_align_delta_t(u32 *delta_t)
+{
+	u32 v, offset, t = *delta_t;
+
+	v = t + SC27XX_LEDS_STEP / 2;
+	v = clamp_t(u32, v, SC27XX_DELTA_T_MIN, SC27XX_DELTA_T_MAX);
+	offset = v - SC27XX_DELTA_T_MIN;
+	offset = SC27XX_LEDS_STEP * (offset / SC27XX_LEDS_STEP);
+
+	*delta_t = SC27XX_DELTA_T_MIN + offset;
+}
+
+static int sc27xx_led_pattern_clear(struct led_classdev *ldev)
+{
+	struct sc27xx_led *leds = to_sc27xx_led(ldev);
+	struct regmap *regmap = leds->priv->regmap;
+	u32 base = sc27xx_led_get_offset(leds);
+	u32 ctrl_base = leds->priv->base + SC27XX_LEDS_CTRL;
+	u8 ctrl_shift = SC27XX_CTRL_SHIFT * leds->line;
+	int err;
+
+	mutex_lock(&leds->priv->lock);
+
+	/* Reset the rise, high, fall and low time to zero. */
+	regmap_write(regmap, base + SC27XX_LEDS_CURVE0, 0);
+	regmap_write(regmap, base + SC27XX_LEDS_CURVE1, 0);
+
+	err = regmap_update_bits(regmap, ctrl_base,
+			(SC27XX_LED_RUN | SC27XX_LED_TYPE) << ctrl_shift, 0);
+
+	ldev->brightness = LED_OFF;
+
+	mutex_unlock(&leds->priv->lock);
+
+	return err;
+}
+
+static int sc27xx_led_pattern_set(struct led_classdev *ldev,
+				  struct led_pattern *pattern,
+				  u32 len, int repeat)
+{
+	struct sc27xx_led *leds = to_sc27xx_led(ldev);
+	u32 base = sc27xx_led_get_offset(leds);
+	u32 ctrl_base = leds->priv->base + SC27XX_LEDS_CTRL;
+	u8 ctrl_shift = SC27XX_CTRL_SHIFT *
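For readers who want to check the rounding done by sc27xx_led_clamp_align_delta_t() without building the driver, here is a minimal userspace sketch. It assumes only the constants visible in the patch (125 ms step, 255 steps maximum); the demo_ names are made up for illustration and clamp_t() is replaced by two plain comparisons.

```c
#include <stdint.h>

/* Values mirror SC27XX_LEDS_STEP and SC27XX_DELTA_T_{MIN,MAX} from the patch;
 * the DEMO_/demo_ names are hypothetical and exist only in this sketch. */
#define DEMO_STEP_MS		125u
#define DEMO_DELTA_T_MIN	DEMO_STEP_MS
#define DEMO_DELTA_T_MAX	(DEMO_STEP_MS * 255u)

/* Round a requested stage duration to the nearest 125 ms step,
 * clamped to the range [125, 31875] ms. */
static uint32_t demo_clamp_align_delta_t(uint32_t t)
{
	uint32_t v = t + DEMO_STEP_MS / 2;	/* bias by half a step (62 ms) */
	uint32_t offset;

	/* Open-coded stand-in for the kernel's clamp_t(). */
	if (v < DEMO_DELTA_T_MIN)
		v = DEMO_DELTA_T_MIN;
	if (v > DEMO_DELTA_T_MAX)
		v = DEMO_DELTA_T_MAX;

	/* Truncate the distance from the minimum to whole steps. */
	offset = v - DEMO_DELTA_T_MIN;
	offset = DEMO_STEP_MS * (offset / DEMO_STEP_MS);

	return DEMO_DELTA_T_MIN + offset;
}
```

Under these assumptions a request of 300 ms becomes 250 ms, 313 ms becomes 375 ms, 0 ms is raised to 125 ms and anything above the maximum is capped at 31875 ms.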
[PATCH 3/5] arch/powerpc/mm: Nest MMU workaround for mprotect/autonuma RW upgrade.
NestMMU requires us to mark the pte invalid and flush the tlb when we do a
RW upgrade of pte. We fixed a variant of this in the fault path in commit
bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang").

Do the same for mprotect and autonuma upgrades. Hugetlb is handled in the
next patch.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 18 +++
 arch/powerpc/mm/pgtable-book3s64.c           | 34
 2 files changed, 52 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index f108e2ce7f64..c55468eaedc7 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1324,6 +1324,24 @@ static inline const int pud_pfn(pud_t pud)
 	BUILD_BUG();
 	return 0;
 }
+#define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
+pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
+			     pte_t *, pte_t, pte_t);
+
+/*
+ * Returns true for a Read or Write upgrade of pte.
+ */
+static inline bool is_pte_upgrade(unsigned long old_val, unsigned long new_val)
+{
+	if ((!(old_val & _PAGE_READ)) && (new_val & _PAGE_READ))
+		return true;
+
+	if ((!(old_val & _PAGE_WRITE)) && (new_val & _PAGE_WRITE))
+		return true;
+
+	return false;
+}
 
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 43e99e1d947b..43f71125249b 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -481,3 +481,37 @@ void arch_report_meminfo(struct seq_file *m)
 		   atomic_long_read(&direct_pages_count[MMU_PAGE_1G]) << 20);
 }
 #endif /* CONFIG_PROC_FS */
+
+pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
+			     pte_t *ptep)
+{
+	unsigned long pte_val;
+
+	/*
+	 * Clear the _PAGE_PRESENT so that no hardware parallel update is
+	 * possible. Also keep the pte_present true so that we don't take
+	 * wrong fault.
+	 */
+	pte_val = pte_update(vma->vm_mm, addr, ptep, _PAGE_PRESENT, _PAGE_INVALID, 0);
+
+	return __pte(pte_val);
+
+}
+EXPORT_SYMBOL(ptep_modify_prot_start);
+
+void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
+			     pte_t *ptep, pte_t old_pte, pte_t pte)
+{
+	struct mm_struct *mm = vma->vm_mm;
+
+	/*
+	 * To avoid NMMU hang while relaxing access we need to flush the tlb before
+	 * we set the new value.
+	 */
+	if (is_pte_upgrade(pte_val(old_pte), pte_val(pte)) &&
+	    (atomic_read(&mm->context.copros) > 0))
+		flush_tlb_page(vma, addr);
+
+	set_pte_at(mm, addr, ptep, pte);
+}
+EXPORT_SYMBOL(ptep_modify_prot_commit);
--
2.17.1
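The upgrade test in is_pte_upgrade() can be tried in userspace with stand-in bit values. The sketch below copies the logic from the patch; the DEMO_PAGE_* values are placeholders chosen for the example and are not the real powerpc _PAGE_READ/_PAGE_WRITE definitions.

```c
#include <stdbool.h>

/* Placeholder permission bits; the real values come from the powerpc
 * book3s64 headers and are NOT reproduced here. */
#define DEMO_PAGE_WRITE	0x2UL
#define DEMO_PAGE_READ	0x4UL

/* Same shape as is_pte_upgrade() in the patch: true when the new value
 * gains a read or write permission that the old value lacked. */
static bool demo_is_pte_upgrade(unsigned long old_val, unsigned long new_val)
{
	if (!(old_val & DEMO_PAGE_READ) && (new_val & DEMO_PAGE_READ))
		return true;

	if (!(old_val & DEMO_PAGE_WRITE) && (new_val & DEMO_PAGE_WRITE))
		return true;

	return false;
}
```

With these stand-ins, R -> RW counts as an upgrade while RW -> R and RW -> RW do not, which matches the case the commit path uses to decide whether a TLB flush is needed.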
[PATCH 2/5] mm: update ptep_modify_prot_commit to take old pte value as arg
Architectures like ppc64 require a conditional tlb flush based on the old
and new value of the pte. Enable that by passing the old pte value as an arg.

Signed-off-by: Aneesh Kumar K.V
---
 arch/s390/include/asm/pgtable.h | 3 ++-
 arch/s390/mm/pgtable.c          | 2 +-
 arch/x86/include/asm/paravirt.h | 2 +-
 fs/proc/task_mmu.c              | 8 +---
 include/asm-generic/pgtable.h   | 2 +-
 mm/memory.c                     | 8
 mm/mprotect.c                   | 6 +++---
 7 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 8e7f26dfedc6..626250436897 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1036,7 +1036,8 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
 pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
-void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long, pte_t *, pte_t);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
+			     pte_t *, pte_t, pte_t);
 
 #define __HAVE_ARCH_PTEP_CLEAR_FLUSH
 static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 29c0a21cd34a..b283b92722cc 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -322,7 +322,7 @@ pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
 EXPORT_SYMBOL(ptep_modify_prot_start);
 
 void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
-			     pte_t *ptep, pte_t pte)
+			     pte_t *ptep, pte_t old_pte, pte_t pte)
 {
 	pgste_t pgste;
 	struct mm_struct *mm = vma->vm_mm;
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index c5d203a51e50..17214e074286 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -434,7 +434,7 @@ static inline pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned
 }
 
 static inline void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
-					   pte_t *ptep, pte_t pte)
+					   pte_t *ptep, pte_t old_pte, pte_t pte)
 {
 	struct mm_struct *mm = vma->vm_mm;
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 229df16e7ad0..505aa21d04df 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -938,10 +938,12 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
 	pte_t ptent = *pte;
 
 	if (pte_present(ptent)) {
-		ptent = ptep_modify_prot_start(vma, addr, pte);
-		ptent = pte_wrprotect(ptent);
+		pte_t old_pte;
+
+		old_pte = ptep_modify_prot_start(vma, addr, pte);
+		ptent = pte_wrprotect(old_pte);
 		ptent = pte_clear_soft_dirty(ptent);
-		ptep_modify_prot_commit(vma, addr, pte, ptent);
+		ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
 	} else if (is_swap_pte(ptent)) {
 		ptent = pte_swp_clear_soft_dirty(ptent);
 		set_pte_at(vma->vm_mm, addr, pte, ptent);
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 021b94cd3260..4e4723f6be5e 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -619,7 +619,7 @@ static inline pte_t ptep_modify_prot_start(struct vm_area_struct *vma,
  */
 static inline void ptep_modify_prot_commit(struct vm_area_struct *vma,
 					   unsigned long addr,
-					   pte_t *ptep, pte_t pte)
+					   pte_t *ptep, pte_t old_pte, pte_t pte)
 {
 	__ptep_modify_prot_commit(vma->vm_mm, addr, ptep, pte);
 }
diff --git a/mm/memory.c b/mm/memory.c
index 261d30f51499..211df764f232 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3786,7 +3786,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 	int last_cpupid;
 	int target_nid;
 	bool migrated = false;
-	pte_t pte;
+	pte_t pte, old_pte;
 	bool was_writable = pte_savedwrite(vmf->orig_pte);
 	int flags = 0;
 
@@ -3806,12 +3806,12 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 	 * Make it present again, Depending on how arch implementes non
 	 * accessible ptes, some can allow access by kernel mode.
 	 */
-	pte = ptep_modify_prot_start(vma, vmf->address, vmf->pte);
-	pte = pte_modify(pte, vma->vm_page_prot);
+	old_pte = ptep_modify_prot_start(vma, vmf->address, vmf->pte);
+	pte = pte_modify(old_pte, vma->vm_page_prot);
 	pte = pte_mkyoung(pte);
 	if (was_writable)
 		pte = pte_mkwrite(pte);
-	ptep_modify_prot_commit(vma, vmf->address, vmf->pte, pte);
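To illustrate why the commit helper needs the old pte, here is a toy userspace model of the start/commit sequence. Everything in it is hypothetical (the demo_slot struct, the bit values, the flush counter); it only mimics the calling convention the callers are converted to: start returns the old value, the caller derives the new one, and commit receives both so it can decide whether a flush is required.

```c
/* Hypothetical permission bits for the model; not kernel values. */
#define DEMO_PTE_WRITE	0x2UL
#define DEMO_PTE_READ	0x4UL

/* Toy stand-in for one page table entry plus a count of flushes issued. */
struct demo_slot {
	unsigned long pte;
	int flushes;
};

/* Start of the transaction: hand back the old value and invalidate the slot. */
static unsigned long demo_modify_prot_start(struct demo_slot *s)
{
	unsigned long old_pte = s->pte;

	s->pte = 0;
	return old_pte;
}

/* Commit: with both values available, "flush" only when a read or write
 * permission is being added, then install the new value. */
static void demo_modify_prot_commit(struct demo_slot *s,
				    unsigned long old_pte, unsigned long pte)
{
	if (pte & ~old_pte & (DEMO_PTE_READ | DEMO_PTE_WRITE))
		s->flushes++;

	s->pte = pte;
}

/* Caller pattern, as in the converted call sites: keep old_pte around
 * and pass it to commit together with the modified value. */
static void demo_change_prot(struct demo_slot *s,
			     unsigned long set, unsigned long clear)
{
	unsigned long old_pte = demo_modify_prot_start(s);
	unsigned long pte = (old_pte | set) & ~clear;

	demo_modify_prot_commit(s, old_pte, pte);
}
```

In this model, adding write permission to a read-only slot bumps the flush counter once, while write-protecting it again does not; without old_pte the commit side could not tell the two cases apart.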
[PATCH 5/5] arch/powerpc/mm/hugetlb: NestMMU workaround for hugetlb mprotect RW upgrade
NestMMU requires us to mark the pte invalid and flush the tlb when we do a
RW upgrade of pte. We fixed a variant of this in the fault path in commit
bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang").

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/book3s/64/hugetlb.h |  8 +
 arch/powerpc/include/asm/hugetlb.h           |  2 +-
 arch/powerpc/mm/hugetlbpage.c                | 35
 3 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h b/arch/powerpc/include/asm/book3s/64/hugetlb.h
index 5b0177733994..a12bde29a5f0 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
@@ -42,4 +42,12 @@ static inline bool gigantic_page_supported(void)
 /* hugepd entry valid bit */
 #define HUGEPD_VAL_BITS		(0x8000UL)
 
+#define huge_ptep_modify_prot_start huge_ptep_modify_prot_start
+extern pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+					 unsigned long addr, pte_t *ptep);
+
+#define huge_ptep_modify_prot_commit huge_ptep_modify_prot_commit
+extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
+					 unsigned long addr, pte_t *ptep,
+					 pte_t old_pte, pte_t new_pte);
 #endif
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 2d00cc530083..60c1d37e446a 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -4,7 +4,6 @@
 
 #ifdef CONFIG_HUGETLB_PAGE
 #include
-#include
 
 extern struct kmem_cache *hugepte_cache;
 
@@ -176,6 +175,7 @@ static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
 
+#include
 #else /* ! CONFIG_HUGETLB_PAGE */
 static inline void flush_hugetlb_page(struct vm_area_struct *vma,
 				      unsigned long vmaddr)
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index a7226ed9cae6..8b098bedaff5 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -913,3 +913,38 @@ int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
 
 	return 1;
 }
+
+#ifdef CONFIG_PPC_BOOK3S_64
+pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+				  unsigned long addr, pte_t *ptep)
+{
+	unsigned long pte_val;
+	/*
+	 * Clear the _PAGE_PRESENT so that no hardware parallel update is
+	 * possible. Also keep the pte_present true so that we don't take
+	 * wrong fault.
+	 */
+	pte_val = pte_update(vma->vm_mm, addr, ptep,
+			     _PAGE_PRESENT, _PAGE_INVALID, 1);
+
+	return __pte(pte_val);
+}
+EXPORT_SYMBOL(huge_ptep_modify_prot_start);
+
+void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
+				  pte_t *ptep, pte_t old_pte, pte_t pte)
+{
+	struct mm_struct *mm = vma->vm_mm;
+
+	/*
+	 * To avoid NMMU hang while relaxing access we need to flush the tlb before
+	 * we set the new value.
+	 */
+	if (is_pte_upgrade(pte_val(old_pte), pte_val(pte)) &&
+	    (atomic_read(&mm->context.copros) > 0))
+		flush_hugetlb_page(vma, addr);
+
+	set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+}
+EXPORT_SYMBOL(huge_ptep_modify_prot_commit);
+#endif
--
2.17.1
[PATCH 4/5] mm/hugetlb: Add prot_modify_start/commit sequence for hugetlb update
Signed-off-by: Aneesh Kumar K.V
---
 include/linux/hugetlb.h | 18 ++
 mm/hugetlb.c            |  8 +---
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 087fd5f48c91..e2a3b0c854eb 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -543,6 +543,24 @@ static inline void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr
 	set_huge_pte_at(mm, addr, ptep, pte);
 }
 #endif
+
+#ifndef huge_ptep_modify_prot_start
+static inline pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+						unsigned long addr, pte_t *ptep)
+{
+	return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
+}
+#endif
+
+#ifndef huge_ptep_modify_prot_commit
+static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
+						unsigned long addr, pte_t *ptep,
+						pte_t old_pte, pte_t pte)
+{
+	set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+}
+#endif
+
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 #define alloc_huge_page(v, a, r) NULL
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5c390f5a5207..1f3a4df95b2e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4367,10 +4367,12 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 			continue;
 		}
 		if (!huge_pte_none(pte)) {
-			pte = huge_ptep_get_and_clear(mm, address, ptep);
-			pte = pte_mkhuge(huge_pte_modify(pte, newprot));
+			pte_t old_pte;
+
+			old_pte = huge_ptep_modify_prot_start(vma, address, ptep);
+			pte = pte_mkhuge(huge_pte_modify(old_pte, newprot));
 			pte = arch_make_huge_pte(pte, vma, NULL, 0);
-			set_huge_pte_at(mm, address, ptep, pte);
+			huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);
 			pages++;
 		}
 		spin_unlock(ptl);
--
2.17.1
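The generic fallbacks above rely on the usual "#define name name" override idiom: an architecture announces its own helper with a same-named macro and the generic header skips its default. A small standalone sketch of that idiom follows; the demo_ helpers and the marker bit are invented for the example and operate on plain ints rather than pte_t.

```c
/* "Arch" side (hypothetical): provide an override and announce it with a
 * macro of the same name, as the powerpc hugetlb header does in 5/5. */
#define demo_modify_prot_start demo_modify_prot_start
static inline int demo_modify_prot_start(int pte)
{
	return pte | 0x100;	/* marker bit so the override is observable */
}

/* "Generic" side: each fallback is compiled only if no override was
 * announced before this point. */
#ifndef demo_modify_prot_start
static inline int demo_modify_prot_start(int pte)
{
	return pte;
}
#endif

#ifndef demo_modify_prot_commit
static inline int demo_modify_prot_commit(int old_pte, int pte)
{
	(void)old_pte;		/* the generic version ignores the old value */
	return pte;
}
#endif
```

Here the start helper resolves to the override while the commit helper falls back to the generic version, mirroring how an architecture can replace either half of the sequence independently.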
[PATCH 0/5] NestMMU pte upgrade workaround for mprotect and autonuma
We can upgrade pte access (R -> RW transition) via mprotect or autonuma. We
need to make sure we follow the recommended pte update sequence as outlined
in commit bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang")
for such updates. This patch series does that.

Aneesh Kumar K.V (5):
  mm: Update ptep_modify_prot_start/commit to take vm_area_struct as arg
  mm: update ptep_modify_prot_commit to take old pte value as arg
  arch/powerpc/mm: Nest MMU workaround for mprotect/autonuma RW upgrade.
  mm/hugetlb: Add prot_modify_start/commit sequence for hugetlb update
  arch/powerpc/mm/hugetlb: NestMMU workaround for hugetlb mprotect RW upgrade

 arch/powerpc/include/asm/book3s/64/hugetlb.h |  8 +
 arch/powerpc/include/asm/book3s/64/pgtable.h | 18 ++
 arch/powerpc/include/asm/hugetlb.h           |  2 +-
 arch/powerpc/mm/hugetlbpage.c                | 35
 arch/powerpc/mm/pgtable-book3s64.c           | 34 +++
 arch/s390/include/asm/pgtable.h              |  5 +--
 arch/s390/mm/pgtable.c                       |  8 +++--
 arch/x86/include/asm/paravirt.h              |  9 +++--
 fs/proc/task_mmu.c                           |  8 +++--
 include/asm-generic/pgtable.h                | 10 +++---
 include/linux/hugetlb.h                      | 18 ++
 mm/hugetlb.c                                 |  8 +++--
 mm/memory.c                                  |  8 ++---
 mm/mprotect.c                                |  6 ++--
 14 files changed, 150 insertions(+), 27 deletions(-)

--
2.17.1
[PATCH 1/5] mm: Update ptep_modify_prot_start/commit to take vm_area_struct as arg
Some architectures may want to call flush_tlb_range from these helpers.

Signed-off-by: Aneesh Kumar K.V
---
 arch/s390/include/asm/pgtable.h | 4 ++--
 arch/s390/mm/pgtable.c          | 6 --
 arch/x86/include/asm/paravirt.h | 7 +--
 fs/proc/task_mmu.c              | 4 ++--
 include/asm-generic/pgtable.h   | 8
 mm/memory.c                     | 4 ++--
 mm/mprotect.c                   | 4 ++--
 7 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 0e7cb0dc9c33..8e7f26dfedc6 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1035,8 +1035,8 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 }
 
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
-pte_t ptep_modify_prot_start(struct mm_struct *, unsigned long, pte_t *);
-void ptep_modify_prot_commit(struct mm_struct *, unsigned long, pte_t *, pte_t);
+pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long, pte_t *, pte_t);
 
 #define __HAVE_ARCH_PTEP_CLEAR_FLUSH
 static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index f2cc7da473e4..29c0a21cd34a 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -301,12 +301,13 @@ pte_t ptep_xchg_lazy(struct mm_struct *mm, unsigned long addr,
 }
 EXPORT_SYMBOL(ptep_xchg_lazy);
 
-pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr,
+pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
 			     pte_t *ptep)
 {
 	pgste_t pgste;
 	pte_t old;
 	int nodat;
+	struct mm_struct *mm = vma->vm_mm;
 
 	preempt_disable();
 	pgste = ptep_xchg_start(mm, addr, ptep);
@@ -320,10 +321,11 @@ pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr,
 }
 EXPORT_SYMBOL(ptep_modify_prot_start);
 
-void ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
+void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
 			     pte_t *ptep, pte_t pte)
 {
 	pgste_t pgste;
+	struct mm_struct *mm = vma->vm_mm;
 
 	if (!MACHINE_HAS_NX)
 		pte_val(pte) &= ~_PAGE_NOEXEC;
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index e375d4266b53..c5d203a51e50 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -421,10 +421,11 @@ static inline pgdval_t pgd_val(pgd_t pgd)
 }
 
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
-static inline pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr,
+static inline pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
 					   pte_t *ptep)
 {
 	pteval_t ret;
+	struct mm_struct *mm = vma->vm_mm;
 
 	ret = PVOP_CALL3(pteval_t, pv_mmu_ops.ptep_modify_prot_start,
 			 mm, addr, ptep);
@@ -432,9 +433,11 @@ static inline pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long a
 	return (pte_t) { .pte = ret };
 }
 
-static inline void ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
+static inline void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
 					   pte_t *ptep, pte_t pte)
 {
+	struct mm_struct *mm = vma->vm_mm;
+
 	if (sizeof(pteval_t) > sizeof(long))
 		/* 5 arg words */
 		pv_mmu_ops.ptep_modify_prot_commit(mm, addr, ptep, pte);
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 5ea1d64cb0b4..229df16e7ad0 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -938,10 +938,10 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
 	pte_t ptent = *pte;
 
 	if (pte_present(ptent)) {
-		ptent = ptep_modify_prot_start(vma->vm_mm, addr, pte);
+		ptent = ptep_modify_prot_start(vma, addr, pte);
 		ptent = pte_wrprotect(ptent);
 		ptent = pte_clear_soft_dirty(ptent);
-		ptep_modify_prot_commit(vma->vm_mm, addr, pte, ptent);
+		ptep_modify_prot_commit(vma, addr, pte, ptent);
 	} else if (is_swap_pte(ptent)) {
 		ptent = pte_swp_clear_soft_dirty(ptent);
 		set_pte_at(vma->vm_mm, addr, pte, ptent);
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 88ebc6102c7c..021b94cd3260 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -606,22 +606,22 @@ static inline void __ptep_modify_prot_commit(struct mm_struct *mm,
  * queue the update to be done at some later time. The update must be
  * actually committed before the pte lock is released, however.
  */
-static inline pte_t ptep_modify_prot_start(struct
[PATCH] mtd: spi-nor: Add support for SPI boot flash access for AMD Family 16h
Add support to expose the SPI boot flash on AMD Family 16h CPUs as a
standard mtd device to give userspace BIOS updaters greater feature
support. The BIOS and Kernel Developer's Guide refers to this as the
'SPI ROM' controller and so the driver follows that naming convention
for consistency.

Signed-off-by: Brett Grandbois
---
 drivers/mtd/spi-nor/Kconfig      |  15 +
 drivers/mtd/spi-nor/Makefile     |   1 +
 drivers/mtd/spi-nor/amd-spirom.c | 805 +++
 3 files changed, 821 insertions(+)
 create mode 100644 drivers/mtd/spi-nor/amd-spirom.c

diff --git a/drivers/mtd/spi-nor/Kconfig b/drivers/mtd/spi-nor/Kconfig
index 6cc9c929ff57..f99b40ec0fef 100644
--- a/drivers/mtd/spi-nor/Kconfig
+++ b/drivers/mtd/spi-nor/Kconfig
@@ -129,4 +129,19 @@ config SPI_STM32_QUADSPI
 	  This enables support for the STM32 Quad SPI controller.
 	  We only connect the NOR to this controller.
 
+config SPI_AMD_SPIROM
+	tristate "AMD Hudson FCH SPI flash driver (DANGEROUS)"
+	depends on X86 && PCI
+	help
+	  This enables support for the AMD Family 16h SPI flash controller to
+	  access the boot flash from Linux as an mtd device.
+
+	  Using this driver it is possible to upgrade BIOS directly from Linux.
+
+	  Say N here unless you know what you are doing. Overwriting the
+	  SPI flash may render the system unbootable.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called amd-spirom.
+
 endif # MTD_SPI_NOR
diff --git a/drivers/mtd/spi-nor/Makefile b/drivers/mtd/spi-nor/Makefile
index f4c61d282abd..e49ea4e619c1 100644
--- a/drivers/mtd/spi-nor/Makefile
+++ b/drivers/mtd/spi-nor/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_SPI_INTEL_SPI) += intel-spi.o
 obj-$(CONFIG_SPI_INTEL_SPI_PCI)	+= intel-spi-pci.o
 obj-$(CONFIG_SPI_INTEL_SPI_PLATFORM)	+= intel-spi-platform.o
 obj-$(CONFIG_SPI_STM32_QUADSPI)	+= stm32-quadspi.o
+obj-$(CONFIG_SPI_AMD_SPIROM)	+= amd-spirom.o
diff --git a/drivers/mtd/spi-nor/amd-spirom.c b/drivers/mtd/spi-nor/amd-spirom.c
new file mode 100644
index ..514e67edc9cd
--- /dev/null
+++ b/drivers/mtd/spi-nor/amd-spirom.c
@@ -0,0 +1,805 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (C) 2018, Opengear
+ *
+ * AMD Family 16h Hudson FCH SPI flash driver.
+ *
+ * When the FCH is strapped to SPI boot ROM mode 'SPIROM'
+ * the FCH will do a flash auto-probe and self-configure
+ * for read operations to the ROM address range(s).
+ * For any command outside of read/write (chip erase, etc)
+ * you need to go through the alternate program method.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+/* FCH Device LPC Bridge Configuration Registers */
+#define PCI_DEVICE_ID_AMD_FCH_LPC_BRIDGE 0x780E
+
+#define FCH_PCI_CONTROL                  0x40
+#define FCH_INTEGRATED_EC_PRESENT        0x80
+#define FCH_EC_SEM                       0x40
+#define FCH_BIOS_SEM                     0x20
+#define FCH_LEGACY_DMA                   0x04
+
+#define FCH_ROM_ADDR_RANGE_2             0x6C
+
+#define FCH_SPI_BASE_ADDR                0xA0
+#define FCH_SPI_BASE_ADDR_MASK           0xFFC0
+#define FCH_SPI_ROUTE_TPM_SPI            0x08
+#define FCH_SPI_ROM_ENABLE               0x02
+
+/* up through FIFO [C6:80] */
+#define SPI_IO_REGION_LEN                256
+
+/* SPI Registers, the labels come from the BKDG */
+#define SPI_CNTRL0                       0x00
+#define SPI_CNTRL0_FIFO_PTR_CLEAR        0x0010
+#define SPI_CNTRL0_FIFO_PTR_CLEAR_MASK   0xFFEF
+#define SPI_CNTRL0_SPI_ARB_ENABLE        0x0008
+#define SPI_CNTRL0_SPI_ARB_ENABLE_MASK   0xFFF7
+
+#define ALT_SPI_CS                       0x1D
+#define ALT_SPI_CS_MASK                  0xFC
+#define ALT_SPI_CS_WR_BUF_EN             0x04
+
+#define SPI100_ENABLE                    0x20
+#define SPI100_SPEED_CONFIG              0x22
+
+
+/* SPI control shadow registers */
+#define CMD_CODE                         0x45
+
+#define CMD_TRIGGER                      0x47
+#define CMD_TRIGGER_EXECUTE              0x80
+
+#define TX_BYTE_COUNT                    0x48
+
+#define RX_BYTE_COUNT                    0x4B
+
+#define SPI_STATUS                       0x4C
+#define SPI_STATUS_BUSY_MASK             0x8000
+
+#define SPI_FIFO                         0x80
+
+static const struct pci_device_id amd_fch_lpc_pci_device_ids[] = {
+	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_FCH_LPC_BRIDGE) },
+	{}
+};
+MODULE_DEVICE_TABLE(pci, amd_fch_lpc_pci_device_ids);
+
+static short norm_speed = -1;
+module_param(norm_speed, short, 0444);
+MODULE_PARM_DESC(norm_speed, "Specify SPI speed for normal read. This sets NormSpeedNew[3:0] from BKDG. -1 means use
[PATCH 0/5] NestMMU pte upgrade workaround for mprotect and autonuma
We can upgrade pte access (R -> RW transition) via mprotect or autonuma. We need
to make sure we follow the recommended pte update sequence as outlined in
commit: bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang")
for such updates. This patch series does that.

Aneesh Kumar K.V (5):
  mm: Update ptep_modify_prot_start/commit to take vm_area_struct as arg
  mm: update ptep_modify_prot_commit to take old pte value as arg
  arch/powerpc/mm: Nest MMU workaround for mprotect/autonuma RW upgrade.
  mm/hugetlb: Add prot_modify_start/commit sequence for hugetlb update
  arch/powerpc/mm/hugetlb: NestMMU workaround for hugetlb mprotect RW upgrade

 arch/powerpc/include/asm/book3s/64/hugetlb.h |  8 +
 arch/powerpc/include/asm/book3s/64/pgtable.h | 18 ++
 arch/powerpc/include/asm/hugetlb.h           |  2 +-
 arch/powerpc/mm/hugetlbpage.c                | 35
 arch/powerpc/mm/pgtable-book3s64.c           | 34 +++
 arch/s390/include/asm/pgtable.h              |  5 +--
 arch/s390/mm/pgtable.c                       |  8 +++--
 arch/x86/include/asm/paravirt.h              |  9 +++--
 fs/proc/task_mmu.c                           |  8 +++--
 include/asm-generic/pgtable.h                | 10 +++---
 include/linux/hugetlb.h                      | 18 ++
 mm/hugetlb.c                                 |  8 +++--
 mm/memory.c                                  |  8 ++---
 mm/mprotect.c                                |  6 ++--
 14 files changed, 150 insertions(+), 27 deletions(-)

-- 
2.17.1
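For readers unfamiliar with the start/commit pattern this series builds on, here is a minimal user-space sketch of an R -> RW upgrade done as a transaction that receives the VMA rather than the mm. It is only an illustration: every `toy_*` name, the bit values, and the "clear on start, write on commit" behaviour are assumptions made for this example, not the kernel's actual helpers or the powerpc nest MMU sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
typedef uint64_t toy_pte_t;
#define TOY_PTE_PRESENT 0x1u
#define TOY_PTE_WRITE   0x2u

struct toy_mm { int id; };
struct toy_vma {
	struct toy_mm *vm_mm;
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Start of the transaction: hand back the old entry and clear the slot. */
static toy_pte_t toy_modify_prot_start(struct toy_vma *vma, unsigned long addr,
				       toy_pte_t *ptep)
{
	toy_pte_t old = *ptep;

	/* Having the VMA lets a helper reason about the mapped range. */
	assert(addr >= vma->vm_start && addr < vma->vm_end);
	*ptep = 0;
	return old;
}

/* End of the transaction: install the modified entry. */
static void toy_modify_prot_commit(struct toy_vma *vma, unsigned long addr,
				   toy_pte_t *ptep, toy_pte_t pte)
{
	assert(addr >= vma->vm_start && addr < vma->vm_end);
	*ptep = pte;
}

/* R -> RW upgrade expressed as start / modify / commit; returns the old entry. */
static toy_pte_t toy_upgrade_to_rw(struct toy_vma *vma, unsigned long addr,
				   toy_pte_t *ptep)
{
	toy_pte_t old = toy_modify_prot_start(vma, addr, ptep);
	toy_pte_t upgraded = old | TOY_PTE_WRITE;

	toy_modify_prot_commit(vma, addr, ptep, upgraded);
	return old;
}
```

In this toy model the caller passes the VMA straight through, the same way the `clear_soft_dirty()` hunk in patch 1/5 switches from `vma->vm_mm` to `vma`.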
Re: [PATCH security-next v5 00/30] LSM: Explict ordering
On Wed, 10 Oct 2018, Kees Cook wrote:

> v5:
> - redesigned to use CONFIG_LSM= and lsm= for both ordering and enabling
> - dropped various Reviewed-bys due to rather large refactoring

Patches 1-10 applied to
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-general
and next-testing.

-- 
James Morris
Re: [f2fs-dev] [PATCH] f2fs: fix data corruption issue with hardware encryption
On Wed, Oct 10, 2018 at 08:05:44PM -0700, Jaegeuk Kim wrote:
> On 10/10, Jaegeuk Kim wrote:
> > On 10/11, Sahitya Tummala wrote:
> > > On Wed, Oct 10, 2018 at 02:34:02PM -0700, Jaegeuk Kim wrote:
> > > > On 10/10, Sahitya Tummala wrote:
> > > > > Direct IO can be used in case of hardware encryption. The following
> > > > > scenario results into data corruption issue in this path -
> > > > >
> > > > > Thread A -                          Thread B -
> > > > > -> write file#1 in direct IO
> > > > >                                     -> GC gets kicked in
> > > > >                                     -> GC submitted bio on meta mapping
> > > > >                                        for file#1, but pending completion
> > > > > -> write file#1 again with new data
> > > > >    in direct IO
> > > > >                                     -> GC bio gets completed now
> > > > >                                     -> GC writes old data to the new
> > > > >                                        location and thus file#1 is corrupted.
> > > > >
> > > > > Fix this by submitting and waiting for pending io on meta mapping
> > > > > for direct IO case in f2fs_map_blocks().
> > > > >
> > > > > Signed-off-by: Sahitya Tummala
> > > > > ---
> > > > >  fs/f2fs/data.c | 12
> > > > >  1 file changed, 12 insertions(+)
> > > > >
> > > > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > > > > index 9ef6f1f..7b2fef0 100644
> > > > > --- a/fs/f2fs/data.c
> > > > > +++ b/fs/f2fs/data.c
> > > > > @@ -1028,6 +1028,12 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
> > > > >  		map->m_pblk = ei.blk + pgofs - ei.fofs;
> > > > >  		map->m_len = min((pgoff_t)maxblocks, ei.fofs + ei.len - pgofs);
> > > > >  		map->m_flags = F2FS_MAP_MAPPED;
> > > > > +		/* for HW encryption, but to avoid potential issue in future */
> > > > > +		if (flag == F2FS_GET_BLOCK_DIO) {
> > > > > +			blkaddr = map->m_pblk;
> > > > > +			for (; blkaddr < map->m_pblk + map->m_len; blkaddr++)
> > > > > +				f2fs_wait_on_block_writeback(sbi, blkaddr);
> > > >
> > > > Do we need this? IIRC, DIO would give create=1.
> > >
> > > Yes, we need it. When we are overwriting an existing file, DIO calls
> > > f2fs_map_blocks() with create=0. From the DIO code, I see that this happens
> > > because blockdev_direct_IO() passes this dio flag DIO_SKIP_HOLES. And then
> > > in get_more_blocks(), below code updates create=0, when we are overwriting
> > > an existing file.
> > >
> > >         create = dio->op == REQ_OP_WRITE;
> > >         if (dio->flags & DIO_SKIP_HOLES) {
> > >                 if (fs_startblk <= ((i_size_read(dio->inode) - 1) >> i_blkbits))
> > >                         create = 0;
> > >         }
> > >
> > >         ret = (*sdio->get_block)(dio->inode, fs_startblk, map_bh, create);
> > >
> >
> > Got it.
> > How about this?
> >
>
> Sorry, this is v2.

Looks good to me. Thanks for updating it :)

>
> From b78dd7b2e0317be18716b9496269e9792829f63e Mon Sep 17 00:00:00 2001
> From: Sahitya Tummala
> Date: Wed, 10 Oct 2018 10:56:22 +0530
> Subject: [PATCH] f2fs: fix data corruption issue with hardware encryption
>
> Direct IO can be used in case of hardware encryption. The following
> scenario results into data corruption issue in this path -
>
> Thread A -                          Thread B -
> -> write file#1 in direct IO
>                                     -> GC gets kicked in
>                                     -> GC submitted bio on meta mapping
>                                        for file#1, but pending completion
> -> write file#1 again with new data
>    in direct IO
>                                     -> GC bio gets completed now
>                                     -> GC writes old data to the new
>                                        location and thus file#1 is corrupted.
>
> Fix this by submitting and waiting for pending io on meta mapping
> for direct IO case in f2fs_map_blocks().
>
> Signed-off-by: Sahitya Tummala
> Signed-off-by: Jaegeuk Kim
> ---
>  fs/f2fs/data.c    | 11 +++
>  fs/f2fs/f2fs.h    |  2 ++
>  fs/f2fs/segment.c |  9 +
>  3 files changed, 22 insertions(+)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index be19257d9e36..8952f2d610a6 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1030,6 +1030,11 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>  			map->m_flags = F2FS_MAP_MAPPED;
>  			if (map->m_next_extent)
>  				*map->m_next_extent
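The `create` computation quoted from get_more_blocks() in the reply above can be modeled in user space to see why an overwrite reaches f2fs_map_blocks() with create=0. This is a rough sketch only: `toy_dio_create` and `TOY_DIO_SKIP_HOLES` are hypothetical names invented for the example, and the inode size and block shift are passed in as plain integers instead of being read from a `struct dio`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's DIO_SKIP_HOLES flag bit. */
#define TOY_DIO_SKIP_HOLES 0x02u

/*
 * Model of the quoted logic: a write normally asks the filesystem to
 * create blocks, but with the skip-holes flag set, any start block that
 * lies at or before the last block of the file is treated as an
 * overwrite and create is forced to 0.
 */
static int toy_dio_create(bool is_write, unsigned int flags,
			  uint64_t fs_startblk, int64_t i_size,
			  unsigned int i_blkbits)
{
	int create = is_write ? 1 : 0;

	/* The sketch only handles non-empty files. */
	assert(i_size > 0);

	if (flags & TOY_DIO_SKIP_HOLES) {
		if (fs_startblk <= (uint64_t)((i_size - 1) >> i_blkbits))
			create = 0;
	}
	return create;
}
```

With 4 KiB blocks (i_blkbits = 12) and an 8 KiB file, a write starting at block 0 or 1 yields create = 0, while a write starting at block 2, past the end of the file, yields create = 1.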
Re: [PATCH] scsi: arcmsr: clean up clang warning on extraneous parentheses
Colin,

> There are extraneous parentheses that are causing clang to produce a
> warning so remove these.
>
> Clean up 3 clang warnings:
> equality comparison with extraneous parentheses [-Wparentheses-equality]

Applied to 4.20/scsi-queue, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering